Senior iOS Software Engineer

Cologne, Germany


Sorting in Swift: From Introsort to Timsort

TL;DR

  • Swift’s standard library used Introsort until Swift 5.
  • Swift 5 replaced it with Timsort, an adaptive, stable hybrid built on natural runs in the input.
  • Swift’s Timsort is a deliberately simplified variant of the original.
  • Looking into Swift’s open source code is often more accessible and rewarding than it first sounds.

I recently had a technical interview for an iOS position. I had assumed it would be heavily iOS-focused, but the questions turned out to be much more general. One of them was how I would sort a list of 100,000 people.

My answer was straightforward. Read the file, let the system libraries sort it, write the file back. Then came the follow-up.

How would you sort a trillion names?

I did a quick back-of-the-envelope estimate of the order of magnitude. At one byte per record we would be in the range of a terabyte. Assuming a name takes around 32 bytes on average, ignoring record overhead, encoding details, and sort keys, we would be looking at around 32 terabytes. That does not fit on most disks, and it certainly does not fit in memory. At that point I would assume we are dealing with a big data system. The first names that came to mind were Hadoop[1], Cassandra[2], and Google Bigtable[3].

Then I was asked how something like that would actually work. I admitted that as a client developer I don’t usually touch backend systems of that scale. I have no experience with Google Bigtable or Cassandra. But intuitively, based on what I know about Hadoop from reading about it out of curiosity, you split the data into chunks, distribute them across workers, and then merge the results. Hadoop is famously known for MapReduce. The next question was which algorithm would be used. Merge sort came to mind first, because it parallelizes naturally across distributed workers.

That was essentially the end of the line of questioning. But it left me thinking. Years ago, when Swift was open-sourced, I had poked around the source code, including the sort implementation. I mentioned during the interview that I remembered a threshold, somewhere around 20 elements, at which the algorithm switched to merge sort. Later I went back to check, and my memory turned out to be wrong.

When I read the Swift open source code back when it was first published, Swift was using Introsort. Today it uses Timsort.

In this post, I want to walk through what that change actually means. What Introsort and Timsort are, why the Swift team made the switch, and what the implementation looks like today.

A quick refresher on the building blocks

Both Introsort and Timsort are hybrids that lean on simpler classical algorithms. If those are familiar, skip the box below; otherwise it’s a quick refresher of what gets glued together later.

Insertion Sort

Walks through the array from left to right and slides each new element into its place inside the already-sorted prefix. Unbeatable on small or nearly-sorted arrays, but scales badly past a few dozen elements.
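As a Swift sketch (a textbook version, not the one in the standard library):

```swift
// Textbook insertion sort: grow a sorted prefix, sliding each new
// element left until it finds its slot.
func insertionSort<T: Comparable>(_ a: inout [T]) {
    guard a.count > 1 else { return }
    for i in 1..<a.count {
        let value = a[i]
        var j = i
        // Shift larger prefix elements one slot to the right.
        while j > 0 && value < a[j - 1] {
            a[j] = a[j - 1]
            j -= 1
        }
        a[j] = value
    }
}
```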

Merge Sort

Splits the array in half, sorts each half recursively, and then merges the two sorted halves into one. Stable and predictable, but needs extra memory for the merge.
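A minimal Swift sketch, with the extra buffer explicit; taking from the left half on ties is exactly what makes it stable:

```swift
// Textbook merge sort: split, recurse, merge into an O(n) buffer.
func mergeSort<T: Comparable>(_ a: [T]) -> [T] {
    guard a.count > 1 else { return a }
    let mid = a.count / 2
    let left = mergeSort(Array(a[..<mid]))
    let right = mergeSort(Array(a[mid...]))

    var merged: [T] = []
    merged.reserveCapacity(a.count)
    var i = 0, j = 0
    while i < left.count && j < right.count {
        // `<=` takes from the left half on ties, preserving
        // input order of equal elements (stability).
        if left[i] <= right[j] {
            merged.append(left[i]); i += 1
        } else {
            merged.append(right[j]); j += 1
        }
    }
    merged.append(contentsOf: left[i...])
    merged.append(contentsOf: right[j...])
    return merged
}
```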

Quick Sort

Picks a pivot, partitions the array into “smaller than pivot” and “larger than pivot”, then recurses on each side. In practice often the fastest comparison sort, but a bad pivot can drag it down, and it is not stable.

Heap Sort

Builds a max-heap from the array, then repeatedly swaps the root with the last element and shrinks the heap. Predictable and uses no extra memory, but its scattered access pattern hurts cache locality, and it is not stable.

| Algorithm | Best | Average | Worst | Memory | Stable |
| --- | --- | --- | --- | --- | --- |
| Insertion Sort | O(n) | O(n^2) | O(n^2) | O(1) | yes |
| Merge Sort | O(n \log n) | O(n \log n) | O(n \log n) | O(n) | yes |
| Quick Sort | O(n \log n) | O(n \log n) | O(n^2) | O(\log n) | no |
| Heap Sort | O(n \log n) | O(n \log n) | O(n \log n) | O(1) | no |

Introsort: what Swift used to do

When Swift was first open-sourced, the standard library used Introsort, a hybrid of three of the four building blocks.

\text{Introsort} = \text{Quick Sort} + \text{Heap Sort} + \text{Insertion Sort}

The trick is that every time Introsort enters a sub-range, it decides which of the three is best suited for that sub-range.

The decision logic in _introSortImpl boils down to this[4]:

sort(range):
  if count(range) < 20:           insertionSort(range)
  else if depthLimit == 0:        heapSort(range)
  else:                           pivot = partition(range)
                                  sort(left of pivot,  depthLimit - 1)
                                  sort(right of pivot, depthLimit - 1)

The depth limit is set once at the start, scaled to the input size.

The reasoning behind the mix is straightforward. Quick Sort is the default because it is fast in practice and works in-place, but it can degenerate to O(n^2) on bad pivots. The depth limit is a tripwire: if the recursion gets too deep, that’s a strong signal that pivots are going wrong, so Heap Sort takes over for the current sub-range with its guaranteed O(n \log n). Below twenty elements, Insertion Sort wins on overhead and cache locality alone.
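The exact expression in the Swift 4.1 source is best read off the tag, but the common Introsort choice of twice the floor of log2(n) captures the idea. A sketch (function name mine):

```swift
// Sketch: a recursion depth limit proportional to log2(count),
// the usual Introsort tripwire. Int.bitWidth - leadingZeroBitCount - 1
// computes floor(log2(count)) for positive counts.
func recursionDepthLimit(for count: Int) -> Int {
    guard count > 1 else { return 0 }
    return 2 * (Int.bitWidth - count.leadingZeroBitCount - 1)
}
```

With this limit, a 1,024-element array may recurse 20 levels deep before Heap Sort takes over for the offending sub-range.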

The decision is made fresh for every sub-range, not just once at the start. So a single sort run can use all three algorithms side by side: some sub-ranges end up in Heap Sort, others stay with Quick Sort, and small leftovers go to Insertion Sort.

The combination gives Introsort a guaranteed O(n \log n) worst case while staying as fast as Quick Sort on the common case. What it does not give you is stability. Neither Quick Sort nor Heap Sort preserves the relative order of equal elements. And it is not adaptive either: a fully sorted input still goes through the full O(n \log n) machinery.

That last part matters, because real-world data is rarely random. And it is exactly where Timsort comes in.

The Swift 5 stability promise

When Swift 5 shipped, the standard library quietly swapped Introsort out for Timsort[5]. The change had been debated on the Swift Forums for a while[6], and two motivations drove it.

The first was stability as an API guarantee. To be clear, this is sort stability, not ABI stability. The two are unrelated concepts that happened to land in the same release: ABI stability means a binary compiled against one Swift version keeps working with newer standard library versions, while sort stability is the property that equal elements keep their input order after sorting.

Before Swift 5, sorted() happened to be stable in many cases simply because Introsort’s Insertion Sort branch handled small partitions stably, but the standard library made no promise. Equal elements could be reordered by implementation details, and code was not allowed to depend on their relative order staying the same across Swift versions or algorithm changes. With Swift 5 the doc comment for sort and sorted was updated to spell it out:

The sorting algorithm is guaranteed to be stable. A stable sort preserves the relative order of elements for which areInIncreasingOrder does not establish an order.[7]

What does “stable” mean concretely? Take a list of people sorted by age:

let people = [("Anna", 30), ("Bob", 25), ("Clara", 30)]
let sorted = people.sorted { $0.1 < $1.1 }

A stable sort guarantees [Bob, Anna, Clara]. Anna stays before Clara because she came first in the input. An unstable sort might give you [Bob, Clara, Anna], which is also correctly sorted by age, but the relative order of the two 30-year-olds is now arbitrary. That matters whenever you sort by multiple criteria in succession, or when the input order itself carries meaning.
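The successive-criteria case is worth seeing once. With a stable sort, sorting by the secondary key first and then by the primary key yields a correct two-level ordering, because the second pass preserves the first pass’s order among equals:

```swift
var people = [("Clara", 30), ("Anna", 25), ("Bob", 30), ("Dave", 25)]
// Pass 1: secondary key (name).
people.sort { $0.0 < $1.0 }
// Pass 2: primary key (age). Stability keeps the alphabetical
// order from pass 1 among people of equal age.
people.sort { $0.1 < $1.1 }
// Result order: Anna (25), Dave (25), Bob (30), Clara (30)
```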

The second motivation was better performance on real-world data. Real inputs are almost never random: log entries arrive ordered by time, lists from database queries come pre-sorted by some index, user-edited data tends to be mostly-sorted with a few changes on top. Introsort treats all of that the same as random data and pays the full O(n \log n) price. Timsort, by contrast, is adaptive: it detects existing order in the input and uses it, dropping all the way to O(n) when the input is already sorted.

Stability and adaptivity are exactly the two things Introsort lacks. So the switch was less about replacing one algorithm with a faster one, and more about replacing it with one whose guarantees match how Swift code actually uses sorting.

Timsort

Timsort was invented by Tim Peters for Python[8][9]. Like Introsort, it is a hybrid, but with a different mix and a different strategy. Where Introsort is built around dividing the array, Timsort is built around finding the order that is already there.

\text{Timsort} = \text{Merge Sort} + \text{Binary Insertion Sort} + \text{Binary Search} + \text{Galloping Search}

Three of those four ingredients are variations on Insertion Sort and search. Binary Insertion Sort is the classic Insertion Sort with one tweak: instead of walking backwards element by element to find the insertion position, it uses binary search. Binary Search and Galloping Search play similar roles inside the merge step, finding the right position quickly when the data is well-behaved. They are all optimisations on top of the base idea, and we’ll come back to them when we look at what Swift kept and what it dropped.

The core idea of Timsort itself is simple: real data is rarely random, so instead of dividing the array into halves like classical Merge Sort does, look for natural runs that are already monotonic, and merge those. Whatever order is already in the input becomes part of the algorithm’s strategy instead of something to overcome. Insertion Sort handles the very small cases and pads short runs to a useful length.

A run is a contiguous subsequence that is either non-decreasing (a_0 \leq a_1 \leq a_2 \leq \dots) or strictly decreasing (a_0 > a_1 > a_2 > \dots). A descending run gets reversed in place so everything downstream can assume ascending runs.

The main loop looks roughly like this:

stableSort(elements):
  if count(elements) <= minimumRunLength:
    insertionSort(elements)
    return

  while not at end:
    run = findNextRun()
    if run is descending: reverse(run)
    if length(run) < minimumRunLength:
      extend run with insertionSort up to minimumRunLength
    push run onto stack
    while stackInvariantsViolated:
      mergeAdjacentRuns()

  while stack has more than one run:
    mergeAdjacentRuns()

For very small inputs Timsort skips the whole machinery and just calls Insertion Sort. For everything else, three details are worth pausing on: how runs are defined, how their minimum length is chosen, and how the merge order is controlled.

Strict descending runs

Timsort treats ascending and descending runs differently. Ascending runs are used as-is. Descending runs are reversed in place, so that everything from then on is ascending and the merge step only has to deal with one direction. That reversal step is where equal elements become a problem.

Take an input like [3, 3, 1]. To see what happens to the two threes, label them 3ₐ and 3ᵦ based on their input order: [3ₐ, 3ᵦ, 1]. If Timsort treated this as a descending run and reversed it, the result would be [1, 3ᵦ, 3ₐ]. The two threes have now swapped places relative to each other. That is exactly what a stable sort is not allowed to do. The whole point is that equal elements keep their input order.

So Timsort needs a rule that prevents this. The rule is asymmetric:

  • Ascending runs are allowed to contain equal neighbors. [3ₐ, 3ᵦ, 5] counts as one ascending run, and since ascending runs are never reversed, 3ₐ stays before 3ᵦ.
  • Descending runs must be strictly going down. [3ₐ, 3ᵦ, 1] does not count as one descending run, because 3ₐ is not greater than 3ᵦ. So Timsort takes the first two threes as a tiny ascending run and starts a new run at 1. Neither piece gets reversed, and the threes keep their input order.

With that rule in place, any time Timsort sees equal neighbors, they end up inside an ascending run by definition. And since ascending runs are never reversed, the equal elements never have a chance to swap positions. Stability is preserved automatically, before any merging starts.
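The asymmetric rule fits in a few lines. A sketch of run detection under that definition (function name mine, not the standard library’s):

```swift
// Find the maximal run starting at `start`. Ascending runs may
// contain equal neighbors; descending runs must be strictly
// decreasing, so reversing them can never swap equal elements.
func findRun(in a: [Int], from start: Int) -> (end: Int, descending: Bool) {
    var i = start + 1
    guard i < a.count else { return (a.count, false) }
    if a[i] < a[i - 1] {
        // Strictly decreasing: stop at the first non-descent.
        while i < a.count && a[i] < a[i - 1] { i += 1 }
        return (i, true)
    }
    // Non-decreasing: equal neighbors are allowed.
    while i < a.count && a[i] >= a[i - 1] { i += 1 }
    return (i, false)
}
```

On [3, 3, 1] this returns the two threes as a short ascending run, so they are never reversed and keep their input order.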

minimumRunLength

Natural runs are nice, but if every run is only two or three elements long, Timsort ends up doing a huge number of tiny merges. So when a found run is too short, Insertion Sort pads it up to a minimum length first. The question is: what minimum?

The choice is a balance. Too short and Timsort ends up doing many tiny merges. Too long and Insertion Sort, which scales badly, becomes the bottleneck. In practice the sweet spot lies between 32 and 64. Timsort picks a value in that range with a small bit trick on the input size[8]. For inputs smaller than 64 there is nothing to balance, and Insertion Sort handles everything directly.
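The trick itself is short. This is the formulation from listsort.txt; Swift’s source expresses the same computation slightly differently:

```swift
// minrun from listsort.txt: shift n right until it fits in 6 bits,
// adding 1 if any shifted-out bit was set. For n >= 64 the result
// always lands in 32...64; smaller inputs come back unchanged.
func minimumRunLength(_ n: Int) -> Int {
    var n = n
    var carry = 0
    while n >= 64 {
        carry |= n & 1
        n >>= 1
    }
    return n + carry
}
```

A power of two like 1,024 comes out as exactly 32, so the final merges combine runs of equal length.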

The 2015 bug

Runs sit on a stack until Timsort decides which pair to merge next. The rules that govern that decision had a hole: in 2015, Stijn de Gouw and colleagues proved with the KeY tool[10] that the original check was too narrow and could overflow the preallocated stack on pathological inputs[11]. The fix is to look one level deeper into the stack. Swift and current OpenJDK-style implementations ship that fix.

Where else Timsort is used

Timsort is the default or built-in sort in:

  • Python (CPython, since 2.3 in 2003)[12][9]
  • Java SE 7 and later, for object arrays (since 2011)[13]
  • Android, inherited from the Java standard library[14]
  • V8 (Chrome, Node.js, Edge), since V8 v7.0 / Chrome 70 (September 2018)[15]
  • GNU Octave, with a source comment crediting Python’s listobject.c[16]
  • Swift, since Swift 5 (March 2019)

There is more to Timsort than fits in a single article. If you got curious, the sources in the footnotes are good starting points.

Inside Swift’s Timsort

Swift’s Timsort is recognisably the same algorithm. Same run definition, same minimumRunLength bit trick, same four-run stack-invariant check. What it leaves out are the three search-based optimisations from the original formula: Galloping Search, Pre-Merge Binary Search, and Binary Insertion Sort. The implementation is documented in _stableSortImpl[17], and its doc comment is explicit about the simplification:

The adaptive algorithm used is Timsort, modified to perform a straight merge of the elements using a temporary buffer.[17]

In other words: Timsort without its search-based merge tricks, with the merge step running over a temporary buffer instead of in place.

\text{Timsort (Swift)} = \text{Merge Sort} + \text{Insertion Sort}

No search-based merge optimisations

The original Timsort uses two tricks to make merges cheaper. Before merging two adjacent runs, a binary search finds where the second run’s first element would land in the first run, so the prefix that is already final can be skipped. And once one run keeps “winning” the comparison for seven rounds in a row, Timsort switches into a galloping mode that uses exponential plus binary search to move whole chunks of the winning run at once. Swift’s _merge does neither and always merges the full runs one pair at a time.
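The skipped-prefix trick is easy to picture: find where the second run’s first element would land in the first run, and everything before that index is already final. A sketch of that search (my illustration, not Swift’s code):

```swift
// Lower bound: first index in the sorted run whose element is >= key.
// In full Timsort, the left run's prefix before
// lowerBound(leftRun, rightRun[0]) needs no merging at all.
func lowerBound(_ run: [Int], _ key: Int) -> Int {
    var lo = 0
    var hi = run.count
    while lo < hi {
        let mid = (lo + hi) / 2
        if run[mid] < key { lo = mid + 1 } else { hi = mid }
    }
    return lo
}
```

Galloping mode builds on the same idea, but probes exponentially outward from the current position instead of bisecting the whole run.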

Linear Insertion Sort instead of binary

For padding short runs up to minimumRunLength, the original uses Binary Insertion Sort: binary search for the insertion position, then shift. Swift’s _insertionSort is the classic linear version with backwards swaps. At the run lengths involved (32 to 64 elements) the difference is small in practice.

Why this is worth knowing

As application developers, we usually will not write sort algorithms ourselves. The standard library does that, and we just call sorted(). But understanding what runs underneath is still useful, and looking at the Swift source code is a practical way to do it.

What the implementation makes clear is that the theoretical optimum is not the only thing that matters. The shape of real-world data matters too.

The broader lesson is the same as for documentation: it pays off to actually read and understand it.

Footnotes

  1. Hadoop is an open-source framework that stores and processes very large datasets across many machines. See hadoop.apache.org.

  2. Cassandra is an open-source database that spreads data across many machines and is built to handle a high volume of reads and writes. See cassandra.apache.org.

  3. Google Bigtable is a database developed by Google for storing very large tables across many machines, and it inspired the design of databases like HBase and Cassandra. See cloud.google.com/bigtable.

  4. Swift 4.1’s Introsort implementation lives in Sort.swift.gyb on the swift-4.1-RELEASE tag. The .gyb suffix is a Python-based templating system Apple uses to generate boilerplate; here it produced two function variants (one with a predicate closure, one hardcoded to <) from a single source.

  5. Swift PR #19717, merged in November 2018 and shipped with Swift 5 (March 2019), replaced the Introsort implementation with Timsort.

  6. Swift Forums thread “Add stable sort to stdlib” (thread 9309) proposed the change before the PR landed.

  7. Doc comment of the sort(by:) overload in Sort.swift, swift-6.3.1-RELEASE.

  8. Tim Peters, “listsort.txt”, the original description of Timsort by its author. Covers the run definition, the minrun bit trick, the merge strategy, and the stability rationale.

  9. Python Software Foundation, “Python 2.3.0”, released July 29, 2003.

  10. KeY is a deductive verification tool for Java, used to formally prove correctness properties of source code.

  11. Stijn de Gouw, Jurriaan Rot, Frank S. de Boer, Richard Bubel, Reiner Hähnle, “OpenJDK’s java.utils.Collection.sort() is broken: The good, the bad and the worst case,” CAV 2015. The paper proves the bug formally and outlines two fix strategies.

  12. CPython’s listobject.c, the original home of Timsort.

  13. OpenJDK bug JDK-6804124, “Replace ‘modified mergesort’ in java.util.Arrays.sort with timsort”.

  14. Android’s TimSort.java, inherited from the Java standard library.

  15. V8 blog post “Getting things sorted in V8”, September 2018.

  16. GNU Octave’s oct-sort.cc, with a source comment crediting Python’s listobject.c.

  17. Current Swift Timsort implementation: Sort.swift on the swift-6.3.1-RELEASE tag.