Collection Data Structures In Swift

Collection Data Structures In Swift
What kind of performance do you want from your data structure?

What kind of performance do you want from your data structure?

Imagine you have an application that needs to work with a lot of data. Where do you put that data? How do you keep it organized and handle it efficiently?

If your program only manages one number, you store it in one variable. If it has two numbers then you’d use two variables.

What if it has 1000 numbers, 10,000 strings or the ultimate library of memes? (Wouldn’t it be nice to be able to find a perfect meme in an instant?) In that case, you’ll need one of the fundamental collection data structures. Fortunately for you, that’s the topic of this tutorial.

Here’s how the tutorial will flow:

  • You’ll review what a data structure is and then you’ll learn about Big-O notation. It’s the standard tool for describing the performance of different data structures.
  • Next up: practicing by measuring the performance of arrays, dictionaries and sets — the most basic data structures available in Cocoa development. This will also double as a rudimentary introduction to performance testing.
  • As you proceed, you’ll compare the performance of mature Cocoa structures with newer, Swift-only counterparts.
  • Finally, you’ll briefly review some related types offered by Cocoa. These are data structures that you might be surprised to learn are already at your fingertips.

Getting Started

Before you dive in and explore the data structures used in iOS, you should review a couple of basic concepts about what they are and how to measure their performance.

There are many types of collection data structures, and each represents a specific way to organize and access a collection of items. One collection type might make some activities especially efficient, such as adding a new item, finding the smallest item or ensuring you’re not adding duplicates.

Without collection data structures, you’d be stuck trying to manage items one by one. A collection allows you to:

  1. Handle all those items as one entity
  2. Imposes some structure on them
  3. Efficiently insert, removing and retrieve items

What is “Big-O” Notation?

Big-O — that’s the letter O, not the number zero — notation is a way of describing the efficiency of an operation on a data structure. There are various kinds of efficiency: you could measure how much memory the data structure consumes, how much time it takes under the worst case scenario, how much time it takes on average and so on and so forth.

In this tutorial, you’ll measure how much time an operation takes, on average.

In general, larger quantities of data don’t make an operation faster. Usually it’s the opposite, but sometimes there is little or no slowdown. Big-O notation is a precise way of describing this. You write an exact functional form that roughly describes how the running time changes based on the number of elements in the structure.

When you see Big-O notation written as O(something-with-n), the n is the number of items in the data structure, and the something-with-n is roughly how long the operation will take.

“Roughly”, ironically enough, has a specific meaning: the behavior of the function at the asymptotic limit of very large n. Imagine n is a really really large number – you’re thinking about how the performance of some operation will change as you go from n to n+1.

Note: This is all a part of algorithmic complexity analysis, and if you wish to explore it deeper then whip out any computer science textbook. There you’ll find mathematical methods for analyzing complexity from scratch, finer distinctions between different kinds of efficiency, more explicit formulations of assumptions about the underlying machine model, and other fun stuff that you may or may not wish you knew.

The most commonly seen Big-O performance measures are as follows, in order from best to worst performance:

  • O(1) (constant time): No matter how many items are in a data structure, this function calls the same number of operations. This is considered ideal performance.
  • O(log n) (logarithmic): The number of operations this function calls grows at the rate of the logarithm of the number of items in the data structure. This is good performance, since it grows considerably slower than the number of items in the data structure.
  • O(n) (linear): The number of operations this function calls will grow linearly with the size of the structure. This is considered decent performance, but it can grind along with larger data collections.
  • O(n (log n)) (“linearitmic”): The number of operations called by this function grow by the logarithm of the number of items in the structure multiplied by the number of items in the structure. Predictably, this is about the lowest level of real-world tolerance for performance. While larger data structures perform more operations, the increase is somewhat reasonable for data structures with small numbers of items.
  • O(n²) (quadratic): The number of operations called by this function grow at a rate that equals the size of the data structure, squared — poor performance at best. It grows quickly enough to become unusably slow even if you’re working with small data structures.
  • O(2^n) (exponential): The number of operations called by this function grow by two to the power of the size of the data structure. This gives very poor performance and becomes intolerably slow almost immediately.
  • O(n!) (factorial): The number of operations called by this function grow by the factorial of the size of the data structure. Essentially, you have the worst case scenario for performance. For example, in a structure with just 100 items, the multiplier of the number of operations is 158 digits long.

Here’s a more visual representation of performance and how it degrades when there are more items in a collection, going from one to 25 items:

Big-O Notation

Did you notice that you can’t even see the green O(log n) line because it is so close to the ideal O(1) at this scale? That’s pretty good! On the other hand, operations that have Big-O notations of O(n!) and O(2^n) degrade so quickly that by the time you have more than 10 items in a collection, the number of operations spikes completely off the chart.

Yikes! As the chart clearly denotes, the more data you handle, the more important it is to choose the right data structure for the job.

Now that you’ve seen how to compare the performance of operations on data structures, it’s time to review the three most common types used in iOS and explore how they perform in theory and in practice.

Common iOS Data Structures

The three most common data structures in iOS are arrays, dictionaries and sets. First consider how they differ in ideal terms, as fundamental abstractions, and then you’ll examine the performance of the actual concrete classes which iOS offers for representing those abstractions.

For arrays and dictionaries, iOS offers multiple concrete classes that work for the same abstraction. In addition to the old Foundation data structures available in Swift and Objective-C, there are new Swift-only versions of data structures that integrate tightly with the language.

Foundation data structures have been around longer and are currently faster to create, even when you’re working with vast quantities of data. Swift data structures, however, can be faster when you need to search through data. Which one to choose depends on the operation you want to perform.



An array is a group of items placed in a specific order, and you can access each item via an index — a number that indicates its position in the order. When you write the index in brackets after the name of the array variable, this is subscripting.

Swift arrays are immutable if you define them as constants with let, and mutable if you define them as variables with var.

In contrast, a Foundation NSArray is immutable by default. If you want to add, remove or modify items after creating the array, you must use the mutable variant class NSMutableArray.

An NSArray is heterogeneous, meaning that it can contain Cocoa objects of different types. Swift arrays are homogeneous, meaning that each Swift Array is guaranteed to contain only one type of object.

However, you can still define a single Swift Array so it stores various types of Cocoa objects by specifying that the one type is AnyObject, since every Cocoa type is also a subtype of this.

Expected Performance and When to Use Arrays

Order matters, and that is the primary reason to use an array to store variables. Think about those times when you sort contacts by First or Last name, a To-Do list by date, or any other situation when it’s critical to find or display data in a specific way.

Apple’s documentation includes three key expectations for Array performance in the CFArray header:

  1. Accessing any value at a particular index in an array is at worst O(log n), but should usually be O(1).
  2. Searching for an object at an unknown index is at worst O(n (log n)), but will generally be O(n).
  3. Inserting or deleting an object is at worst O(n (log n)) but will often be O(1).

These guarantees subtly deviate from the simple “ideal” array that you might expect from a computer science textbook or the C language, where an array is always a sequence of items laid out contiguously in memory.

Consider it a useful reminder to check the documentation!

In practice, these expectations make sense when you think about them:

  1. If you already know where an item is and need to look it up, then it should be very fast.
  2. If you don’t know where a particular item is, you’ll need to look through the array from beginning to end to find it, and your search will be slower.
  3. If you know where you’re adding or removing an object, it’s not too difficult, although you may need to adjust the rest of the array afterwards, and that is more time-consuming.

How well do these expectations align with reality? Keep reading to find out!

Sample App Testing Results

Download the the sample application for this tutorial and open it in Xcode. There are some testing methods that will create and/or test an array, and then show you the how long it took to perform each task.

One thing to note: In the app, the Debug configuration automatically sets optimization to a level equal to the release configuration — this is so when you test the application you get the same level of optimization you’d see in the real world.

With optimization off, the results for Swift are a lot worse. Feel free to test this on your own, but make sure you’ve got a good book to read while the tests run.

You need a minimum of 1000 items to run tests with the sample app. When you build and run, the slider will be set to 1000. Press the Create Array and Test button, and you’ll be testing in no time:

Swift 1000 Items

Drag the slider over ever so slightly to the first number above 1000 you can reasonably hit (which will probably be somewhere around 100,000), and press Create Array and Test again. This will take…quite a bit longer:

Swift Array 100,000

In this case, which ran on an iPhone 5 running iOS 8 GM, for around 106.5 times the number of items, it took roughly 1,417 times as much time to create the array.

To grow at O(n) in this example, you’d expect the array with 106.5 times the number of items to take around 3.7 seconds. To grow at O(n log n), you’d expect it to take around 33.5 seconds. To grow at O(n²), you’d expect it to take about 395 seconds.

So in this example, the Swift Array creation performance drops somewhere between O(n log n) and O(n²). As you can see, this is not usable once you get beyond a few dozen items.

What about NSMutableArray? You can still call Foundation classes from Swift without having to drop back down to Objective-C, and you’ll notice there is also a Swift class called NSArrayManipulator that conforms to the same protocol, ArrayManipulator.

Because of this, you can easily swap in NSMutableArray for a Swift Array. The project code is simple enough that you can try NSMutableArray with a single line change to compare the performance.

Open ArrayViewController.swift file and change line 27 from:

let arrayManipulator: ArrayManipulator = SwiftArrayManipulator()


let arrayManipulator: ArrayManipulator = NSArrayManipulator()

Build and run again, and press Create Array and Test to test the creation of an NSMutableArray with 1000 items:

NSArray 1,000 Items

Wow, it’s much faster to create an array this way! You’re probably thinking, “Okay, this is clearly much better performance than Swift — why on earth does its name imply speed? More importantly, how fast can this go?”

Slide that slider as far to the right as it will go — to 10,000,000 items — and press Create Array and Test again. Creating an array with 10 *million* items with a Foundation structure is more than three times faster than creating an array with around 100,000 in Swift. Yikes!

NSArray 10M items

However, you only create an array once then you perform other operations on it, like finding, adding, or removing objects.

In more exhaustive testing, like when you’re using some of iOS 8’s performance testing methods to call each of these methods 50 times, patterns emerge:

As you’ve seen, creating an array is much, much faster with Foundation classes than it is with Swift. The time creating NSArray objects grows at about an O(n) rate, while Swift arrays grow between O(n) and O(n²).

The fact that Swift arrays start slower and grow faster means they’re really slow when you create large arrays.

Adding and removing objects to and from an array isn’t quite as big of a difference — while Swift arrays are still slower than NSArray, they don’t grow nearly as quickly. Generally, they grow between O(log n) and O(n).

Looking up items within an array is where Swift justifies its name. Lookups up by index and object grow at very close rates for both Swift arrays and NSArray. When you’re looking up by object, large Swift arrays perform 50 times faster than using comparable NSMutableArray objects.

That’s pretty darn fast!

The types in Swift arrays certainly help for lookups by object; these operate without the overhead an array that could contain any object.

On the other hand, NSArray first has to determine the type of each object, then figure out if it’s equal to the object you seek.

Note: The results of running each method 50 times against the Gold Master versions of Xcode 6 and iOS 8 on an iPhone 5 are available in this iPhone 5 Performance Tests spreadsheet. You can also run the tests included with the application to see how they run on a 10-runs-per-test average, and how they run on your phone.



Dictionaries are a way of storing values that don’t need to be in any particular order and are uniquely associated with keys. You use the key to store or look up a value.

Dictionaries also use subscripting syntax, so when you write dictionary["hello"], you’ll get the value associated with the key hello.

Like arrays, Swift dictionaries are constant if you declare them with let and mutable if you declare them with var. Similarly on the Foundation side, there are both NSDictionary and NSMutableDictionary classes for you to use.

Another characteristic that is similar to Swift arrays is that dictionaries are strongly typed, and you must have known key and value types. NSDictionary objects are able to take any NSObject as a key and store any object as a value.

You’ll see this in action when you call a Cocoa API that takes or returns an NSDictionary. From Swift, this type appears as [NSObject: AnyObject]. This indicates that the key must be an NSObject subclass, and the value can be any Swift-compatible object.

When to use Dictionaries


My cat knows when you use dictionaries. Do you?

Dictionaries are best used when there isn’t a particular order to what you need to store, but there is a meaningful association to what you need to store.

To help you examine how dictionaries and other data structures in the rest of this tutorial work, create a Playground by going to File…\New\Playground, and name it DataStructures.

For example, pretend you need to store a data structure of all your friends and their cats’s names, so you can lookup the cat’s name by your friend’s name. This way, you don’t have to remember the cat’s name to stay in your friends’ good graces.

First, you’d want to store the dictionary of people and cats:

let cats = [ "Ellen" : "Chaplin", "Lilia" : "George Michael", "Rose" : "Friend", "Bettina" : "Pai Mei"]

Thanks to Swift type inference, this will be defined as `[String: String]` – a dictionary with string keys and string values.

Now try to access items within it:

cats["Ellen"] //returns Chaplin as an optional
cats["Steve"] //Returns nil

Note that subscripting syntax on dictionaries returns an optional. If the dictionary doesn’t contain a value for a particular key, the optional is nil; if it does contain a value for that key you could get the wrapped value.

Because of that, it’s a good idea to use the if let optional-unwrapping syntax to access values in a dictionary:

if let ellensCat = cats["Ellen"] {
	println("Ellen's cat is named \(ellensCat).")
} else {
	println("Ellen's cat's name not found!")

Since there is a value for the key “Ellen”, this will print out “Ellen’s cat is named Chaplin” in your Playground.

Expected Performance

Once again, Apple outlines the expected performance of dictionaries in Cocoa in the CFDictionary.h header file:

  1. The performance degradation of getting a single value is guaranteed to be at worst O(n), but will often be O(1).
  2. Insertion and deletion can be as bad as O(n²), but will much more typically be closer to O(1) because of under-the-hood optimizations.

These aren’t quite as obvious as the array degradations. Due to the more complex nature of storing keys and values versus an ordered list of stuff, the effects can be less predictable.

Sample App Testing Results

DictionaryManipulator is a similar protocol to ArrayManipulator, and it tests dictionaries. With it, you can easily test the same operation using an NSMutableDictionary or a Swift Dictionary.

To compare the Swift and Cocoa dictionaries, you use a similar procedure as you used for the arrays. Build and run the app, and select the Dictionary tab at the bottom. Run a few tests and make a note of the numbers.

Back in Xcode, open DictionaryViewController.swift and find the dictionaryManipulator property:

let dictionaryManipulator: DictionaryManipulator = SwiftDictionaryManipulator()

Replace it with the following:

let dictionaryManipulator: DictionaryManipulator = NSDictionaryManipulator()

Now the app will use NSDictionary under the hood. Build and run the app again, and run a few more tests.

Sad Swift

Once again, creating a Swift Dictionary is slow and more prone to degradation than an NSDictionary. When you create an NSDictionary it degrades at O(n). Swift dictionaries degrade between O(n) and O(n²).

Swift’s performance here doesn’t live up to its name. On an iPhone 5, creating a 25,000-item Swift Dictionary took an average of over four seconds, vs 1/50th of a second for an NSMutableDictionary.

Adding and removing any number of objects in an NSDictionary is highly efficient, and it floats right around the O(1) best-case scenario promised by Apple’s documentation. For Swift dictionaries, adding and removing degradations for any number of objects lands between O(log n) and O(n).

Looking up objects in Swift dictionaries isn’t quite as dramatic an improvement as looking up objects in Swift arrays, but it’s still at least twice as fast to look up an object in a Swift dictionary as in an NSMutableDictionary. The degradation for both lookups is close to O(1).



A Set is a data structure that stores unordered, unique values. When you try to add the same item to a set more than once, the item will not be added. “Sameness” here is determined by isEqual().

Sets are also the one major types of data structure in iOS that does not (yet) have a native Swift equivalent to its Foundation API. Therefore, you’ll only be looking at NSSet.

When to use NSSets

Sets are most useful when uniqueness matters, but order doesn’t. For example, what if you wanted to select four random names out of an array of eight names, with no duplicates?

Enter the following into your Playground:

let names = ["John", "Paul", "George", "Ringo", "Mick", "Keith", "Charlie", "Ronnie"]
let mutableSet = NSMutableSet() // 1
var loopsCount = 0 
while mutableSet.count < 4 {
  let randomNumber = arc4random_uniform(UInt32(countElements(names))) //2
  let randomName = names[Int(randomNumber)] //3
  println(randomName) //4
  mutableSet.addObject(randomName) //5
  ++loopsCount //6
println("Loops: " + loopsCount.description + ", Mutable set contents: " + mutableSet.description)

In this little code snippet, you’re doing the following:

  1. Initialize the set, so you can add objects to it — note that that you must again use a mutable variant of NSSet to be able to add or remove items.
  2. Pick a random number between 0 and the count of names.
  3. Grab the name at the selected index.
  4. Log the selected name to the console.
  5. Add the selected name to the mutable set. Remember, if the name is already in the set the set will stay unchanged since it doesn’t store duplicates.
  6. Increment the loop counter so you can see how many times the loop ran.
  7. Once the loop finishes, print out the loop counter and the contents of the mutable set.

Since this example uses a random number generator, you’ll get a different result every time. Here’s an example of the log one of the times it ran while writing this tutorial:

Loops: 6, Mutable set contents: {(

Here, the loop ran six times in order to get four unique names. It selected Ronnie and Ringo twice, but only wound up in the set once.

As you’re writing the loop in the Playground, you’ll notice that it runs over and over and over again, and you’ll get a different number of loops each time. It’ll always be at least four loops, since there must always be four items in the set to break out of the loop.

Now that you’ve seen NSSet at work on a small scale, it’s time to examine their performance with a larger batch.

Sample App Testing Results

Apple didn’t outline expectations for set performance as they did for dictionaries and arrays, so in this case you’ll just look at real-world performance. In addition, since there’s no Swift class to compare NSMutableSets to, the test results are quite a bit simpler.

If you’re looking for pure speed, an NSMutableSet probably won’t make you happy. Compare the numbers on NSMutableSet to the numbers for NSMutableArray, and you’ll see set creation is considerably slower — that’s the price you pay for checking if every single item in a data structure is unique.

Creation degrades at a rate of O(n), but operations like adding, removing or looking up increase at O(1). Now you realize the benefits of using a set, since doing anything afterwards is faster than with arrays or dictionaries.

NSMutableSet lookup is considerably faster at looking up objects than NSMutableArray. Sets use hashes to check for equality, and the hashes can be calculated and stored in sorted order. This makes set lookup much faster than array lookup.

Lesser-known Foundation Data Structures

Arrays, dictionaries and sets are the workhorses of data-handling. However, Cocoa offers a number of lesser-known and perhaps under-appreciated collection types. If a dictionary, array or set won’t do the job, it’s worth checking if one of these will work before you create something from scratch.


Using NSCache is very similar to using NSMutableDictionary – you just add and retrieve objects by key. The difference is that NSCache is designed for temporary storage for things that you can always recalculate or regenerate. If available memory gets low, NSCache might remove some objects. They are thread-safe, but Apple’s documentation warns:

…The cache may decide to automatically mutate itself asynchronously behind the scenes if it is called to free up memory.

Basically, this means that an NSCache is like an NSMutableDictionary, except that an object can be automatically removed from another thread at any time without your code doing anything. This is good for managing how much memory the cache uses, but can cause issues if you’re trying to use an object that suddenly disappears.

NSCache also stores weak references to keys rather than strong references.


NSCountedSet tracks how many times you’ve added an object to a mutable set. It inherits from NSMutableSet, so if you try to add the same object again, it’s only in the set once.

However, in an NSCountedSet, the set tracks how many times that object was added. You can see how many times an object was added with countForObject().

Note that when you call count on an NSCountedSet, it only returns the count of unique objects, not the number of times all objects were added to the set.

For instance, in your Playground, take the array of names you created in your earlier NSMutableSet testing, and add each one to an NSCountedSet twice:

let countedMutable = NSCountedSet()
for name in names {

Then, log out of the set itself and find out how many times “Ringo” was added:

let ringos = countedMutable.countForObject("Ringo")
println("Counted Mutable set: \(countedMutable)) with count for Ringo: \(ringos)")

Your log should read:

Counted Mutable set: {(
)}) with count for Ringo: 2

Note that while you may see a different order for the set, you should only see “Ringo” appear in the list of names once, even though you can see that it was added twice.


An NSOrderedSet along with its mutable counterpart, NSMutableOrderedSet, is a data structure that allows you to store a group of distinct objects in a specific order.

“Specific order” — gee, that sounds an awful lot like an array, doesn’t it? Apple succinctly sums up why you’d want to use an NSOrderedSet instead of an array (emphasis mine):

You can use ordered sets as an alternative to arrays when element order matters and performance while testing whether an object is contained in the set is a consideration — testing for membership of an array is slower than testing for membership of a set.

Because of this, the ideal time to use an NSOrderedSet is when you need to store an ordered collection of objects that cannot contain duplicates.

Note: While NSCountedSet inherits from NSMutableSet, NSOrderedSet inherits from NSObject. This is a great example of how Apple names classes based on what they do, but not necessarily how they work under the hood.

NSHashTable and NSMapTable

Map table?

Not this kind of Map Table. (Courtesy the Tennessee Valley Authority(!) via Flickr Creative Commons)

NSHashTable is another data structure that is similar to a set, but with a few key differences from NSMutableSet.

You can set up an NSHashTable using any arbitrary pointers and not just objects, so you can add structures and other non-object items to an NSHashTable. You can also set memory management and equality comparison terms explicitly using NSHashTableOptions enum.

NSMapTable is a dictionary-like data structure, but with similar behaviors to NSHashTable when it comes to memory management.

Like an NSCache, an NSMapTable can hold weak references to keys. However, it can also remove the object related to that key automatically whenever the key is deallocated. These options can be set from the NSMapTableOptions enum.


An NSIndexSet is an immutable collection of unique unsigned integers intended to represent indexes of an array.

If you have an NSArray of ten items where you regularly need to access items’ specific positions, you can store an NSIndexSet and use NSArray’s objectsAtIndexes: to pull those objects directly:

let items : NSArray = ["one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten"]
let indexSet = NSMutableIndexSet()
items.objectsAtIndexes(indexSet) // returns ["four", "nine", "ten"]

You specify that “items” is an NSArray since right now, Swift arrays don’t have an equivalent way to access multiple items using an NSIndexSet or a Swift equivalent.

An NSIndexSet retains the behavior of NSSet that only allows a value to appear once. Hence, you shouldn’t use it to store an arbitrary list of integers unless only a single appearance is a requirement.

With an NSIndexSet, you’re storing indexes as sorted ranges, so it is more efficient than storing an array of integers.

Pop Quiz!

Now that you’ve made it this far, you can test your memory with a quick quiz about what sort of structure you might want to use to store various types of data.

For the purposes of this quiz, assume you have an application where you display information in a library.

Q: What would you use to create a list of every author in the library?

Solution Inside: Unique List of Author Names SelectShow

A Set! It automatically removes duplicate names, which means you can enter every single author’s name as many times as you want, but you’ll still have only a single entry — unless you mistype an author’s name, doh!

Once you create the Set, you can use set.allObjects to access the Array of unique names, and then perform operations that depend on order such as sorting.

Q: How would you store the alphabetically sorted titles of a prolific author’s entire body of work?

Solution Inside: Alphabetically-Sorted Titles SelectShow

An Array! Since you’re in a situation where you have a number of similar objects (all titles are Strings) and their order matters (titles must be sorted alphabetically) this is an ideal case for an Array.

Q: How would you store the most popular book by each author?

Solution Inside: Most Popular Book by each author SelectShow

A Dictionary! If you use the author’s name as the key and the title of that author’s most popular book as the value, you can access the most popular book by any author like this:

mostPopularBooks["Gillian Flynn"] //Returns "Gone Girl"

Where to Go From Here?

I’d like to give special thanks to my fellow tutorial team member Chris Wagner, who started on an Objective-C version of this article before the SwiftHammer came down upon us all, for passing along his notes and sample project for me to use while pulling this tutorial together.

I’ll also say thanks to the Swift team at Apple — despite the fact that there’s still considerable room for performance improvement in the native data structures, I had an awful lot of fun writing and testing in Swift. I may continue to use the Foundation data structures for a while, but I’m looking forward to porting a few little toys over to Swift. :]

Anyway, if you want to learn more about data structures for iOS, here are a few excellent resources:

If you’d like to dissect the numbers presented in this article further, you can download the numbers spreadsheet used to track all the testing runs and analyze the data yourself.

Got more questions or want to dissect these numbers further? Go nuts in the comments below!

Collection Data Structures In Swift is a post from: Ray Wenderlich

The post Collection Data Structures In Swift appeared first on Ray Wenderlich.



Write a comment