How To Make An Augmented Reality Target Shooter Game With OpenCV: Part 1/4


Make the world your virtual target range in this Augmented Reality tutorial

Looking to blow up your neighbor’s car with an errant cruise missile without the hassle of court dates? Interested in seeing how new furniture would look in your house without having to buy it first? Ever wanted to look at the stars and have the constellations mapped out for you with real-time annotations?

These are all real examples of the type of apps that you can create with Augmented Reality. Augmented Reality (AR) is an exciting technology that blends, or augments, a real-time video stream with computer-generated sensory inputs such as sound, graphics or geolocation information.

Some of the most engaging mobile apps today use Augmented Reality, such as Action FX, SnapShop Showroom and Star Chart. They’ve all been huge hits in their own right and new technologies like Google Glass continue to expand the possibilities of AR in the future.

This tutorial showcases the AR capabilities of iOS in a fun and entertaining AR Target Shooter game. You’ll be using the popular OpenCV computer vision library as the foundation of your app, but you won’t need anything more than a basic familiarity with UIKit, CoreGraphics and some elementary C++ programming.

In this tutorial, you’ll learn how to:

  • Blend native C++ code into your iOS projects for added performance.
  • Use OpenCV to incorporate computer vision and pattern recognition into your iOS apps.
  • Use sprites, sounds and animations to give your iOS games a little extra sizzle.
Note: Although it’s possible to run this tutorial on the simulator, the most salient features of your game — those involving computer vision and pattern recognition — are only available on a physical iOS device with a rear-facing camera.

Ready to add the next level of interaction to your iOS apps? Then it’s time to get started!

Getting Started

Download the starter project for this tutorial and extract it to a convenient location.

The first thing you’ll need to do is to integrate the OpenCV SDK with the starter project.

The easiest way to do this is to use CocoaPods, a popular dependency management tool for iOS.

Note: CocoaPods is designed to be simple and easy to use, but if you want a full tour of all the features available in CocoaPods, check out the Introduction to CocoaPods tutorial on this site. Also, check out our latest tech talk on CocoaPods by team members Cesare Rocchi and Orta Therox. Orta is actually a developer on the CocoaPods team and fields a lot of interesting questions about the tool.

CocoaPods is distributed as a Ruby gem, which means installing it is pretty straightforward. Open up a Terminal window, type in the following command and hit Return:

$ [sudo] gem install cocoapods

You may have to wait for a few moments while the system installs the necessary components.

Once the command has completed and you’ve been returned to the command prompt, type in the following command:

$ pod setup

That’s all there is to installing CocoaPods. Now you’ll need a Podfile to integrate OpenCV with your project.

Using Terminal, cd to the top level directory of the starter project; this is the directory where OpenCVTutorial.xcodeproj lives.

To verify this, type ls and hit Return at the command prompt; you should see some output as shown below:

$ ls
OpenCVTutorial			OpenCVTutorial.xcodeproj

Fire up your favorite text editor and create a file named Podfile in this directory.

Add the following line of code to Podfile:

pod 'OpenCV'

Save Podfile and exit back to the command shell. Then type the following command and hit Return:

$ pod install

After a few moments, you should see some log statements indicating that the necessary dependencies have been analyzed and downloaded and that OpenCV is installed and ready to use.

Note: If you run pod install and receive the following text: “Pull is not possible because you have unmerged files.” – do not panic. Check out this helpful blog post from the CocoaPods team and follow their instructions.

An example shell session is indicated below:


Type ls again at the command prompt; you should see the list of files below:

$ ls
OpenCVTutorial			Podfile
OpenCVTutorial.xcodeproj	Podfile.lock
OpenCVTutorial.xcworkspace	Pods

OpenCVTutorial.xcworkspace is a new file — what does it do?

Xcode provides workspaces as organizational tools to help you manage multiple, interdependent projects. Each project in a workspace has its own separate identity — even while sharing common libraries with other projects across the entire workspace.

Note: From now on, do as the warning in the screenshot above indicates: always work in the workspace (OpenCVTutorial.xcworkspace), and not in the original project. Our CocoaPods tech talk covers why you should do this and what happens when you don’t. Are you convinced yet that you should watch it? You can watch it over here ;].

Open OpenCVTutorial.xcworkspace in Xcode by double-clicking on the file in the Finder.

Note: If you’re a command-line junkie, you can also open the workspace via the following shell command:

$ open OpenCVTutorial.xcworkspace

Once you have the workspace open in Xcode, take a look at the Navigator. You’ll see the following two projects which are now part of the workspace:


The first project — OpenCVTutorial — is the original starter project that you just downloaded.

The second project — Pods — contains the OpenCV SDK; CocoaPods added this to the workspace for you.

The OpenCV CocoaPod takes care of linking most of the iOS Frameworks required for working with Augmented Reality, including AVFoundation, Accelerate, AssetsLibrary, CoreImage, CoreMedia, CoreVideo and QuartzCore.

However, this list doesn’t contain any Frameworks that support sound effects. You’ll add these yourself.

Add the AudioToolbox Framework to the OpenCVTutorial project as follows:

  1. Click on the OpenCVTutorial project in the Navigator pane.
  2. Click on the OpenCVTutorial target in the Editor pane of Xcode.
  3. Select the Build Phases tab from the menu at the top and click on the Link Binary With Libraries menu item.

You should see the following panel:


Click on the + icon at the bottom of the list and add the AudioToolbox Framework to the OpenCVTutorial project.

Your Link Binary With Libraries menu item should now look similar to the following:


Open OpenCVTutorial-Prefix.pch; you can find it under the Supporting Files group in the Navigator, like so:


Add the following code to OpenCVTutorial-Prefix.pch, just above the line that reads #ifdef __OBJC__:

#ifdef __cplusplus
#include <opencv2/opencv.hpp>
#endif

The *.pch extension indicates that this is a prefix header file.

Prefix header files are a feature of many C and C++ compilers, including Clang, the compiler that Xcode uses under the hood.

Any instructions or definitions found in the prefix header file will be included automatically by Xcode at the start of every source file. By adding the above preprocessor directive to OpenCVTutorial-Prefix.pch, you’ve instructed Xcode to add the header file to the top of every C++ file in your project.
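
Putting it all together, the top of OpenCVTutorial-Prefix.pch should now look roughly like the sketch below. The Objective-C imports shown here are just the standard Xcode template; the exact contents of your starter file may differ slightly.

// Sketch of OpenCVTutorial-Prefix.pch after the edit; your starter file may differ slightly.
#ifdef __cplusplus
#include <opencv2/opencv.hpp>
#endif

#ifdef __OBJC__
#import <UIKit/UIKit.h>
#import <Foundation/Foundation.h>
#endif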

When working with C++ in Objective-C files, you must change the source filename extension from *.m to *.mm for those files. The *.mm extension tells Xcode to compile the file as Objective-C++, which lets you mix Objective-C and C++ freely, while *.m compiles the file as plain Objective-C.
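
For example, here is a minimal, hypothetical Objective-C++ file that mixes the two languages. None of this code belongs to the starter project; it simply illustrates why the *.mm extension matters:

// Hypothetical example -- ImageStats.mm must be compiled as Objective-C++.
#import <Foundation/Foundation.h>
#include <opencv2/opencv.hpp>   // in this project, the prefix header already pulls this in

@interface ImageStats : NSObject
- (void)logChannelCount;
@end

@implementation ImageStats
- (void)logChannelCount {
    cv::Mat image = cv::Mat::zeros(480, 640, CV_8UC4);     // a C++ OpenCV object
    NSLog(@"The image has %d channels", image.channels()); // an Objective-C call
}
@end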

Note: OpenCV was launched in 1999 as part of an Intel Research initiative to advance CPU-intensive applications. A major update to the library, named OpenCV2, was subsequently released in October 2009. You’ll use OpenCV2 in this tutorial, which is why the header path in the #include directive points to opencv2.

Build your project at this point to make sure that everything compiles without any errors.

Working with AVFoundation

The first component you’ll add to your game is a “real-time” live video feed.

To do this, you’ll need a VideoSource object that can retrieve raw video data from the rear-facing camera and forward it on to OpenCV for processing.

If you haven’t worked much with Apple’s AVFoundation Framework, one way to conceptualize it is like a giant, yummy sandwich:


  • On the top layer of the sandwich are Input Ports which take in raw data from the various A/V hardware components on your iOS device. These components might include the front- or rear-facing cameras, or the microphone.
  • On the bottom layer of the sandwich are Output Ports, which are delegates interested in using the A/V data captured by the Input Ports. For example, you might configure an Output Port to capture data that arrives from the camera and save it to disk as a Quicktime movie.
  • In between these two layers is the all-important, magical AVCaptureSession, which coordinates and shuffles all the data between the Input and Output Ports. A short conceptual sketch of how these layers map onto AVFoundation classes follows this list.
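
To make the sandwich metaphor concrete, here is a rough conceptual sketch of how the three layers map onto AVFoundation classes. It’s illustrative only; you’ll build the real, error-checked version piece by piece in the sections that follow.

// Conceptual sketch only -- the real implementation comes later in this tutorial.
AVCaptureSession * session = [[AVCaptureSession alloc] init];

// Top layer: an input port wrapping an A/V hardware device (here, the default camera)
AVCaptureDevice * camera = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
AVCaptureDeviceInput * input = [AVCaptureDeviceInput deviceInputWithDevice:camera error:nil];
[session addInput:input];

// Bottom layer: an output port that hands captured video buffers to a delegate
AVCaptureVideoDataOutput * output = [[AVCaptureVideoDataOutput alloc] init];
[session addOutput:output];

// Middle layer: the session shuffles data from the input ports to the output ports
[session startRunning];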

There’s a partially implemented VideoSource object included in your starter project to help get you started. For the remainder of Part 1 of this tutorial, you’ll complete the implementation of VideoSource and get your live video feed up and running.

Designing Your Video Stream

Start by opening up VideoSource.h, found under the Video Source group.

If you’re having trouble finding the file, you can try typing VideoSource directly into the Filter bar at the bottom of the Navigator, as shown below:


Only the files whose names match the string you are typing in the Filter bar will be displayed in the Navigator.

Note: Filtering directories like this can help you quickly navigate to a specific file in a large Xcode project. Just be sure to press the x icon at the right side of the Filter bar once you’re done. Otherwise, the Navigator listing will remain filtered, which usually isn’t the behavior you want!

Open up VideoSource.h and have a look through the file:

#import <AVFoundation/AVFoundation.h>
#import "VideoFrame.h"

#pragma mark -
#pragma mark VideoSource Delegate
@protocol VideoSourceDelegate <NSObject>

- (void)frameReady:(VideoFrame)frame;

@end

#pragma mark -
#pragma mark VideoSource Interface
@interface VideoSource : NSObject

@property (nonatomic, strong) AVCaptureSession * captureSession;
@property (nonatomic, strong) AVCaptureDeviceInput * deviceInput;
@property (nonatomic, weak) id<VideoSourceDelegate> delegate;

- (BOOL)startWithDevicePosition:(AVCaptureDevicePosition)devicePosition;

@end

There are a few things in this file that are worth mentioning:

  • VideoSource declares a strong property named captureSession.

    captureSession is an instance of AVCaptureSession which is described above; its purpose is to coordinate the flow of video data between the rear-facing camera on your iOS device, the output ports that you’re going to configure below, and ultimately OpenCV.

  • VideoSource declares a strong property named deviceInput.

    deviceInput is an instance of AVCaptureDeviceInput and acts as an input port that can attach to the various A/V hardware components on your iOS device. In the next section, you’re going to associate this property with the rear-facing camera and add it as an input port for captureSession.

  • The header file also declares a protocol named VideoSourceDelegate.

    This protocol is the “glue” between the output ports for captureSession and OpenCV. Whenever one of your output ports is ready to dispatch a new video frame to OpenCV, it will invoke the frameReady: callback on the delegate member of VideoSource.

Open VideoFrame.h and have a look through it as well:

#ifndef OpenCVTutorial_VideoFrame_h
#define OpenCVTutorial_VideoFrame_h

#include <cstddef>

struct VideoFrame
{
    size_t width;
    size_t height;
    size_t stride;
    unsigned char * data;
};

#endif

This file declares a simple C-struct that you’ll use to hold your video frame data.

Take the example where you’re capturing video at the standard VGA resolution of 640×480 pixels:

  • The value of width will be 640.
  • The value of height will be 480.
  • The value of stride will be given by the number of bytes per row.

In this example, frames are 640 pixels wide and each pixel is represented by 4 bytes, so the value of stride will be 640 × 4 = 2560.

The data attribute, of course, is simply a pointer to the actual video data.
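
Because each row of the buffer may be padded, you should always step through the data using stride rather than width. The helper below is purely illustrative (it isn’t part of the starter project) and assumes the 32-bit BGRA pixel format you’ll configure later in this tutorial:

// Illustrative helper -- not part of the starter project.
// Returns a pointer to the 4-byte BGRA pixel at column x, row y.
static inline unsigned char * PixelAt(struct VideoFrame frame, size_t x, size_t y) {
    return frame.data + (y * frame.stride) + (x * 4);   // stride bytes per row, 4 bytes per pixel
}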

Building Your Video Stream

You’ve been patient — it’s now time to start writing some code!

Open the VideoSource implementation file and replace the stubbed-out implementation of init with the code below:

- (id)init {
    self = [super init];
    if ( self ) {
        AVCaptureSession * captureSession = [[AVCaptureSession alloc] init];
        if ( [captureSession canSetSessionPreset:AVCaptureSessionPreset640x480] ) {
            [captureSession setSessionPreset:AVCaptureSessionPreset640x480];
            NSLog(@"Capturing video at 640x480");
        } else {
            NSLog(@"Could not configure AVCaptureSession video input");
        }
        _captureSession = captureSession;
    }
    return self;
}

Here the constructor simply creates a new instance of AVCaptureSession and configures it to accept video input at the standard VGA resolution of 640×480 pixels.

If the device is unable to accept video input at this resolution, an error is logged to the console.

Next, add the following definition for dealloc to the same file, just after init:

- (void)dealloc {
    [_captureSession stopRunning];
}

When an instance of the VideoSource class is deallocated, it’s a good idea to stop captureSession as well.

Replace the stubbed-out cameraWithPosition: in the same file with the following code:

- (AVCaptureDevice*)cameraWithPosition:(AVCaptureDevicePosition)position {
    NSArray * devices = [AVCaptureDevice devicesWithMediaType:AVMediaTypeVideo];
    for ( AVCaptureDevice * device in devices ) {
        if ( [device position] == position ) {
            return device;
        }
    }
    return nil;
}

Most iOS devices these days ship with both front- and rear-facing cameras.

For today’s AR shooter game, you’re only going to be interested in the rear-facing camera.

Nevertheless, it’s a good practice to write code that is general enough to be reused in different ways. cameraWithPosition: is a private helper method that lets the caller obtain a reference to the camera device located at the specified position.

You’ll pass in AVCaptureDevicePositionBack to obtain a reference to the rear-facing camera.

If no camera device is found at the specified position, the method returns nil.


Starting the Video Stream

The next thing you’ll need to do is implement the public interface for VideoSource.

Replace the stubbed-out implementation of startWithDevicePosition: in the same file with the following code:

- (BOOL)startWithDevicePosition:(AVCaptureDevicePosition)devicePosition {
    // (1) Find camera device at the specified position
    AVCaptureDevice * videoDevice = [self cameraWithPosition:devicePosition];
    if ( !videoDevice ) {
        NSLog(@"Could not initialize camera at position %d", devicePosition);
        return FALSE;
    }

    // (2) Obtain input port for camera device
    NSError * error;
    AVCaptureDeviceInput *videoInput = [AVCaptureDeviceInput deviceInputWithDevice:videoDevice error:&error];
    if ( !error ) {
        [self setDeviceInput:videoInput];
    } else {
        NSLog(@"Could not open input port for device %@ (%@)", videoDevice, [error localizedDescription]);
        return FALSE;
    }

    // (3) Configure input port for captureSession
    if ( [self.captureSession canAddInput:videoInput] ) {
        [self.captureSession addInput:videoInput];
    } else {
        NSLog(@"Could not add input port to capture session %@", self.captureSession);
        return FALSE;
    }

    // (4) Configure output port for captureSession
    [self addVideoDataOutput];

    // (5) Start captureSession running
    [self.captureSession startRunning];

    return TRUE;
}

Here’s what’s going on in this method:

  1. You call the helper method cameraWithPosition: defined above. The call returns with a reference to the camera device located at the specified position and you save this reference in videoDevice.
  2. You then call the class method deviceInputWithDevice:error: defined on AVCaptureDeviceInput and pass in videoDevice as the argument. The method configures and returns an input port for videoDevice, which you save in a local variable named videoInput. If the port can’t be configured, you log an error and return.
  3. You next add videoInput to the list of input ports for captureSession and log an error if anything fails with this operation.
  4. The call to addVideoDataOutput configures the output ports for captureSession. This method is as yet undefined. You’re going to implement it in the next section.
  5. startRunning starts the capture session; once it’s running, video data flows from the camera through captureSession and is delivered asynchronously to its output ports.

Dispatching to Queues

It’s worth taking a moment to review the finer points of concurrency and multithreading.

Grand Central Dispatch, or GCD, was first introduced in iOS 4 and has become the de facto way to manage concurrency on iOS. Using GCD, developers submit tasks to dispatch queues in the form of code blocks, which are then run on a thread pool managed by GCD. This frees the developer from managing multiple threads by hand and all the requisite headaches that go along with that! :]

GCD dispatch queues come in three basic flavors:

  • Serial Dispatch Queue – Tasks are dequeued in first-in, first-out (FIFO) order and run one at a time.
  • Main Dispatch Queue – Tasks are dequeued in FIFO order, run one at a time and are guaranteed to run only on the application’s main thread.
  • Global Dispatch Queue – Tasks are dequeued in FIFO order, but they run concurrently with respect to one another. So although they start in their queued order, they may finish in any order.

While the serial queue may seem like the easiest queue to understand, it’s not necessarily the one iOS developers use most frequently.

The main thread handles the UI of your application; it’s the only thread permitted to call many of the crucial methods in UIKit. As a result, it’s quite common to dispatch a block to the main queue from a background thread to signal the UI that some non-UI background process has completed, such as a long-running computation or a network request.

Global dispatch queues — also known as concurrent dispatch queues — aren’t typically used for signalling. These queues are most frequently used by iOS developers for structuring the concurrent execution of background tasks.

In this portion of the project, however, you need a serial dispatch queue: it lets you process video frames on a background thread without blocking the UI, while still ensuring that each frame is handled in the order it was received.
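
As a quick, self-contained illustration of these ideas (the queue label and the work shown here are placeholders, not code from this project), a background task on a serial queue might signal the main queue like this:

// Illustrative only -- not part of the starter project.
dispatch_queue_t workQueue = dispatch_queue_create("com.example.serialwork", DISPATCH_QUEUE_SERIAL);

dispatch_async(workQueue, ^{
    // Long-running background work runs here, one block at a time, in FIFO order.
    dispatch_async(dispatch_get_main_queue(), ^{
        // Back on the main queue: safe to update UIKit once the work is done.
    });
});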

Note: If you’d like a more thorough introduction to Grand Central Dispatch, dispatch queues, and concurrency in general, take a look at the Multithreading and Grand Central Dispatch on iOS for Beginners Tutorial on this site. The section on dispatch queues in Apple’s own Concurrency Programming Guide is helpful and worth a read.

In this section you’re going to use GCD to configure the output ports for captureSession. These output ports format the raw video buffers as they’re captured from the camera and forward them on to OpenCV for further processing.

Before moving forward, you will need to inform the compiler that the VideoSource class adheres to the AVCaptureVideoDataOutputSampleBufferDelegate protocol. This formidable-sounding protocol declares the delegate methods for captureSession. You’ll learn more about this protocol further along in the tutorial, but for now, it’s all about making the compiler happy.

Update the class extension at the top of the VideoSource implementation file as follows:

@interface VideoSource () <AVCaptureVideoDataOutputSampleBufferDelegate>

Next, replace the stubbed-out addVideoDataOutput in the same file with the following code:

- (void) addVideoDataOutput {
    // (1) Instantiate a new video data output object
    AVCaptureVideoDataOutput * captureOutput = [[AVCaptureVideoDataOutput alloc] init];
    captureOutput.alwaysDiscardsLateVideoFrames = YES;

    // (2) The sample buffer delegate requires a serial dispatch queue
    dispatch_queue_t queue;
    queue = dispatch_queue_create("com.raywenderlich.tutorials.opencv", DISPATCH_QUEUE_SERIAL);
    [captureOutput setSampleBufferDelegate:self queue:queue];
    dispatch_release(queue);

    // (3) Define the pixel format for the video data output
    NSString * key = (NSString*)kCVPixelBufferPixelFormatTypeKey;
    NSNumber * value = [NSNumber numberWithUnsignedInt:kCVPixelFormatType_32BGRA];
    NSDictionary * settings = @{key:value};
    [captureOutput setVideoSettings:settings];

    // (4) Configure the output port on the captureSession property
    [self.captureSession addOutput:captureOutput];
}

Taking each numbered comment in turn:

  1. You create a new instance of AVCaptureVideoDataOutput named captureOutput; setting alwaysDiscardsLateVideoFrames to YES gives you improved performance at the risk of occasionally losing late frames.
  2. You create a new serial dispatch queue on which you invoke callbacks whenever captureSession is ready to vend a new video buffer. The first parameter to dispatch_queue_create() identifies the queue and can be used by tools such as Instruments. The second parameter indicates you wish to create a serial, rather than a concurrent, queue; in fact, DISPATCH_QUEUE_SERIAL is #defined as NULL and it’s common to see serial dispatch queues created simply by passing in NULL as the second parameter. You also call dispatch_release(), since the starter project has a deployment target of iOS 5.
  3. The 32-bit BGRA pixel format is widely supported across the entire iOS device family and works well with both Core Graphics and OpenGL. It’s the natural choice to use as the pixel format for the video output.
  4. You then add captureOutput to the list of output ports for captureSession.
Note: Serial dispatch queues are reference-counted objects. In iOS 6 and later, dispatch queues are Objective-C objects and are retained and released automatically if you use ARC. In earlier versions of iOS, dispatch queues were plain C objects and reference counting had to be handled manually using dispatch_retain and dispatch_release.
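
If you ever need a single code path that builds cleanly on both sides of that transition, one common approach is to guard the manual release with the OS_OBJECT_USE_OBJC macro. This is a sketch of the idea, not something the starter project requires:

// Sketch: only release the queue manually when dispatch objects are not ARC-managed.
#if !OS_OBJECT_USE_OBJC
    dispatch_release(queue);
#endif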

Delegates and Finishing Touches

It’s time to implement the AVCaptureVideoDataOutputSampleBufferDelegate protocol. You will use this protocol to format the raw video buffers from the camera and dispatch them as video frames to OpenCV.

The protocol only declares the following two methods, both optional:

  • captureOutput:didOutputSampleBuffer:fromConnection: notifies the delegate that a video frame has successfully arrived from the camera. This is the method you’re going to be implementing below.
  • captureOutput:didDropSampleBuffer:fromConnection: notifies the delegate that a video frame has been dropped. You won’t need this method for today’s tutorial, although a minimal debugging stub is sketched just after this list.
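
That said, if you ever want to confirm whether frames are being dropped while debugging, an optional stub like the one below would do the trick. It isn’t required for this tutorial:

// Optional debugging aid -- not required for this tutorial.
- (void)captureOutput:(AVCaptureOutput *)captureOutput
  didDropSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection
{
    NSLog(@"Dropped a video frame");
}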

Add the following method definition to the very end of the VideoSource implementation:

#pragma mark -
#pragma mark Sample Buffer Delegate
- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection
{
    // (1) Convert CMSampleBufferRef to CVImageBufferRef
    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);

    // (2) Lock pixel buffer
    CVPixelBufferLockBaseAddress(imageBuffer, kCVPixelBufferLock_ReadOnly);

    // (3) Construct VideoFrame struct
    uint8_t *baseAddress = (uint8_t*)CVPixelBufferGetBaseAddress(imageBuffer);
    size_t width = CVPixelBufferGetWidth(imageBuffer);
    size_t height = CVPixelBufferGetHeight(imageBuffer);
    size_t stride = CVPixelBufferGetBytesPerRow(imageBuffer);
    VideoFrame frame = {width, height, stride, baseAddress};

    // (4) Dispatch VideoFrame to VideoSource delegate
    [self.delegate frameReady:frame];

    // (5) Unlock pixel buffer
    CVPixelBufferUnlockBaseAddress(imageBuffer, kCVPixelBufferLock_ReadOnly);
}

Looking at each numbered comment, you’ll see the following:

  1. The capture session dispatches video data to the delegate in the form of a CMSampleBuffer, which you convert to a pixel buffer of type CVImageBufferRef named imageBuffer.
  2. Next, you must lock the pixel buffer until you’re done using it since you’re working with pixel data on the CPU. As you’re not going to modify the buffer while you’re holding the lock, you invoke the method with kCVPixelBufferLock_ReadOnly for added performance.
  3. Construct a new video frame using the relevant data and attributes from the pixel buffer.
  4. Dispatch the newly-constructed video frame to the VideoSource delegate. This is the point where OpenCV picks up the frame for further processing.
  5. Release the lock on the pixel buffer.
Note: When accessing pixel data on the GPU instead of the CPU, locking the base address in this way is not necessary and may even impair performance. Frameworks that work with image data on the GPU, such as OpenGL ES and Core Image, fall into this category, but you aren’t using them in this tutorial.

Now you can turn to ViewController and implement the final delegate callback.

Add the following method definition to the bottom of the ViewController implementation:

#pragma mark -
#pragma mark VideoSource Delegate
- (void)frameReady:(VideoFrame)frame {
    __weak typeof(self) _weakSelf = self;
    dispatch_sync( dispatch_get_main_queue(), ^{
        // Construct CGContextRef from VideoFrame
        CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
        CGContextRef newContext = CGBitmapContextCreate(frame.data, frame.width, frame.height,
                                                        8, frame.stride, colorSpace,
                                                        kCGBitmapByteOrder32Little |
                                                        kCGImageAlphaPremultipliedFirst);

        // Construct CGImageRef from CGContextRef, then release the Core Graphics objects
        CGImageRef newImage = CGBitmapContextCreateImage(newContext);
        CGContextRelease(newContext);
        CGColorSpaceRelease(colorSpace);

        // Construct UIImage from CGImageRef and render it on the background image view
        UIImage * image = [UIImage imageWithCGImage:newImage];
        CGImageRelease(newImage);
        [[_weakSelf backgroundImageView] setImage:image];
    });
}

frameReady: is the callback method defined in VideoSourceDelegate. It takes a video frame, uses a set of straightforward but tedious Core Graphics calls to convert it into a UIImage instance, then renders it for on-screen display.

There are a couple things worth noting about the method:

  • Pretty much all UIKit operations need to be carried out on the main thread, and this includes constructing new UIImage objects from a Core Graphics context. For this reason, you use GCD to synchronously run the conversion code block back on the main thread.
  • backgroundImageView is an IBOutlet connected in MainStoryboard. Its type is UIImageView and, as the name suggests, it forms the background image view for the entire game. frameReady: runs every time VideoSource dispatches a new video frame, which should happen at least 20 to 30 times per second. Rendering a steady stream of UIImage objects on-screen creates the illusion of fluid video.

Again, as with VideoSource, it’s good form to declare which protocols the class conforms to in the class extension.

Update the class extension at the top of the ViewController implementation as follows:

@interface ViewController () <VideoSourceDelegate>

Finally, you need to instantiate a new VideoSource object before you can use it.

Update viewDidLoad in ViewController as follows:

- (void)viewDidLoad
{
    [super viewDidLoad];

    // Configure Video Source
    self.videoSource = [[VideoSource alloc] init];
    self.videoSource.delegate = self;
    [self.videoSource startWithDevicePosition:AVCaptureDevicePositionBack];
}

Here you use AVCaptureDevicePositionBack to capture video frames from the rear-facing camera.
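
If you ever want to experiment with the front-facing camera instead, the same call accepts the other position constant; this isn’t used anywhere in the game:

// Optional: capture from the front-facing camera instead (not used in this game).
[self.videoSource startWithDevicePosition:AVCaptureDevicePositionFront];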

Build and run your project; hold your device up and you’ll see how the virtual A/V “sandwich” has come together to give you “live video” on your device:


You’ve just taken one small step for man, and one giant leap for your AR Target Shooter game!

Where to Go From Here?

This concludes the first part of your AR Target Shooter game. You’ve assembled a tasty sandwich from the ingredients of the AVFoundation Framework, and you now have your “live” video feed.

You can download the completed project for this part as a zipped project file.

In Part 2 of this tutorial, you’ll add some HUD overlays to the live video, implement the basic game controls, and dress up the game with some explosion effects. Oh yeah, I said explosions, baby! :]

If you have any questions or comments, please come join the discussion below!
