How to Make a Narrated Book Using AVSpeechSynthesizer in iOS 7

Your UI With Speech Control Buttons

With the introduction of UIPageViewController, Apple has made it easy for developers to make their own book apps. Unfortunately, in the busy schedule of nine-to-five living, reading can be more of a luxury. Wouldn’t it be nice to have a book narrated to you while you multitask?

With the introduction of Siri, Apple has taunted developers with the implication of dynamic spoken text, but with the release of iOS 7, Apple has finally opened the door.

Introducing AVSpeechSynthesizer. Or Siri-Synthesizer for short :]

In this tutorial, you’ll create a narrated book. Each page of the book will display text while simultaneously speaking it aloud. Audio narration is a splendid way to make your book app stand out from all the others on iTunes, while also accommodating those with visual impairments. Offering an audio book app can also make your work more appealing to a broader audience, since it allows people to “read” while they exercise, cook or get a little work done.

As you create your narrated book, you’ll learn:

  • How to make an iOS device speak text using AVSpeechSynthesizer and AVSpeechUtterance.
  • How to make this synthesized speech sound more natural by modifying AVSpeechUtterance properties like pitch and rate.

AVSpeechSynthesizer may not win any awards for voice acting, but you can use it relatively easily to enhance functionality in apps you develop in the future.

Note: If you are interested in developing children’s books on the iPad using Sprite Kit, check out Tammy Coron’s excellent tutorial over here: How to Create an Interactive Children’s Book for the iPad

Getting Started with AVSpeechSynthesizer

Start by downloading the Starter Project. Open the project in Xcode by navigating into the NarratedBookUsingAVSpeech Starter directory and double-clicking on the NarratedBookUsingAVSpeech.xcodeproj project file. You should see something similar to the image below:

First Open in Xcode

Build and run the project. You will see the following in the simulator:

First Run of Your App (Page 1)

Your first book is a nursery rhyme about squirrels. It’s not exactly Amazon Top Selling material, but it will do for the purposes of this tutorial. Use your mouse to swipe from right-to-left in the simulator, and you’ll move to the next page as below.

Second Page of App

Use your mouse to swipe from left-to-right in the simulator, and you’ll return to the first page. Wow, you already have a functioning book. Nice work!

Understanding the Plumbing

Note: At the end of this tutorial, there are a few challenges for you. This next section covers how the starter project works so you can take on those challenges, but if you’re not interested, feel free to skip to the next section.

The starter project contains two important groups of classes:

1. Models: RWTBook and RWTPage store your book as a single book object and its collection of page objects.
2. Presentation: RWTPageViewController presents your models on the screen and responds to user interaction (e.g. swipes).

If you choose to build on this project to make your own books, it’s important that you understand how these classes work. Open RWTBook.h and examine its structure.

@interface RWTBook : NSObject
 
//1
@property (nonatomic, copy, readonly) NSArray *pages;
 
//2
+ (instancetype)bookWithPages:(NSArray*)pages;
//3
+ (instancetype)testBook;
 
@end
  1. The pages property stores an array of RWTPage objects, each representing a single page in the book.
  2. bookWithPages: is a convenience method that initializes and returns a book with the given array of RWTPage objects.
  3. testBook creates your book for testing purposes. You’ll start writing and reading your own books soon enough, but testBook is a simple book that is perfect to get you started.

Open RWTPage.h and examine its structure.

//1
extern NSString* const RWTPageAttributesKeyUtterances;
extern NSString* const RWTPageAttributesKeyBackgroundImage;
 
@interface RWTPage : NSObject
 
//2
@property (nonatomic, strong, readonly) NSString *displayText;
@property (nonatomic, strong, readonly) UIImage *backgroundImage;
 
//3
+ (instancetype)pageWithAttributes:(NSDictionary*)attributes;
@end
  1. These constants are the keys used to look up each page’s attributes in a dictionary. RWTPageAttributesKeyUtterances corresponds to the text on each page of the book, and RWTPageAttributesKeyBackgroundImage corresponds to the page’s background image.
  2. The displayText property stores the text of the page that your book presents on-screen, and the backgroundImage stores the image behind the text.
  3. pageWithAttributes: initializes and returns a page with the given dictionary of attributes.

Finally, open RWTPageViewController.m and examine its structure:

#pragma mark - Class Extension
 
// 1
@interface RWTPageViewController ()
@property (nonatomic, strong) RWTBook *book;
@property (nonatomic, assign) NSUInteger currentPageIndex;
@end
 
@implementation RWTPageViewController
 
#pragma mark - Lifecycle
 
// 2
- (void)viewDidLoad
{
  [super viewDidLoad];
 
  [self setupBook:[RWTBook testBook]];
 
  UISwipeGestureRecognizer *swipeNext = [[UISwipeGestureRecognizer alloc]
                                          initWithTarget:self
                                                  action:@selector(gotoNextPage)];
  swipeNext.direction = UISwipeGestureRecognizerDirectionLeft;
  [self.view addGestureRecognizer:swipeNext];
 
  UISwipeGestureRecognizer *swipePrevious = [[UISwipeGestureRecognizer alloc]
                                              initWithTarget:self
                                                      action:@selector(gotoPreviousPage)];
  swipePrevious.direction = UISwipeGestureRecognizerDirectionRight;
  [self.view addGestureRecognizer:swipePrevious];
}
 
#pragma mark - Private
 
// 3
- (RWTPage*)currentPage
{
  return [self.book.pages objectAtIndex:self.currentPageIndex];
}
 
// 4
- (void)setupBook:(RWTBook*)newBook
{
  self.book = newBook;
  self.currentPageIndex = 0;
  [self setupForCurrentPage];
}
 
// 5
- (void)setupForCurrentPage
{
  self.pageTextLabel.text = [self currentPage].displayText;
  self.pageImageView.image = [self currentPage].backgroundImage;
}
 
// 6
- (void)gotoNextPage
{
  if ([self.book.pages count] == 0 || self.currentPageIndex == [self.book.pages count] - 1) {
    return;
  }
 
  self.currentPageIndex += 1;
  [self setupForCurrentPage];
}
 
// 7
- (void)gotoPreviousPage
{
  if (self.currentPageIndex == 0) {
    return;
  }
 
  self.currentPageIndex -= 1;
  [self setupForCurrentPage];
}
@end

Here’s what this code does:

  1. The book property stores the current book, and the currentPageIndex property stores the index of the current page in book.pages.
  2. Sets up the page display once your view loads, then adds gesture recognizers so you can swipe forwards and backwards through the book’s pages.
  3. Returns the current page within the current book.
  4. Sets the book property and makes sure you start at the first page.
  5. Sets up the UI for the current page.
  6. Goes to the next page, if there is one, and sets it up. It’s invoked by the swipeNext gesture recognizer you created in viewDidLoad.
  7. Goes to the previous page, if there is one, and sets it up. This is invoked by the swipePrevious gesture recognizer you created in viewDidLoad.

To Speak or Not to Speak!

That is the question.

Open RWTPageViewController.m and underneath #import "RWTPage.h", add the following line:

@import AVFoundation;

iOS speech support lives in the AVFoundation framework, so you must import the AVFoundation module.

Note: The @import will both import and link the AVFoundation framework. To learn more about @import as well as some other new Objective-C language features in iOS 7, check out the article: What’s New in Objective-C and Foundation in iOS 7.

Add the following line just below the declaration of the currentPageIndex property in the RWTPageViewController class extension:

@property (nonatomic, strong) AVSpeechSynthesizer *synthesizer;

You’ve just added the speech synthesizer that will speak the words in each page.

Think of the AVSpeechSynthesizer you just added to your view controller as the person doing the speaking. AVSpeechUtterance instances represent the chunks of text the synthesizer speaks.

Note: An AVSpeechUtterance can be a single word like “Whisky” or an entire sentence, such as, “Whisky, frisky, hippidity hop.”
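To make the relationship concrete, here’s a minimal sketch (not part of the starter project) that speaks both a single word and a full sentence, tweaking one property on each:

AVSpeechUtterance *word = [[AVSpeechUtterance alloc] initWithString:@"Whisky,"];
word.pitchMultiplier = 1.2f; // speak this one word at a slightly higher pitch
 
AVSpeechUtterance *sentence = [[AVSpeechUtterance alloc]
                                initWithString:@"Whisky, frisky, hippidity hop."];
sentence.rate = AVSpeechUtteranceMinimumSpeechRate; // speak the sentence slowly
 
AVSpeechSynthesizer *synthesizer = [[AVSpeechSynthesizer alloc] init];
[synthesizer speakUtterance:word];     // utterances queue up in order...
[synthesizer speakUtterance:sentence]; // ...and are spoken one after another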

Add the following code just before the @end at the bottom of RWTPageViewController.m

#pragma mark - Speech Management
 
- (void)speakNextUtterance
{
  AVSpeechUtterance *nextUtterance = [[AVSpeechUtterance alloc]
                                       initWithString:[self currentPage].displayText];
  [self.synthesizer speakUtterance:nextUtterance];
}

You’ve created an utterance to speak, and told the synthesizer to speak it.

Now add the following code just below speakNextUtterance

- (void)startSpeaking
{
  if (!self.synthesizer) {
    self.synthesizer = [[AVSpeechSynthesizer alloc] init];
  }
 
  [self speakNextUtterance];
}

This code initializes the synthesizer property if it’s not already initialized. Then it invokes speakNextUtterance to speak.

Add the following line of code to the very end of viewDidLoad, gotoNextPage and gotoPreviousPage

  [self startSpeaking];

Your additions ensure that speech starts when the book loads, as well as when the user advances to the next or previous page.

Build and run, and listen to the dulcet tones of AVSpeechSynthesizer.

Note: If you don’t hear anything, check the volume on your Mac or iOS device (wherever you’re running the app). You might need to swipe between pages to start speech again if you missed it.

Also note: if you run this project in the simulator, be prepared for your console to fill with cryptic error messages. This appears to happen only in the simulator; the messages won’t appear when you run on a device.

Once you’ve confirmed that you can hear speech, try building and running again, but this time, swipe from right-to-left before the first page finishes talking. What do you notice?

The synthesizer will start speaking the second page’s text once it’s completed the first page. That’s not what users will expect; they’ll expect that swiping to another page will stop speech for the current page and start it for the next page. This glitch isn’t so worrisome for short pages like nursery rhymes, but imagine what could happen with very long pages…
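One blunt fix would be to halt the synthesizer whenever the page changes. AVSpeechSynthesizer supports this directly via stopSpeakingAtBoundary:; here’s a sketch of how gotoNextPage could use it (this tutorial takes a finer-grained, delegate-based approach in the next sections instead):

- (void)gotoNextPage
{
  if ([self.book.pages count] == 0 || self.currentPageIndex == [self.book.pages count] - 1) {
    return;
  }
 
  // Cut off any in-flight speech before moving on. AVSpeechBoundaryImmediate
  // stops mid-word; AVSpeechBoundaryWord would let the current word finish.
  [self.synthesizer stopSpeakingAtBoundary:AVSpeechBoundaryImmediate];
 
  self.currentPageIndex += 1;
  [self setupForCurrentPage];
  [self startSpeaking];
}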

Breaking Speech into Parts

One reliable principle of software engineering is to keep data and code separate. It makes testing your code easier, and it makes it easier to run your code on different input data. Moreover, keeping data out of code allows you to download new data at runtime. For example, wouldn’t it be grand if your book app could download new books at runtime?

You’re currently using a simple test book, created by RWTBook’s testBook method, to test your code. You’re about to change that by storing books in, and reading them from, Apple’s plist (XML) format.

Open Supporting Files\WhirlySquirrelly.plist and you’ll see something like the following

WhirlySquirrelly.plist

You can also see the raw data structure by right-clicking Supporting Files\WhirlySquirrelly.plist and selecting Open As\Source Code:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>bookPages</key>
  <array>
    <!-- First page -->
    <dict>
      <key>backgroundImage</key>
      <string>PageBackgroundImage.jpg</string>
      <key>utterances</key>
      <array>
        <dict>
          <key>utteranceProperties</key>
          <dict>
            <key>pitchMultiplier</key>
            <real>1</real>
            <key>rate</key>
            <real>1.2</real>
          </dict>
          <key>utteranceString</key>
          <string>Whisky,</string>
        </dict>
        ...
      </array>
    </dict>
    <!-- Second page -->
    <dict>
      <key>backgroundImage</key>
      <string>PageBackgroundImage.jpg</string>
      <key>utterances</key>
      <array>
        <dict>
          <key>utteranceProperties</key>
          <dict>
            <key>pitchMultiplier</key>
            <real>1.2</real>
            <key>rate</key>
            <real>1.3</real>
          </dict>
          <key>utteranceString</key>
          <string>Whirly,</string>
        </dict>
        ...
      </array>
    </dict>
  </array>
</dict>
</plist>

It’s nice to have a high-level view of your data structures. The data structure in Supporting Files\WhirlySquirrelly.plist is outlined as follows (where {} indicates a dictionary and [] an array):

Book {
  bookPages => [
  	{FirstPage
                backgroundImage => "Name of background image file",
  		utterances => [
  			{ utteranceString     => "what to say first",
  			  utteranceProperties => { how to say it }
  			},
  			{ utteranceString     => "what to say next",
  			  utteranceProperties => { how to say it }
  			}
  		]
  	},
  	{SecondPage
                backgroundImage => "Name of background image file",
  		utterances => [
  			{ utteranceString     => "what to say last",
  			  utteranceProperties => { how to say it }
  			}
  		]
  	}
  ]
}

Behold the power of ASCII art! :]

Supporting Files\WhirlySquirrelly.plist breaks the text up into one utterance per word. The virtue of doing this is that you can control speech properties like pitch (high voice or low voice) and rate (slow or fast talking) for each word.

The reason your synthesizer sounds so mechanical, like a robot from a cheesy 1950s sci-fi movie, is that its diction is too uniform. To make your synthesizer speak more like a human, you’ll need to control the pitch and meter, which will vary its diction.
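If you’d rather not hand-author one utterance per word in a plist, you could also split a string into word-level utterances in code. Here’s a hypothetical helper (not used in this tutorial) that varies the pitch slightly from word to word:

// Hypothetical helper: turns one string into word-level utterances whose
// pitch alternates slightly, just to break up the uniform diction.
- (NSArray *)wordUtterancesFromString:(NSString *)text
{
  NSMutableArray *utterances = [NSMutableArray array];
  NSArray *words = [text componentsSeparatedByCharactersInSet:
                     [NSCharacterSet whitespaceAndNewlineCharacterSet]];
  [words enumerateObjectsUsingBlock:^(NSString *word, NSUInteger idx, BOOL *stop) {
    if ([word length] == 0) {
      return; // skip empties produced by consecutive whitespace
    }
    AVSpeechUtterance *utterance = [[AVSpeechUtterance alloc] initWithString:word];
    utterance.pitchMultiplier = (idx % 2 == 0) ? 1.0f : 1.1f;
    [utterances addObject:utterance];
  }];
  return [utterances copy];
}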

Parsing Power

You’ll parse Supporting Files\WhirlySquirrelly.plist into an RWTBook object. Open RWTBook.h and add the following line right after the declaration of bookWithPages:

  + (instancetype)bookWithContentsOfFile:(NSString*)path;

This method will read a file like Supporting Files\WhirlySquirrelly.plist, then initialize and return an RWTBook instance that holds the file’s data.

Open RWTBook.m and add the following code right below #import "RWTPage.h"

#pragma mark - External Constants
 
NSString* const RWTBookAttributesKeyBookPages = @"bookPages";

This is the key you’ll use to retrieve the book’s pages from files like Supporting Files\WhirlySquirrelly.plist.

With RWTBook.m still open, add the following code at the bottom of the file, just before the @end.

 
#pragma mark - Public
 
+ (instancetype)bookWithContentsOfFile:(NSString*)path
{
  // 1
  NSDictionary *bookAttributes = [NSDictionary dictionaryWithContentsOfFile:path];
  if (!bookAttributes) {
    return nil;
  }
 
  // 2
  NSMutableArray *pages = [NSMutableArray arrayWithCapacity:2];
  for (NSDictionary *pageAttributes in [bookAttributes objectForKey:RWTBookAttributesKeyBookPages]) {
    RWTPage *page = [RWTPage pageWithAttributes:pageAttributes];
    if (page) {
      [pages addObject:page];
    }
  }
 
  // 3
  return [self bookWithPages:pages];
}

Here’s what your new code does:

  1. Reads and initializes a dictionary of book attributes from the given path. This is where your code reads Supporting Files\WhirlySquirrelly.plist.
  2. Creates a new Page object for each dictionary of page attributes under the book attributes.
  3. Returns a new book using the handy bookWithPages: provided in the starter project.

Open RWTPageViewController.m and navigate to viewDidLoad. Replace the line

  [self setupBook:[RWTBook testBook]];

with

  NSString *path = [[NSBundle mainBundle] pathForResource:@"WhirlySquirrelly" ofType:@"plist"];
  [self setupBook:[RWTBook bookWithContentsOfFile:path]];

Your new code locates WhirlySquirrelly.plist and creates a book from it by using bookWithContentsOfFile:.

You’re almost ready to run your new code. Open RWTPage.m and add the following code below the #import "RWTPage.h"

@import AVFoundation;

Now you can reference AVSpeechUtterance in this file.

Add the following constant definitions just below the definition of RWTPageAttributesKeyBackgroundImage

NSString* const RWTUtteranceAttributesKeyUtteranceString = @"utteranceString";
NSString* const RWTUtteranceAttributesKeyUtteranceProperties = @"utteranceProperties";

These are the keys you’ll use to parse out individual AVSpeechUtterance attributes from a plist.

Replace pageWithAttributes: with the following

+ (instancetype)pageWithAttributes:(NSDictionary*)attributes
{
  RWTPage *page = [[RWTPage alloc] init];
 
  if ([[attributes objectForKey:RWTPageAttributesKeyUtterances] isKindOfClass:[NSString class]]) {
    // 1
    page.displayText = [attributes objectForKey:RWTPageAttributesKeyUtterances];
    page.backgroundImage = [attributes objectForKey:RWTPageAttributesKeyBackgroundImage];
  } else if ([[attributes objectForKey:RWTPageAttributesKeyUtterances] isKindOfClass:[NSArray class]]) {
    // 2
    NSMutableArray *utterances = [NSMutableArray arrayWithCapacity:31];
    NSMutableString *displayText = [NSMutableString stringWithCapacity:101];
 
    // 3
    for (NSDictionary *utteranceAttributes in [attributes objectForKey:RWTPageAttributesKeyUtterances]) {
      // 4
      NSString *utteranceString =
                 [utteranceAttributes objectForKey:RWTUtteranceAttributesKeyUtteranceString];
      NSDictionary *utteranceProperties =
                     [utteranceAttributes objectForKey:RWTUtteranceAttributesKeyUtteranceProperties];
 
      // 5
      AVSpeechUtterance *utterance = [[AVSpeechUtterance alloc] initWithString:utteranceString];
      // 6
      [utterance setValuesForKeysWithDictionary:utteranceProperties];
 
      if (utterance) {
        // 7
        [utterances addObject:utterance];
        [displayText appendString:utteranceString];
      }
    }
 
    // 8
    page.displayText = displayText;
    page.backgroundImage = [UIImage imageNamed:[attributes objectForKey:RWTPageAttributesKeyBackgroundImage]];
  }
 
  return page;
}

Here’s what your new code does:

  1. Handles the case, like RWTBook’s testBook, where a page’s utterances are a single NSString. Sets the display text and background image.
  2. Handles the case, like Supporting Files\WhirlySquirrelly.plist, where a page’s utterances are an NSArray of NSDictionary objects. Accumulates all the utterances and display text.
  3. Loops over the individual utterances for the page.
  4. Grabs the individual utterance’s utteranceString and utteranceProperties.
  5. Creates a new AVSpeechUtterance to speak utteranceString.
  6. Sets the new utterance’s properties using Key-Value Coding (KVC). Although not openly documented by Apple, AVSpeechUtterance responds to the selector setValuesForKeysWithDictionary:, so you can use it to set all the utteranceProperties in one fell swoop. Conveniently, this means you can add new utterance properties to your plist without needing to write new setter invocation code; setValuesForKeysWithDictionary: will handle the new properties automatically, provided, of course, that the corresponding properties exist on AVSpeechUtterance and are writable. See the sketch after this list.
  7. Accumulates the utterance and display text.
  8. Sets the display text and background image.
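To see what step 6 buys you, here’s a sketch of the manual setter code that setValuesForKeysWithDictionary: replaces, assuming an utteranceProperties dictionary with the two keys used in WhirlySquirrelly.plist:

NSDictionary *utteranceProperties = @{ @"pitchMultiplier": @1.2f,
                                       @"rate":            @1.3f };
 
// The one-liner from pageWithAttributes:...
[utterance setValuesForKeysWithDictionary:utteranceProperties];
 
// ...is equivalent to making each Key-Value Coding call by hand:
[utterance setValue:[utteranceProperties objectForKey:@"pitchMultiplier"]
             forKey:@"pitchMultiplier"];
[utterance setValue:[utteranceProperties objectForKey:@"rate"]
             forKey:@"rate"];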

Build and run and listen to the speech.

You’ve constructed each RWTPage.displayText from the combined utteranceStrings for the page in the plist, so your page view displays the entire page’s text.
However, remember that RWTPageViewController’s speakNextUtterance creates a single AVSpeechUtterance for the entire RWTPage.displayText, so it overlooks your carefully parsed utterance properties.

In order to modify how each utterance is spoken, you need to synthesize each page’s text as individual utterances. If only there were some way to observe and control how and when AVSpeechSynthesizer speaks. Hmmm…

Be a Good Delegate and Listen

Your AVSpeechSynthesizer has a delegate, conforming to the AVSpeechSynthesizerDelegate protocol, that is informed of various important events and actions in the speech synthesizer’s lifecycle. You’ll implement some of these delegate methods to make speech sound more natural by using the utterance properties included in WhirlySquirrelly.plist.

Open RWTPage.h and add the following code after the declaration of displayText

  @property (nonatomic, strong, readonly) NSArray *utterances;

Open RWTPage.m and add the following code after the declaration of displayText

  @property (nonatomic, strong, readwrite) NSArray *utterances;

Note: You’re following a best practice here by declaring the property as readonly in the header file and readwrite in the implementation file. This ensures that only the object itself can set the property.
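In general form, the pattern looks like this (SomeClass is a placeholder, not a class in this project):

// SomeClass.h -- callers can read the property but not set it:
@property (nonatomic, strong, readonly) NSArray *utterances;
 
// SomeClass.m -- a class extension redeclares it readwrite,
// so only code inside the class can mutate it:
@interface SomeClass ()
@property (nonatomic, strong, readwrite) NSArray *utterances;
@end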

Replace pageWithAttributes: with the following code

+ (instancetype)pageWithAttributes:(NSDictionary*)attributes
{
  RWTPage *page = [[RWTPage alloc] init];
 
  if ([[attributes objectForKey:RWTPageAttributesKeyUtterances] isKindOfClass:[NSString class]]) {
    page.displayText = [attributes objectForKey:RWTPageAttributesKeyUtterances];
    page.backgroundImage = [attributes objectForKey:RWTPageAttributesKeyBackgroundImage];
    // 1
    page.utterances  = @[[[AVSpeechUtterance alloc] initWithString:page.displayText]];
  } else if ([[attributes objectForKey:RWTPageAttributesKeyUtterances] isKindOfClass:[NSArray class]]) {
    NSMutableArray *utterances = [NSMutableArray arrayWithCapacity:31];
    NSMutableString *displayText = [NSMutableString stringWithCapacity:101];
 
    for (NSDictionary *utteranceAttributes in [attributes objectForKey:RWTPageAttributesKeyUtterances]) {
      NSString *utteranceString =
                 [utteranceAttributes objectForKey:RWTUtteranceAttributesKeyUtteranceString];
      NSDictionary *utteranceProperties =
                     [utteranceAttributes objectForKey:RWTUtteranceAttributesKeyUtteranceProperties];
 
      AVSpeechUtterance *utterance = [[AVSpeechUtterance alloc] initWithString:utteranceString];
      [utterance setValuesForKeysWithDictionary:utteranceProperties];
 
      if (utterance) {
        [utterances addObject:utterance];
        [displayText appendString:utteranceString];
      }
    }
 
    page.displayText = displayText;
    page.backgroundImage = [UIImage imageNamed:[attributes objectForKey:RWTPageAttributesKeyBackgroundImage]];
    // 2
    page.utterances  = [utterances copy];
  }
 
  return page;
}

The only new code is in sections 1 and 2, which set the page.utterances property for the NSString case and for the NSArray case, respectively.

Open RWTPageViewController.h and replace its contents below the header comments with

#import <UIKit/UIKit.h>
@import AVFoundation;
 
// 1
@interface RWTPageViewController : UIViewController<AVSpeechSynthesizerDelegate>
 
@property (nonatomic, weak) IBOutlet UILabel *pageTextLabel;
@property (nonatomic, weak) IBOutlet UIImageView *pageImageView;
 
@end

In Section 1, you declared that RWTPageViewController conforms to the AVSpeechSynthesizerDelegate protocol.

Open RWTPageViewController.m and add the following property declaration just below the declaration of the synthesizer property

  @property (nonatomic, assign) NSUInteger nextSpeechIndex;

You’ll use this new property to track which element of RWTPage.utterances to speak next.

Replace setupForCurrentPage with

- (void)setupForCurrentPage
{
  self.pageTextLabel.text = [self currentPage].displayText;
  self.pageImageView.image = [self currentPage].backgroundImage;
  self.nextSpeechIndex = 0;
}

Replace speakNextUtterance with

- (void)speakNextUtterance
{
  // 1
  if (self.nextSpeechIndex < [[self currentPage].utterances count]) {
    // 2
    AVSpeechUtterance *utterance = [[self currentPage].utterances objectAtIndex:self.nextSpeechIndex];
    self.nextSpeechIndex    += 1;
 
    // 3
    [self.synthesizer speakUtterance:utterance];
  }
}
  1. In Section 1, you’re ensuring that nextSpeechIndex is in range.
  2. In Section 2, you’re getting the current utterance and advancing the index.
  3. Finally, in Section 3, you’re speaking the utterance.

Build and run. What happens now? You should only hear the first word of each page, such as “Whisky,” on the first page. That’s because you still need to implement some AVSpeechSynthesizerDelegate methods to queue up the next utterance for speech when the synthesizer finishes speaking the current utterance.

Replace startSpeaking with

- (void)startSpeaking
{
  if (!self.synthesizer) {
    self.synthesizer = [[AVSpeechSynthesizer alloc] init];
    // 1
    self.synthesizer.delegate = self;
  }
 
  [self speakNextUtterance];
}

In Section 1, you’ve made your view controller a delegate of your synthesizer.

Add the following code at the end of RWTPageViewController.m, just before the @end

 
#pragma mark - AVSpeechSynthesizerDelegate Protocol
 
- (void)speechSynthesizer:(AVSpeechSynthesizer*)synthesizer didFinishSpeechUtterance:(AVSpeechUtterance*)utterance
{
  NSUInteger indexOfUtterance = [[self currentPage].utterances indexOfObject:utterance];
  if (indexOfUtterance == NSNotFound) {
    return;
  }
 
  [self speakNextUtterance];
}

Your new code queues up the next utterance when the synthesizer finishes speaking the current utterance.

Build and run. You’ll now hear a few differences:

  • The next utterance is queued up as soon as the current one finishes, so every word on a page is verbalized.
  • When you swipe to the next or previous page, the rest of the old page’s text is no longer spoken.
  • Speech sounds much more natural, thanks to the utteranceProperties in Supporting Files\WhirlySquirrelly.plist. Your humble tutorial author toiled over these to hand-tune the speech.

Control: You Must Learn Control

Master Yoda was wise: control is important. Now that your book speaks each utterance individually, you’re going to add buttons to your UI so you can make real-time adjustments to the pitch and rate of your synthesizer’s speech.

Still in RWTPageViewController.m, add the following property declarations right after the declaration of the nextSpeechIndex property

@property (nonatomic, assign) float currentPitchMultiplier;
@property (nonatomic, assign) float currentRate;

To set these new properties, add the following methods right after the body of gotoPreviousPage:

- (void)lowerPitch
{
  if (self.currentPitchMultiplier > 0.5f) {
    self.currentPitchMultiplier = MAX(self.currentPitchMultiplier * 0.8f, 0.5f);
  }
}
 
- (void)raisePitch
{
  if (self.currentPitchMultiplier < 2.0f) {
    self.currentPitchMultiplier = MIN(self.currentPitchMultiplier * 1.2f, 2.0f);
  }
}
 
- (void)lowerRate
{
  if (self.currentRate > AVSpeechUtteranceMinimumSpeechRate) {
    self.currentRate = MAX(self.currentRate * 0.8f, AVSpeechUtteranceMinimumSpeechRate);
  }
}
 
- (void)raiseRate
{
  if (self.currentRate < AVSpeechUtteranceMaximumSpeechRate) {
    self.currentRate = MIN(self.currentRate * 1.2f, AVSpeechUtteranceMaximumSpeechRate);
  }
}
 
- (void)speakAgain
{
  if (self.nextSpeechIndex == [[self currentPage].utterances count]) {
    self.nextSpeechIndex = 0;
    [self speakNextUtterance];
  }
}

These methods are the actions that connect to your speech control buttons.

  • lowerPitch and raisePitch lower and raise the speech pitch, respectively, by up to 20% per invocation, within the range [0.5f, 2.0f].
  • lowerRate and raiseRate lower and raise the speech rate, respectively, by up to 20% per invocation, within the range [AVSpeechUtteranceMinimumSpeechRate, AVSpeechUtteranceMaximumSpeechRate].
  • speakAgain resets the index of the next utterance to speak, then speaks the current page’s text again from the beginning.

Create the buttons by adding the following methods right after the body of raiseRate

- (void)addSpeechControlWithFrame:(CGRect)frame title:(NSString *)title action:(SEL)selector {
  UIButton *controlButton = [UIButton buttonWithType:UIButtonTypeRoundedRect];
  controlButton.frame = frame;
  controlButton.backgroundColor = [UIColor colorWithWhite:0.9f alpha:1.0f];
  [controlButton setTitle:title forState:UIControlStateNormal];
  [controlButton addTarget:self
                 action:selector
       forControlEvents:UIControlEventTouchUpInside];
  [self.view addSubview:controlButton];
}
 
- (void)addSpeechControls
{
  [self addSpeechControlWithFrame:CGRectMake(52, 485, 150, 50) 
                            title:@"Lower Pitch" 
                           action:@selector(lowerPitch)];
  [self addSpeechControlWithFrame:CGRectMake(222, 485, 150, 50) 
                            title:@"Raise Pitch" 
                           action:@selector(raisePitch)];
  [self addSpeechControlWithFrame:CGRectMake(422, 485, 150, 50) 
                            title:@"Lower Rate" 
                           action:@selector(lowerRate)];
  [self addSpeechControlWithFrame:CGRectMake(592, 485, 150, 50) 
                            title:@"Raise Rate" 
                           action:@selector(raiseRate)];
  [self addSpeechControlWithFrame:CGRectMake(506, 555, 150, 50) 
                            title:@"Speak Again" 
                           action:@selector(speakAgain)];
 
}

addSpeechControlWithFrame:title:action: is a convenience method that adds a button to the view and wires it to the method that alters the spoken text on demand.

Note: You could also create these buttons in Main.storyboard and wire up their actions in RWTPageViewController. But that would be too easy :] Creating them in code keeps all of this tutorial’s changes in one file.

Add the following code in viewDidLoad before [self startSpeaking]:

 
  // 1
  self.currentPitchMultiplier = 1.0f;
  self.currentRate = AVSpeechUtteranceDefaultSpeechRate;
 
  // 2
  [self addSpeechControls];

Section 1 sets your new speech properties to default values, and section 2 adds your speech controls.

As the last step, replace speakNextUtterance with the following

- (void)speakNextUtterance
{
  if (self.nextSpeechIndex < [[self currentPage].utterances count]) {
    AVSpeechUtterance *utterance = [[self currentPage].utterances objectAtIndex:self.nextSpeechIndex];
    self.nextSpeechIndex    += 1;
 
    // 1
    utterance.pitchMultiplier = self.currentPitchMultiplier;
    // 2
    utterance.rate = self.currentRate;
 
    [self.synthesizer speakUtterance:utterance];
  }
}

The new code sets the pitchMultiplier and rate of the next utterance to the values you set while clicking the nifty new lower/raise buttons.

Build and run. You should see something like below.

Narrated Book with Speech Control

Try clicking or tapping the various buttons while the app is speaking, and note how they change the sound of the speech. Yoda would be proud; you’re not a Jedi yet, but you’re becoming a master of AVSpeechSynthesizer.

Where to Go From Here?

Here is the final Completed Project for this tutorial.

I encourage you to experiment with your narrated book app. If you want a few ideas for how to tweak your app, here are some:

Compete for the Most Natural Sounding WhirlySquirrelly.plist

Hint: Try your hand at fine-tuning WhirlySquirrelly.plist and upload your most natural-sounding version of the plist to the comments/forum. We’ll judge the winner and give him or her praise in the comments.

Add the Ability for the User to Select Which Book They Read and Hear

Hint: Add a “Choose Book” button to your UI that displays a UIPopoverController with a list of alternate books. Tapping on a book in this list should reset the book in your RWTPageViewController and present the new book.
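Here’s a rough sketch of how the button’s action might kick this off; RWTBookListViewController, its chooseHandler block and the bookPopover property are assumptions for illustration, not part of the starter project:

// Hypothetical action for the "Choose Book" button.
- (void)chooseBook:(UIButton *)sender
{
  RWTBookListViewController *bookList = [[RWTBookListViewController alloc] init];
 
  __weak typeof(self) weakSelf = self;
  bookList.chooseHandler = ^(NSString *path) {
    [weakSelf.bookPopover dismissPopoverAnimated:YES];
    [weakSelf setupBook:[RWTBook bookWithContentsOfFile:path]];
    [weakSelf startSpeaking];
  };
 
  self.bookPopover = [[UIPopoverController alloc] initWithContentViewController:bookList];
  [self.bookPopover presentPopoverFromRect:sender.frame
                                    inView:self.view
                  permittedArrowDirections:UIPopoverArrowDirectionAny
                                  animated:YES];
}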

Add the Ability to Download New Books from a Webserver

Hint: Store your books as plists on your own webserver or a service like AWS S3 or Heroku. Create a webservice call to list the URLs of all available books and another to fetch a single book. Link this with the book-choosing functionality you added in the previous item.
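As a sketch of the fetching side, assuming the server exposes each book plist at a known URL (the URL here is a placeholder, and the bookWithAttributes: factory mentioned in the comments is a hypothetical refactor):

// Hypothetical download of a single book plist.
NSURL *bookURL = [NSURL URLWithString:@"https://example.com/books/WhirlySquirrelly.plist"];
NSURLSessionDataTask *task = [[NSURLSession sharedSession]
    dataTaskWithURL:bookURL
  completionHandler:^(NSData *data, NSURLResponse *response, NSError *error) {
    if (!data) {
      NSLog(@"Book download failed: %@", error);
      return;
    }
    // Parse the plist bytes into the same dictionary shape that
    // bookWithContentsOfFile: reads from disk.
    NSDictionary *bookAttributes =
      [NSPropertyListSerialization propertyListWithData:data
                                                options:NSPropertyListImmutable
                                                 format:NULL
                                                  error:NULL];
    dispatch_async(dispatch_get_main_queue(), ^{
      // A bookWithAttributes: factory refactored out of
      // bookWithContentsOfFile: would slot in naturally here.
      NSLog(@"Downloaded book with %lu pages",
            (unsigned long)[[bookAttributes objectForKey:@"bookPages"] count]);
    });
  }];
[task resume];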

Add Word Highlighting that Corresponds with the Speech

Hint: Use the AVSpeechSynthesizerDelegate method speechSynthesizer:didStartSpeechUtterance: to highlight the passed utterance, and speechSynthesizer:didFinishSpeechUtterance: to unhighlight it. You should use the pageTextLabel.attributedText property to set the entire text, using an NSAttributedString to add different foreground color and font properties to the highlighted and unhighlighted sections of the text.
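A sketch of the highlighting half; the red-foreground attribute is a placeholder choice, and note that rangeOfString: matches only the first occurrence of a repeated word, so a production version would track ranges more carefully:

- (void)speechSynthesizer:(AVSpeechSynthesizer *)synthesizer
  didStartSpeechUtterance:(AVSpeechUtterance *)utterance
{
  NSMutableAttributedString *text = [[NSMutableAttributedString alloc]
                                      initWithString:[self currentPage].displayText];
  // Find the spoken utterance inside the page text and color it.
  NSRange range = [text.string rangeOfString:utterance.speechString];
  if (range.location != NSNotFound) {
    [text addAttribute:NSForegroundColorAttributeName
                 value:[UIColor redColor]
                 range:range];
  }
  self.pageTextLabel.attributedText = text;
}

Resetting pageTextLabel.attributedText in speechSynthesizer:didFinishSpeechUtterance:, which you already implement, would remove the highlight again.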

Add the Ability to Display a Title Page Before All Other Pages

Hint: Add another view controller before your RWTPageViewController. You’ll have to place both view controllers in a UINavigationController and update your Main.storyboard file to use the new view controllers. Alternately, you could redesign the Page class hierarchy into speakable and non-speakable pages, and modify RWTPageViewController to handle different page types appropriately.
