Evaluate and optimize voice interaction for your app
Optimize your app for Siri and give people a more natural way to interact with the features of your app. We'll compare the different Siri technologies and help you identify the right one for you and your needs, show you how to get started with building for conversational interactions, and explore best practices for making your integration truly excel.
I'm a conversational interaction designer, which is a fancy way of saying that I think about the kinds of things people should be able to do with Siri. I'm going to talk about these two things today.
So first, let's talk about what you can do with Siri. To get at the heart of it, let's first dig into what we at Apple mean when we refer to "Siri" in the first place. You may be familiar with this bit from the Siri Human Interface Guidelines.
Siri makes it easy for people to accomplish their everyday tasks quickly using Voice, Touch or Automation.
This means that Siri should be equally as powerful and enjoyable to use across all these modalities for all supported tasks.
As a conversation designer, I, of course, want to focus today on Voice. There are many parts of the overall Voice experience, including the assistant, Intents, Shortcuts, Suggestions, Automations. Siri is a lot of things. First, let's look at Siri, the conversational assistant available on your Apple devices. Siri helps you do all kinds of things, including get information, like the weather, an answer to a math problem or hours of a local business. Complete tasks, like calling a loved one, setting a calendar event or checking important Reminders. And pass the time with all kinds of conversation from asking for jokes to hearing a story.
All of us at Siri work hard to make a great assistant, but there are ways that you can help it become even better.
Let's look at what you, as a developer, can do with Siri.
It all starts with intents. But what even are intents? Conceptually, intents are how we speak a common language. That definitely sounds like something we want to do, but what does it mean in practice? Let's take a look at some examples.
An intent helps answer this question: What is sending a message? Well, think about what is actually involved when a message is sent. We need to know who you want to send the message to. And we definitely need to know what you want to say in your message.
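As a rough illustration of that idea, the sketch below models a message request as a plain Swift value that knows which pieces of information are still missing. The type `SendMessageRequest` and its property names are hypothetical and exist only for this example; this is not the SiriKit API.

```swift
// Hypothetical model, not the SiriKit API: a message request is complete
// only when it knows who to send to and what to say.
struct SendMessageRequest {
    var recipient: String?
    var content: String?

    /// Names of the parameters still needed before the message can be sent.
    var missingParameters: [String] {
        var missing: [String] = []
        if (recipient ?? "").isEmpty { missing.append("recipient") }
        if (content ?? "").isEmpty { missing.append("content") }
        return missing
    }
}

// A request that names a recipient but has no message body yet.
let partial = SendMessageRequest(recipient: "Eden", content: nil)
print(partial.missingParameters)
```

Anything listed in `missingParameters` is something the conversation still has to ask for.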
We've defined really powerful natural language support for some key experiences in what we call domains, which group related intents for the things people do every day.
You'll see these groups referred to as System Intents.
These intents come with the conversational flow that's needed to complete a request. Do you need to ask what music to play or say a message has been sent? Comes with. Just add your data.
Of course, there are still some things you should think about from the design perspective here.
While most of what Siri will say has been written for you, you should still make sure that the dialogue sounds good with your data in it, both from a sentence structure perspective and a text-to-speech perspective. And error cases are always really important, so that people understand what's happened, which means you should pay special attention here and make sure that the errors or unsupported cases in your app map to the dialogue that Siri provides.
But what about other stuff, like what is it to order coffee? Well, you tell us. You're the expert on your use case. That's where custom intents come into play. With custom intents, the conversational flow isn't provided. But as I said, you're the expert on your use case, so that puts you in an excellent position to create this flow.
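One way to picture such a flow, assuming a coffee order needs only a drink and a size, is a function that decides what to say next from what is already known. Everything in this sketch (`CoffeeOrder`, `NextStep`, `nextStep(for:)`, and the prompt wording) is made up for illustration; real custom intents are defined with Apple's tooling rather than hand-written like this.

```swift
// Hypothetical coffee order: the fields and prompts are illustrative only.
struct CoffeeOrder {
    var drink: String?
    var size: String?
}

enum NextStep: Equatable {
    case ask(String)       // a question for one missing detail
    case confirm(String)   // read the order back before placing it
}

/// Decides what to say next, asking for one missing detail at a time.
func nextStep(for order: CoffeeOrder) -> NextStep {
    guard let drink = order.drink else {
        return .ask("What would you like to drink?")
    }
    guard let size = order.size else {
        return .ask("What size \(drink) would you like?")
    }
    return .confirm("That's a \(size) \(drink). Ready to order?")
}

// Walk through the conversation as details arrive.
var order = CoffeeOrder()
print(nextStep(for: order))
order.drink = "latte"
print(nextStep(for: order))
order.size = "large"
print(nextStep(for: order))
```

The order of the `guard` statements is the order in which questions are asked, so reordering them is a quick way to try a different conversational flow.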
Just think through the logic of your use case. What steps does someone need to take? What information do they need to provide? And how would you talk them through it? What words do you use, what order do you ask for things? In addition to intents, we should also talk about Shortcuts.
Shortcuts is an umbrella term for a lot of functionality. Shortcuts are a great tool for developers to make their functionality even easier for people to access. They also empower people to create their own routines for things they like to do and then share those routines with others. Under the Shortcuts umbrella, there's Suggestions. Siri can suggest a Shortcut for an action someone has done at least one time. It can surface these suggestions in a variety of ways, making for a more personalized Siri experience and a quick and easy way to access your app's functionality. Siri can also suggest Shortcuts for actions people haven't done yet, but might want to do, based on what they've done before.
The Shortcuts app. In the app, people can find premade Shortcuts for all kinds of actions, both fun and useful. They can also create Automations, which can run directly on a person's device or can be used by everyone at Home. Just set up the right conditions to run the actions, like time of day, location or when a specific event happens. Last, but certainly not least, Voice.
These Shortcuts are powered by the custom intents we talked about earlier. They enable people to access your in-app functionality without having to open the app and allow people to add actions from your app into Automations that they may want to set up. Now that we've talked about the ways you can add to Siri, let's turn our attention to what to add to Siri. Voice shouldn't just be about filling out forms. That can be useful, but speaking answers one by one to a bunch of fields isn't what makes a great voice interaction.
Voice is great for simplifying multistep interactions, especially ones that are completed frequently. Think about playing music. If you want to play your favorite artist's newest album, you first need to open the app. Then you either need to know the name of the album or search through the band's whole catalog to find it. At last, you can play it.
With Voice, you can say, "Play the latest album by DaBaby," and let the app do the heavy lifting of figuring out what it's supposed to play. It's fewer steps, which is great, but there's also less required from people to accomplish their goal.
Sometimes, apps are laid out with several steps before the actual task people want to complete. Imagine our old friend Soup Chef. You have to look at the menu before you can select a soup, even if you know exactly what you want to order. That in-app hierarchy is important to help people who are new to your app know what there is to do. But people who order your soup all the time don't need to see the menu to know they want to order the tomato bisque. Voice accelerates them past the information they no longer need and gets them right to where they want to be: ordering a delicious bowl of soup.
And maybe you have some power users of your app who frequently access something that's not prominently featured in your UI, like Settings.
This is a great fit for Voice. People who love that feature will be able to access it quickly. And remember that people can use their voice even when they're driving, walking with AirPods or otherwise unable to get to your app on their phone. Voice is great for things people want to do when they're multitasking.
In addition to having a great feature set, you also want to ensure your experience has great conversation. Taking advantage of available modalities is a great step towards natural conversation. Let's take a quick look at the two modes Siri can use to communicate.
First up, silent mode.
In iOS 14, if the iPhone ringer switch is muted, Siri will be in silent mode by default. As implied by the name, Siri doesn't speak in silent mode. Instead, the screen UI and, where relevant, printed dialogue work together to present all the information people need to complete their task.
And then there's Voice mode. Here, while an available screen will still show UI, Siri speaks out all the information people need to complete the task.
Siri: Your message to Eden says, "Want to get coffee later?" Ready to send it?
How you adopt these modes depends on what solution you've implemented. System intents have a lot of functionality baked into them, including handling these different interaction modes. You don't have to do anything to get this working for people.
Custom intents do require a little work to support this, but the tooling makes it easy. Just indicate what dialogue and UI you want Siri to use when people are looking at the screen and what you want to use when they aren't. Siri will ensure the right pieces are used at the right time.
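The idea of supplying one piece of dialogue for when people are looking at the screen and another for when they aren't can be sketched in plain Swift. The names `ViewingContext` and `ResponseTemplate` and the sample wording are assumptions made for this example; they are not part of Apple's tooling, which handles this selection for you.

```swift
// Hypothetical types for illustration; Apple's tooling works differently.
enum ViewingContext {
    case screen     // the person is looking at the display
    case voiceOnly  // the person is driving, using AirPods, and so on
}

struct ResponseTemplate {
    let printedDialogue: String  // short text shown alongside the UI
    let spokenDialogue: String   // carries all the details by voice

    /// Picks the dialogue that fits how the person is interacting.
    func dialogue(for context: ViewingContext) -> String {
        switch context {
        case .screen: return printedDialogue
        case .voiceOnly: return spokenDialogue
        }
    }
}

let confirmation = ResponseTemplate(
    printedDialogue: "Ready to order?",
    spokenDialogue: "Your order is one tomato bisque for pickup. Ready to order?"
)
print(confirmation.dialogue(for: .screen))
print(confirmation.dialogue(for: .voiceOnly))
```

The spoken version repeats details that the on-screen UI would otherwise show, which is why it is longer than the printed one.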
So, now that you know the modes Siri can communicate in, let's talk about what Siri actually says. One of the things the Siri team works on is ensuring Siri doesn't speak more than it needs to. So, where appropriate, Siri will use shorter dialogue once people are familiar with a piece of functionality. Let's take a look at an example.
Here, Siri provides more information the first time you ask to change a setting.
The second time, people don't need the same detail.
One of the places where Siri does this is in attribution, or letting people know where information is coming from.
If you build a custom intent and you do your job well, people will use your shortcut a lot. In that case, Siri will stop including the attribution for your shortcut when people access it. But that means you have a responsibility to have conversational dialogue.
And so, here are some things to keep in mind to help you do that. Questions are a great conversational marker. They let people know you're expecting an answer, so be sure to use them when people need to provide information to continue.
Try to avoid jargon in your dialogue. You're an expert on your use case, but other people might not be.
Use best practices for spoken English, not written English. This keeps it feeling, well, conversational. And as much as possible, keep dialogue short. It's hard for people to remember everything they hear.
Thanks for watching and enjoy WWDC.