Voice assistants are a part of everyday life, and they’re here to stay. Juniper Research recently released a report predicting that by 2023 there will be 8 billion digital voice assistants in use, more than triple the estimated 2.5 billion in use at the end of 2018.
Even with this ubiquity, voice interactions have stayed relatively confined to a few core use cases, such as search, playing music, and the smart home. For voice to achieve its potential and truly change how users interact with, and move throughout, the world, it needs designers. After all, voice assistants aren’t people. To be effective, a voice interface needs to be intentionally designed. And so it falls to the creative community to take the reins of a voice-first future.
But voice design doesn’t have the deep academic tradition of many other design disciplines, such as graphic design and architecture. Few design schools offer rigorous training in voice UI. So what can designers do to get their foot in the door with this important new medium? As the founder of voice design startup Sayspring, and now the lead for voice UX at Adobe, I’ve been involved in countless voice projects. Here are my top four suggestions.
Do your homework
It would be hard to design the next great mobile app if you didn’t own a smartphone, and voice is no exception. Smart speakers have gotten so inexpensive that you can buy both an Amazon Echo Dot and a Google Home Mini for less than $100. Use them both and get familiar with voice as a primary interface. Note the different decisions the two platforms have made, and think about how those decisions affect the design of user experiences for each.
For example, Amazon Alexa refers to third-party applications as Skills, and has adopted an “assistant” metaphor. Companies can give Alexa a new “skill,” and the experience is delivered in Alexa’s voice.
Google Assistant, on the other hand, has used an “operator” metaphor for third-party apps, calling them Actions. Google Assistant hands off the user experience to a third-party app, and that experience is delivered in a different voice.
Whether a user is talking to a brand directly or the voice service is acting as a middleman has massive implications for creating effective user experiences. Designers should be aware of and design for those differences. Case in point: Because Google Assistant requires third-party apps to use a different voice, there is an explicit transition at the start of the app, and the user is more aware they are actively using an app. This may help users of a banking app understand when their sensitive financial information may be overheard by someone else in the room. On Amazon Alexa, the third-party experience is more seamless, which may leave users unclear as to whether they’re speaking with Alexa or their bank, or worse, if someone else in the room may be speaking to their bank. A designer may want to include a distinct audio signature, such as a welcome jingle, to better communicate to users they are now speaking to their banking app.
It’s a good idea to try several different third-party apps for these platforms, to see how other designers have approached voice. What is it like to check a bank balance, order movie tickets, or play a game with your voice? Which companies have designed an intuitive, easy-to-use experience, and which have completely missed the mark? Just as designers spend hours browsing gallery sites for visual design inspiration, take a look at the tens of thousands of Skills and Actions to find some voice design inspiration.
Start with a script
When kicking off a project involving a visual interface, a designer’s first stop is usually a blank whiteboard to sketch out some initial rough concepts of the user experience. But without a screen, where should you start? With a simple conversation.
With a blank sheet of paper or Word doc, just write out a basic back and forth between the user and voice app. What will the user say to get started? Will they know what they want to do, or will you need to present some options? What is the tone and personality of the experience?
Voice is a new medium to most designers, and it can be daunting to figure out a process to determine what works and what doesn’t. When in doubt, ask yourself what you would do for a visual design project, then think of how best to tweak it for voice.
Here’s a quick tip: Too many voice experiences attempt to be casual and cute (possibly rooted in the branding around “bots” in general). If you find yourself using exclamation points, ask yourself why. If you’re designing for, say, a bank, you’re probably on the wrong track.
Get the conversation started
Now that you’ve iterated on your script a few times, it’s time to start building this voice app, right? Not so fast.
The craft of user experience design has benefited from the explosion of prototyping tools that have become available to designers recently. Creating and sharing something interactive, that a user or stakeholder can actually use, is an invaluable way to get early feedback before moving on to the costly step of development.
For a low-fi interactive experience, there is a method of testing called Wizard of Oz prototyping (or WoZ). Inspired by “the man behind the curtain” in the movie The Wizard of Oz, WoZ can start with the simple step of having two people sit across from each other at a table, one serving as the user and the other role-playing as the voice interface. This process helps the designer better understand what the user will say and how they will respond to the voice interface, which in turn helps the designer iterate on the script to deliver an improved experience.
As the user experience is refined, the interactive prototyping process can increase in fidelity. Using voice-enabled design tools, or even recordings from a text-to-speech service, can make the experience feel closer to the real thing, as can playing the audio through a Bluetooth-connected speaker.
The more time a designer spends in the prototyping process, the faster the development process will go, and the more effective the user experience will be, so don’t rush it.
Learn the lingo
When working with voice interfaces and platforms, there are a few helpful terms to know. More than just jargon, understanding the concepts behind these terms and why they’re a part of the voice world is important in learning the nuances of voice and audio.
An utterance is the command or phrase that a user says to a voice interface. Within a project there may be unique utterances that each have a specific intent, or synonyms to cover the different ways users may say the same thing.
Intents are the actions with which utterances are associated. They will usually have names like Stop.Intent, or BookaTrip.Intent. It’s telling that the starting point for building a voice app is literally the question, “What is the intent of the user?”
A prompt is the response that a voice system will say back to a user. In a multistep interaction, the voice interface is responsible for driving the conversation forward and will “prompt” the user accordingly. In other words, it’s the instruction or response that a system “speaks” to the user.
A turn is used to describe what the user says and how the system responds. Asking for the weather, and a voice assistant then responding, would be considered one turn.
A barge-in is when a user interrupts a prompt while it’s still being played. This action has its own term because a poorly designed voice system will often respond with the wrong information, and a user will want to correct the system without waiting for the prompt to finish.
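To see how these terms fit together, here is a minimal sketch in Python. It is not how Alexa or Google Assistant works internally, and it doesn’t use either platform’s API. The travel app, the sample utterances, the prompts, and every name other than Stop.Intent and BookaTrip.Intent are made up for illustration. It simply matches an utterance to an intent and returns the prompt for that turn. Barge-in is not modeled, since it depends on live audio.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Intent:
    name: str              # what the user wants to do
    utterances: List[str]  # different ways a user might say it
    prompt: str            # what the system says back


# Hypothetical intents for an imaginary travel app (illustration only).
INTENTS = [
    Intent(
        name="BookaTrip.Intent",
        utterances=["book a trip", "plan a vacation", "i want to travel"],
        prompt="Where would you like to go?",
    ),
    Intent(
        name="Stop.Intent",
        utterances=["stop", "cancel", "never mind"],
        prompt="Okay, goodbye.",
    ),
]

# Spoken when no intent matches, so the conversation keeps moving.
FALLBACK_PROMPT = "Sorry, I didn't catch that. You can say, book a trip."


def normalize(utterance: str) -> str:
    """Lowercase and drop punctuation so 'Book a trip!' matches 'book a trip'."""
    kept = "".join(ch for ch in utterance.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def match_intent(utterance: str) -> Optional[Intent]:
    """Return the intent whose sample utterances include this one, if any."""
    text = normalize(utterance)
    for intent in INTENTS:
        if text in intent.utterances:
            return intent
    return None


def take_turn(utterance: str) -> Tuple[str, str]:
    """One turn: the user says something, the system responds with a prompt."""
    intent = match_intent(utterance)
    if intent is None:
        return ("Fallback", FALLBACK_PROMPT)
    return (intent.name, intent.prompt)
```

Calling take_turn("Book a trip!") returns the BookaTrip.Intent name along with its prompt, while an utterance the app doesn’t recognize falls through to the fallback prompt. Real platforms match utterances far more flexibly than this exact-phrase lookup, which is one reason a script should list many ways of saying the same thing.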
Finding the opportunities in voice
Want to find a voice design role? Not surprisingly, technology companies such as Amazon, Google, Microsoft, and even Facebook frequently have voice design positions on their job boards. Another option is to look at which companies and brands are launching Skills and Actions. Many companies that may seem non-obvious, such as insurance providers or financial institutions, are building out internal voice teams. Others have their apps built by creative agencies, which themselves may be hiring voice designers.
There’s an old adage that a designer’s portfolio should reflect not only the kind of work they’ve done, but also the kind of work they’re looking for. Whether you’ve gotten a voice project started in your current role or you’re still exploring voice on the side, make sure to document projects, prototypes, and even explorations in your portfolio. Companies are having trouble finding designers with voice experience, so make sure your work is properly showcased online for them to find.
With voice poised to become as big as touch screens, voice design skills will be as critical to design jobs as digital branding is today. Designers need to start on the steps above now, whether at work or on the side, so they’re ready.
Mark Webster is director of product at Adobe, focusing on voice integration for Adobe XD. He is also responsible for driving product strategy for emerging technologies within XD. Mark joined Adobe through the acquisition of the company he founded, Sayspring, which offers a design and prototyping platform for voice interfaces.