The risk of a bad user experience is higher for voice than for screen. Unless you invest in a custom solution from a smaller company, you need to work within the limitations of Echo and Google Home. Voice users also have a much lower tolerance for a bad experience - they are likely to be less familiar with voice than with screen, yet they expect it to be the quicker, easier option, which makes your window for success smaller.
To give your skill the best chance of succeeding, be sure to do the following:
Use natural language - write for conversation, and embrace long-form
While some might bark orders at their phone, the best experience is one that feels conversational. To create that, you need to use copywriters and do your research. Look into forums related to your industry and observe how your audience uses language.
This also has implications for SEO, which is now rewarding natural language and longer-form content.
Determine every possible iteration of your skill request, and develop a response to match
Did you know there are over 40,000 ways to request a flight? They include airlines, times, days, locations, and then mannerisms and phrasings for each combination. You need to identify each of the iterations for your skill, and develop responses to match. Nothing turns a user off quicker than, ‘Sorry, something went wrong. Please repeat your request.’
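To get a feel for how quickly those iterations multiply, here is a minimal Python sketch. The phrasings, airlines, days and destinations are made-up sample values, not taken from any real skill, and a production skill would hold far more of each.

```python
from itertools import product

# Hypothetical sample values; a real skill would need many more of each.
phrasings = ["book me a flight", "I need a flight", "find flights", "get me on a plane"]
destinations = ["to Sydney", "to Melbourne", "to Perth"]
airlines = ["", "with Airline A", "with Airline B"]  # "" means the user left it out
days = ["", "today", "tomorrow", "on Friday"]        # "" means the user left it out


def build_utterances():
    """Join every combination of the parts, skipping parts the user left out."""
    return [
        " ".join(part for part in combo if part)
        for combo in product(phrasings, destinations, airlines, days)
    ]


utterances = build_utterances()
print(len(utterances))  # 4 x 3 x 3 x 4 = 144
print(utterances[0])    # book me a flight to Sydney
```

Even with only a handful of options per part, the list already runs to 144 requests, which is why the real number climbs into the tens of thousands.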
...But also try to do some of the thinking
Identifying the thousands upon thousands of different ways different people can ask the same question is a monstrous and largely manual task, so chances are high you will miss one or two. Create shortcuts that allow your skill to fill in the blanks, rather than failing to progress unless it has a 100% match to something it recognises.
You can do this through account-linking and assumptions. Account-linking can pull in some of the user’s basic data - full name, birth date, etc. - without having to ask too many follow-up questions. Too many follow-ups make the experience feel like a phone conversation, and the point of difference (convenience) is lost.
Assumptions can be built into the AI engine to assist the skill. For example, if a user asks for session times for a movie but does not state the day, you can assume they mean today and respond with those times, rather than asking.
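The movie example could look something like the Python sketch below. The slot names ("movie", "day") and both function names are hypothetical, chosen only for illustration, and this is not how Alexa or Google Home handle it internally.

```python
from datetime import date


def resolve_session_request(slots, today=None):
    """Fill in a missing day with today's date instead of asking a follow-up.

    `slots` is a hypothetical dict of values picked out of the user's request.
    Returns the completed slots and a list of the slots that were assumed.
    """
    today = today or date.today()
    resolved = dict(slots)  # copy, so the caller's dict is left untouched
    assumed = []
    if not resolved.get("day"):
        resolved["day"] = today.isoformat()
        assumed.append("day")
    return resolved, assumed


def build_reply(slots, today=None):
    """Build the spoken reply, saying 'today' when the day was assumed."""
    resolved, assumed = resolve_session_request(slots, today)
    when = "today" if "day" in assumed else f"on {resolved['day']}"
    return f"Here are the session times for {resolved['movie']} {when}."


print(build_reply({"movie": "Dune"}))
print(build_reply({"movie": "Dune", "day": "Friday"}))
```

Saying "today" out loud in the reply tells the user which assumption was made, so they can correct it without being asked a follow-up question first.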
Don’t use Google Translate
The nuances of language and translation are a key challenge to taking a skill to audiences in different countries, with different languages. To capture these, it is best to employ a team on the ground in each of the countries you wish to target - translation is not a simple conversion.
Use your manners
The unique freedom that comes from interacting with a screen, rather than a person, is why we have keyboard warriors. The same can apply to voice interactions. It’s important to include polite responses as part of your output, not only for these users but also for older generations who will speak to technology as if it were a person sitting across from them. Programming greetings, ‘thank yous’ and ‘pleases’ will help to personalise the experience for your audience.
Avoid a branded voice, for now
For around fifty thousand dollars, brands can get their own ‘voice’. There are benefits to having your own - it can mirror your brand more closely, and appeal to your specific audience. However, because of the limitations of the technology, the risk of errors and a bad experience is high, and you may find that the voice you have invested in becomes associated with failure. So for now, let Alexa take the heat.
Remember that you’re catering to different senses and skills
Voice is a whole new experience compared to screen, using entirely different human skills and senses - speech and hearing, rather than sight. Just as you can’t translate directly into another language without considering cultural norms and mannerisms, you can’t directly translate your screen content to voice. Imagine trying to hold your user’s attention while Alexa reads out a full web page. Consider the best applications of voice for your users’ needs and develop an experience tailored to fulfil them.
These tips came from our recent event, Utilising Voice Technologies: Why type when you can talk.