Léonie Watson will be one of the speakers at WebExpo 2023. Her talk is titled “More than words: Designing and building voice interfaces”. In this talk, Léonie will explore voice character and design, conversational user experience, APIs for generating synthetic speech in the browser (and in the cloud), techniques for manipulating voice output, and yes, the importance of choosing the right words – all with examples to bring everything to life!
I took this opportunity to ask her a few questions about the topic.
Firstly, let me briefly introduce Léonie. Léonie is the director of TetraLogical, a member of the W3C Board of Directors, and a co-chair of the W3C WebApps Working Group. She worked in tech support in the 90s and taught herself HTML/CSS/JS to stop getting bored. By the time the “DotCom” bubble was at its height, Léonie was working as a web designer, and despite losing her sight in Y2K, she’s had an extraordinary amount of fun as an accessibility engineer ever since.
Radek: Léonie, how does a voice interface differ from a graphical user interface in terms of design and user experience?
Léonie: We’re designing for a different paradigm. Instead of thinking about colours, typography, and graphics, we’re thinking about voice, pronunciation, and words. Think about the difference between looking at a painting and listening to a piece of music.
There are similarities too. Structure, architecture, and the quality of the written content are important to both the voice UI and the graphical UI.
Radek: What are some of the unique design considerations that need to be taken into account when creating voice interfaces, particularly for users with disabilities?
Léonie: For people who are Deaf and who cannot hear the voice UI, it’s important that there is an alternative way to consume the same content. The voice might be a different way of consuming existing text content (like a web reader in the browser), or it may be necessary to display captions as an alternative to the voice UI.
Otherwise, whether or not someone has a disability doesn’t really matter. The important things are to make sure the voice can be understood by the target audience, that the speech rate is reasonable for a general audience, and that the content makes sense when it’s spoken rather than written.
It’s also important to make sure the voice characteristics can be customised by the user to suit their preferences – the ability to speed up or slow down the speaking rate, choose a different voice, or turn off the voice UI completely for example.
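In the browser, this kind of customisation maps naturally onto the Web Speech API. As a minimal sketch (the helper function and preference names are my own, not from the interview), user preferences for rate, pitch, and voice can be applied to an utterance before it is spoken:

```javascript
// Copy user speech preferences onto an utterance-like object.
// In a browser you would pass a real SpeechSynthesisUtterance;
// a plain object works the same way for illustration.
function applySpeechPreferences(utterance, prefs) {
  if (prefs.rate !== undefined) utterance.rate = prefs.rate;   // 0.1–10, default 1
  if (prefs.pitch !== undefined) utterance.pitch = prefs.pitch; // 0–2, default 1
  if (prefs.voice !== undefined) utterance.voice = prefs.voice; // a SpeechSynthesisVoice
  return utterance;
}

// In a browser, usage might look like:
// const u = new SpeechSynthesisUtterance('Hello, WebExpo!');
// applySpeechPreferences(u, { rate: 1.25 });
// speechSynthesis.speak(u);
// …and “turn off the voice UI” is simply never calling speak(),
// or calling speechSynthesis.cancel().
```

Keeping the preferences in one object makes it easy to persist them (for example in localStorage) so the user’s choices survive between visits.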
Radek: How do you approach user testing and feedback when designing voice interfaces?
Léonie: Exactly the same way you approach usability testing for any other product. You choose your participants, making sure to include people with disabilities amongst the larger group of course, and ask them to complete tasks or user journeys through the voice UI.
Radek: Can you provide examples of industries or applications where voice interfaces have been particularly successful, and what contributed to their success?
Léonie: Alexa and Siri are probably the best examples of successful voice UI. The Echo in particular because it’s possible to design, develop, and distribute skills – just like we design, develop, and distribute websites, web apps, apps, and applications.
It’s more difficult to think of good examples on the web because support for voice UI design and development isn’t good enough yet. That’s something I’m hoping to change though!
Radek: Looking ahead, what do you see as the future of voice interfaces? How do you see this technology evolving to better serve people with disabilities, and what new opportunities do you think it will create for designers and developers?
Léonie: We use voice UI in almost every other respect – it’s on our smartphones, on our laptops, in our houses, and even our workplaces. It’s usually quicker to say something than it is to type it or write it down, and besides, humans have been talking to each other for thousands of years, so there are no new skills to learn.
Speech has been used by people who cannot use a mouse, keyboard, or touchscreen, for decades now, so voice UI is really nothing new.
The voice generation market is growing rapidly, with AI being used to generate artificial voices that are incredibly realistic – even ones that are cloned from real voices.
On the web we just need to convince the browser companies to give us better ways to bring voice UI capability to websites and web apps.
Radek: Why should WebExpo participants join your talk?
Léonie: To find out what good voice UI design sounds like, to hear demonstrations of the latest in AI-generated artificial voices, to learn how to code using different voice UI technologies, and to find out about CSS Speech and join the growing number of people asking browsers to make it available on the web.
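For readers unfamiliar with CSS Speech: it is a W3C module (CSS Speech Module Level 1) that lets stylesheets describe how content should sound rather than how it should look. Browser support is effectively absent today, which is exactly what the talk campaigns to change, but a hypothetical stylesheet using its properties might look like this:

```css
/* Sketch of CSS Speech Module Level 1 properties — not yet
   supported in browsers, shown here only to illustrate the idea. */
h1 {
  voice-family: female;
  voice-rate: slow;
  voice-pitch: low;
  pause-after: strong; /* insert a strong pause after the heading */
}

.spoken-note {
  speak: always; /* speak this content even if it is hidden visually */
}
```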
Léonie, thank you very much for the interview and I am looking forward to your talk at WebExpo 2023!
For those who would like to join Léonie and other amazing speakers, there is a coupon code “poslepu” for 20% off the ticket price.