Virtual automation and virtual assistants are commonplace now, with Cortana, Siri, Google Now, and all of the GPS and audiobook options. I have heard people complain about how bad the virtual voices are, or how computerized they sound. I started to wonder if the voice of a virtual assistant matters as much as the technology behind it. If you have a virtual assistant that can help you with essentially anything you can think to ask it, do you throw in the towel because you don’t like the voice?
Top Complaints About Virtual Assistant Voices:
- The voice sounds like a robot
- There are mostly female voices
- They mispronounce easy words
- The voices don’t use local dialect or accents
- The voices have too much of an accent
- I just don’t like it
While some of these seem unfounded or silly, you have to consider how sound connects to our brains. It’s no different than listening to a song and hating it because of the lead singer’s voice. Think about how a smell can trigger a memory; sounds are similar. If your smartphone assistant’s voice sounds like your ex or like your annoying sibling, you aren’t going to like it, no matter how accurately it returns results. Think about the voice that you hate the most in the world, then imagine it giving you directions every day, or telling you where the nearest Starbucks is located. You twitched a little, didn’t you? It’s a pretty daunting task for virtual assistants to use a voice that can be more or less universally accepted, so they use a female voice that has likeable characteristics.
How is a virtual assistant built?
This video from an article on The Verge dives into how Siri’s responses were created, and it’s pretty fascinating. It explains that voice actors record common sounds and parts of words, which are then stitched together to build responses. It’s impossible to predict everything a virtual assistant will ever need to say, so complete words have to be assembled from pieces. This explains why the responses can sound so choppy. Technology is getting more advanced every day, so I’m sure we’ll hear more human voices soon enough.
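If you like to see ideas in code, the piecing-together described above can be sketched roughly like this. To be clear, this is a toy illustration and not how Siri actually works: the `RECORDED_UNITS` table and the `split_into_units` and `synthesize` functions are names I made up for the example, and the "clips" are just text labels standing in for real audio.

```python
# Toy sketch of building words from pre-recorded pieces.
# Everything here is hypothetical: a real assistant stores audio clips,
# not text labels, and uses far smaller and more numerous sound units.

# Pretend each entry is a short clip a voice actor recorded once.
RECORDED_UNITS = {
    "the": "[the]",
    "near": "[near]",
    "est": "[est]",
    "star": "[star]",
    "bucks": "[bucks]",
}


def split_into_units(word, units):
    """Greedily break a word into the longest recorded pieces available."""
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest remaining chunk first, then shrink it.
        for j in range(len(word), i, -1):
            if word[i:j] in units:
                pieces.append(word[i:j])
                i = j
                break
        else:
            # No recorded piece starts at this position.
            raise ValueError(f"no recorded piece covers {word[i:]!r}")
    return pieces


def synthesize(sentence, units=RECORDED_UNITS):
    """Return the sequence of clips that would be played for a sentence."""
    clips = []
    for word in sentence.lower().split():
        for piece in split_into_units(word, units):
            clips.append(units[piece])
    return " ".join(clips)


print(synthesize("The nearest Starbucks"))
# prints: [the] [near] [est] [star] [bucks]
```

Notice that "nearest" and "Starbucks" were never recorded as whole words; each one comes out as two clips played back to back. Every one of those joins is a place where the pitch or rhythm might not quite line up, which is one plausible reason the result can sound choppy.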
The Good, the Bad, and the Ugly of Virtual Voices
All in all, I believe that the convenience of a virtual assistant outweighs the shortcomings of the voice technology. Sure, it’s robotic, but it is a computer talking to you, not a human. If you need a strictly human voice, well, there are human assistants for that. If you are going virtual, you have to expect that you are going to have computer brains behind the curtain.
In a Perfect World…
In a perfect world, I’d love to have the option of building my own voice for my virtual assistant. Wouldn’t it be fun to use your kid’s voice, or a celebrity’s voice, or some combination of the two, to make your own assistant? As the virtual world works more and more toward catering to personal tastes and preferences, it seems more likely that we’ll have DIY voices instead of choosing from a list of female vs. male, and American vs. British. What is even more likely than that, though, is the advancement of the AI language database. In my lifetime, I expect the robotic-ness to be removed from the robots, and replaced with a more human-sounding voice.
Photo by: Gareth Simpson