Amazon knows that Alexa doesn’t get everything right, so the virtual personal assistant is picking up a new trick later this year: Guessing when you’re frustrated.
The feature represents a fundamental shift in how Alexa understands the people talking to it. A conversation with Rohit Prasad, vice president and chief scientist for Alexa, reveals that the virtual assistant now analyzes not just what you’re saying, but your tone of voice when telling Alexa it got a command wrong. Rather than only trying to understand what you said, Alexa also analyzes how you say it.
“As customers continue to use Alexa more often, they want her to be more conversational and can get frustrated when Alexa gets something wrong,” Amazon wrote in a blog post announcing the feature Wednesday. “When she recognizes you’re frustrated with her, Alexa can now try to adjust, just like you or I would do.”
The feature will be limited at first. It’s initially rolling out only for music requests. If you ask Alexa to play a song and it starts the wrong one, you can say “No, Alexa” and it will apologize and ask for a clarification.
Prasad tells OneZero that the update is similar to a feature that rolled out a few months ago, which prompts Alexa to respond quietly if a command is whispered to it — perfect for setting an alarm without waking your sleeping partner.
“It’s magical when you walk in late to your bedroom and your wife is asleep,” Prasad says. “Of course, the cost of a mistake is very high there.”
Frustration detection takes the technology a step further. It’s not just a matter of determining how loud you’re speaking, but guessing the mental state behind your words.
Prasad shared a little bit about how the system works under the hood. When you ask Alexa a question, two additional deep neural networks will analyze your voice, beyond the ones used to decipher your command. One will try to detect words that indicate frustration (the most obvious one is “no”), while the other will analyze the tone of your voice.
“The tonality is a big feature. You could have said yes [in response to Alexa], but be sarcastic, right?” says Prasad.
Based on what those two algorithms determine about your tone of voice and the words themselves, a third algorithm will then make the final call on whether the interaction was satisfactory or not.
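The three-stage flow Prasad describes can be sketched roughly as follows. This is an illustrative toy, not Amazon's implementation: the cue-word list, the two acoustic features (`pitch_variance`, `energy`), the equal weights, and the threshold are all assumptions made up for the example, and simple arithmetic stands in for the neural networks.

```python
# Hypothetical cue words; the article names only "no" as an example.
FRUSTRATION_WORDS = {"no", "wrong", "stop"}


def lexical_score(transcript: str) -> float:
    """Stand-in for the word-level network: 1.0 if any cue word appears, else 0.0."""
    words = {w.strip(".,!?").lower() for w in transcript.split()}
    return 1.0 if words & FRUSTRATION_WORDS else 0.0


def tone_score(pitch_variance: float, energy: float) -> float:
    """Stand-in for the tone network: averages two made-up features, clamped to [0, 1]."""
    raw = 0.5 * pitch_variance + 0.5 * energy
    return max(0.0, min(1.0, raw))


def is_frustrated(transcript: str, pitch_variance: float, energy: float,
                  threshold: float = 0.4) -> bool:
    """Stand-in for the third model: equal-weight fusion of the two scores."""
    combined = 0.5 * lexical_score(transcript) + 0.5 * tone_score(pitch_variance, energy)
    return combined >= threshold


if __name__ == "__main__":
    print(is_frustrated("No, Alexa", 0.8, 0.9))    # explicit cue word plus sharp tone
    print(is_frustrated("Yes", 1.0, 1.0))          # a "sarcastic yes": tone alone trips it
    print(is_frustrated("Yes, thanks", 0.1, 0.1))  # calm and agreeable
```

Keeping the word score and the tone score separate until the final step is what lets a sarcastic “yes” register as dissatisfaction even though no cue word was spoken.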
There’s much more that can be done by analyzing voice, especially around health. Startups like Winterlight Labs in Toronto are using audio clips of people speaking to determine signs of dementia and other mental illnesses. Alexa speech scientist Viktor Rozgic nodded to health monitoring in an Amazon blog post earlier this year.
“Emotion recognition has a wide range of applications: it can aid in health monitoring; it can make conversational-A.I. systems more engaging; and it can provide implicit customer feedback that could help voice agents like Alexa learn from their mistakes,” Rozgic wrote.
When asked about the health uses for this kind of feature, Prasad mentioned that he was familiar with this kind of work, having done some of it himself in collaboration with DARPA, but said there was nothing on Amazon’s roadmap that he could share.
Amazon has been working for years on developing A.I. that could understand a person’s tone. In 2016, MIT Tech Review wrote that emotion detection was an active area of work for Amazon’s Alexa team.
The update also adds a new wrinkle to Amazon’s continued insistence on gendering Alexa as female. Unlike tech companies such as Apple and Google, which allow you to make Siri or Google Assistant male, Amazon has kept the assistant’s persona as female, repeatedly referring to the technology as “she” in its presentation on Wednesday.
Gendering A.I. software invites tough questions for the tech company, like what response Alexa gives when it’s sexually harassed, or whether the virtual assistant might reinforce sexist stereotypes of how women should act in the workplace.
“Alexa’s passive responses to sexual harassment helped perpetrate a sexist expectation of women in service roles: that they ought to be docile and self-effacing, never defiant or political, even when explicitly demeaned,” wrote Quartz’s Leah Fessler when Amazon rewrote Alexa’s responses to sexual harassment.
The virtual assistant had previously responded to insults like “you’re a bitch” and “you’re a slut” with “well, thanks for the feedback,” though after a series of articles including Fessler’s, the Alexa team rewrote the script to include a “disengage mode” where Alexa would now say “I’m not going to respond to that.”
Now, the female virtual assistant will apologize when it gets something wrong.
“This ‘submissiveness in the face of anger’ feature in a feminine-voiced digital assistant strengthens so many of the dangerous gender stereotypes and gendered power structures we’ve been trying to break down for decades, if not centuries,” says Mar Hicks, a professor at the Illinois Institute of Technology who studies technology and gender. “But I also think this is a good example of why it’s not enough to just have technically skilled people building social tech: this feature is practically a cry for help for Amazon to hire humanists and workers who understand the ripple effects of social engineering like this.”
Hicks says that it’s not enough for technologists like those behind Alexa to simply make something that works; they need to be aware of who the technology is actually meant to serve.
From Amazon’s perspective, Alexa has to play a number of roles depending on the context in which it’s used. Some of those, like acting as a companion for elderly users, might require more patience.
“Customers come in very different flavors, and how they use Alexa is very different,” Prasad says. “Relationships vary from assistant to companion to advisor.”
It’s still being decided how Alexa will play each of these roles. A human caregiver might make a sarcastic joke because they have a rapport with the person they’re working with or deal with frustration with levity, but that might feel out of place for Alexa. Or maybe it wouldn’t.
“Alexa shouldn’t be offensive. It should be compassionate. It should be factual. If you asked her for a Wikipedia story, it should be very factual and sort of look like an expert in that field,” Prasad said. “These are hard questions. Have we solved it perfectly? No. But have we done a great job? I think yes.”
All Rights Reserved for Dave Gershgorn