The Voicebot Chronicles: How Well Do Our Machines Understand Us?

May 15, 2020

A smart speaker is displayed at the TMall Genie booth during the CES 2019 Convention at the Las Vegas Convention Center on Jan. 8, 2019. (David Becker/Getty Images)

Voicebots, or voice-activated digital assistants like Siri, Alexa, Cortana and the Google Assistant, are getting pretty good at responding to simple requests, like asking for the weather or to hear the latest news headlines from KQED.

But even though tens of millions of people in this country regularly use them, these bots are still not very smart.

Voicebots often mishear things. A few months ago, for instance, Apple's assistant, Siri, accidentally went off in the British Parliament when it mistook a politician saying the word "Syria" for its own name, "Siri."

And if you've ever tried to ask a voicebot to tell you a joke ...

Q: What song always gets chickens dancing at a wedding?
A: The Y-Hen-C-A.

... you'll know these technologies haven't yet been imbued with much of a sense of humor.

The reality is that we’re still light years away from having voicebots resemble Samantha, the charismatic computer-based personality from Spike Jonze’s 2013 movie “Her.”

So when it comes to bridging the gap between today’s digital voice assistants and those of the future, experts are learning a lot from the rules governing how real people talk to each other — including one set of rules known as the "cooperative principle."

"The basis for the cooperative principle is that in general, we want to have a good conversation," said Cathy Pearl, head of conversation design outreach at Google. "We want to be understood and understand the other person."

Cathy Pearl is the head of conversation design outreach at Google. (Courtesy Cathy Pearl)

Formulated by philosopher of language Paul Grice in the 1970s, the cooperative principle describes how people make themselves understood in everyday conversation. Its four maxims are as follows:

Try to say stuff that is true. Don’t lie.
Don’t say too little or too much. Simply say what’s required to get your point across.
Stick to the relevant details. Don’t include a bunch of information that has nothing to do with the topic at hand.
Express yourself clearly. Avoid things like ambiguity.

Pearl said that until Grice came along with his theory, no one had really broken down the components of conversations in a way that could help technologists get closer to building voice systems that actually work.

"When you have a frustrating experience, it's often because the cooperative principle was broken," Pearl said of the challenges of conversation.

People break those rules all the time, and Pearl said the theory has helped explain why some conversations go wrong in her own life.

"For example, if I ask my son, who's 11, 'Do you know what time it is?' and he says, 'Yes,' the reason that's annoying is because he's breaking the cooperative principle," she said. "He knows I really want to know the time, and he's taking it literally."

But following the principle doesn’t solve all conversation problems between humans or between humans and voicebots.

Pearl said one of the challenges is that humans tend to ask for the same thing in several different ways. Take a seemingly simple task, like ordering pizza.

"The way I order pizza may not be exactly the way you do it," Pearl said. "I might blurt everything out at once, like, 'I want three large mushroom pizzas!' Where someone else might start the conversation with, 'I want to order a pizza.' "

Pearl said voice interface designers have to try to think of all the many different ways people might ask for pizza. And that’s really hard.

"Once it's out in the wild, you will still find that people might respond in a way that you didn't quite expect," Pearl said.
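To make the pizza problem concrete, here is a minimal sketch in Python of one common way to handle it, often called slot filling: pull out whatever details the speaker volunteered, then ask only for what is missing. Everything in it is an assumption made for illustration, including the word lists and the function names parse_order and missing_slots. It is not how Google or any other company's assistant actually works, and a real system would rely on a trained language model rather than keyword matching.

```python
import re

# Hypothetical vocabularies, invented for this illustration.
SIZES = ("small", "medium", "large")
TOPPINGS = ("mushroom", "pepperoni", "cheese")
NUMBER_WORDS = {"a": 1, "one": 1, "two": 2, "three": 3, "four": 4}


def parse_order(utterance):
    """Collect the order details (slots) the speaker mentioned."""
    words = re.findall(r"[a-z]+|\d+", utterance.lower())
    slots = {"quantity": None, "size": None, "topping": None}
    for word in words:
        if word.isdigit():
            slots["quantity"] = int(word)
        elif word in NUMBER_WORDS and slots["quantity"] is None:
            slots["quantity"] = NUMBER_WORDS[word]
        elif word in SIZES:
            slots["size"] = word
        elif word in TOPPINGS:
            slots["topping"] = word
    return slots


def missing_slots(slots):
    """List the details the bot still has to ask about."""
    return [name for name, value in slots.items() if value is None]


# The two ordering styles Pearl describes.
blurted = parse_order("I want three large mushroom pizzas!")
gradual = parse_order("I want to order a pizza.")

print(missing_slots(blurted))  # nothing left to ask
print(missing_slots(gradual))  # size and topping still needed
```

The first customer leaves the bot nothing to ask, while the second still needs follow-up questions about size and topping. A request the word lists don't anticipate, such as "the usual, please," would slip through entirely, which is the kind of surprise Pearl describes.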

Getting voicebots to a place where they can better grasp our wants and needs, and even be funny, will take time.

But to be fair, we have a lot more experience with conversation than they do.

"I mean, we humans have been using language and speaking for about 150,000 years," Pearl said. "And computers have been doing it, what? Let's say, 20 to 50."

That’s why, Pearl said, for now we should cut the machines some slack, even when they tell a really bad joke.

This story is part of The Voicebot Chronicles, an experimental interactive series from KQED designed for smart speakers about navigating a world where the human voice is increasingly mediated by technology.

To experience The Voicebot Chronicles on Alexa, say “Alexa, open The Voicebot Chronicles.” On Google Assistant, say “Hey Google, talk to The Voice Bot Chronicles.” Estimated experience time: 20 minutes.
