"This is a problem that's been around for a long time," says computer scientist John Hershey, now at Google. "When people start to lose their hearing this is one of the first things to go — the ability to separate one voice from another."
Hershey says nobody knows how our brains are able to separate voices, so it's difficult to tell a computer how to do it. But when he worked at Mitsubishi Electric Research Laboratories in Cambridge, Mass., he and his colleague Jonathan Le Roux used a technique called deep learning that allowed a computer, over time, to learn how to separate voices.
Deep learning is all the rage in AI these days. It works something like this: You give the computer some input, in this case the sound of people talking. To the computer, this is at first just meaningless noise. But then you give the computer a transcript of what the people were saying.
Like a baby learning new words, the computer figures out what sounds go with what words.
Once it has practiced, and practiced, and practiced, it can apply what it has learned to voices it's never heard before.
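To make that learning loop concrete, here is a toy sketch in Python. It is not Hershey's system, and it is far simpler than a real deep network: a single artificial "neuron" is shown made-up sound features paired with the right answers, nudges its internal settings after every mistake, and is then tried on examples it has never seen. The feature values, the labels and the names `train` and `predict` are all invented for illustration.

```python
import math
import random


def train(examples, epochs=200, lr=0.5, seed=0):
    """Fit a one-neuron classifier by gradient descent on labeled examples."""
    rng = random.Random(seed)
    n_features = len(examples[0][0])
    weights = [rng.uniform(-0.1, 0.1) for _ in range(n_features)]
    bias = 0.0
    for _ in range(epochs):  # "practiced, and practiced, and practiced"
        for features, label in examples:
            z = sum(w * x for w, x in zip(weights, features)) + bias
            pred = 1.0 / (1.0 + math.exp(-z))  # guess between 0 and 1
            error = pred - label               # how wrong the guess was
            # Nudge each weight in the direction that shrinks the error.
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


def predict(model, features):
    """Return 0 or 1 for a new example using the learned weights."""
    weights, bias = model
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if z > 0 else 0


# Hypothetical training data: two invented "sound" features per example,
# labeled 0 for one word and 1 for another (the "transcript").
training = [
    ([0.1, 0.9], 0),
    ([0.2, 0.8], 0),
    ([0.9, 0.2], 1),
    ([0.8, 0.1], 1),
]

model = train(training)

# Two examples the model never saw during training.
print(predict(model, [0.15, 0.85]), predict(model, [0.85, 0.15]))
```

A real speech system works on far richer input and stacks many layers of such units, but the cycle is the same: guess, compare with the right answer, adjust, repeat.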
Hershey invited me to send him voices to see if his program could separate them from each other. I sent him a 10-second clip of NPR's Kelly McEvers and Ari Shapiro speaking simultaneously.
Here's what I sent.
And here's what the computer came up with.
In the separated recording you can clearly hear what Kelly is saying, but there's still a bit of Ari's voice in the background.
Hershey says the computer's flubs may be because Kelly's and Ari's voices are quite different from the voices used in the computer's training.
And that's a problem, according to critics of the deep learning approach.
"You still need a lot of data to make this technique work," says New York University psychologist Gary Marcus, who works on artificial intelligence.
Deep learning in computers resembles how scientists think the human brain works. The brain is made up of roughly 100 billion neurons. Researchers say the connections among these neurons change as people learn a new task. Something similar is going on inside a computer.
But human brains learn a great deal on their own. Deep learning, Marcus says, works only if you have large amounts of data to train the computer.
"And sometimes you can't find that data," he says.
Marcus worries people may be too enthralled with this approach to see its limitations.
"One of the key questions right now is how risky is it if I make a crazy error?" he says.
So let's say the computer in a driverless car sees someone wearing a T-shirt with a picture on it of a highway receding into the distance. It's just possible the computer would be misled into thinking the road on the shirt was a real road.
"They make a mistake. They're not perfect. And the question is how much does that cost you?" Marcus says.
He notes that if you're using artificial intelligence to pick a song people might like, an error is hardly catastrophic. "But if you made a pedestrian detector that's 99 percent correct that sounds good," he says, "... then you do the math and think about how many people would die every day if you had a fleet of those cars and it's really not very good at all."
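Marcus doesn't give figures, but the math he has in mind can be sketched in a few lines of Python. The fleet size and the number of pedestrians each car passes per day are hypothetical values chosen only to show the scale. A missed detection is also not the same thing as a death, so the result is best read as a count of chances for something to go wrong.

```python
def expected_missed_detections(accuracy, cars, pedestrians_per_car_per_day):
    """Expected number of pedestrian encounters per day the detector gets wrong."""
    encounters = cars * pedestrians_per_car_per_day
    return encounters * (1.0 - accuracy)


# Hypothetical fleet: 1,000 cars, each passing 100 pedestrians a day.
# Only the 99 percent figure comes from Marcus's example.
missed = expected_missed_detections(
    accuracy=0.99, cars=1_000, pedestrians_per_car_per_day=100
)
print(f"{missed:.0f} missed detections per day")
```

Under those assumptions, a detector that is 99 percent correct still gets about 1,000 encounters wrong every day, which is the point Marcus is making.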
But computer scientist Astro Teller is more positive about what AI can do. He heads a Google spinoff company called X. First of all, he says, car-driving computers are already getting it right more than 99 percent of the time. He says even if they don't always get it right, cars equipped with AI computers are likely to do way better than humans in unexpected situations.
But he also believes deep learning has its limits.
"Most people in the field of artificial intelligence are excited about deep learning, and the progress that it's making but I think very few of them think that it's going to be the whole nut," Teller says.
He's convinced that researchers will come up with new AI techniques that will make computers much smarter than they are today.
"I don't think there are any inherent limits in the kinds of problems computers can solve," Teller says. "And I hope that there aren't any limits."