Trending Topics

Microsoft Demonstrates Real-Time Speech Translation for Skype


Microsoft will start offering on-the-fly language translation within Skype by the end of 2014, first as a beta app for Windows 8 and then, it hopes, as a full commercial product within the next two and a half years.

A couple of years back, Microsoft and the University of Toronto demonstrated a rough system that could take English spoken into a microphone and render it as spoken Mandarin. Microsoft’s researchers credited their low error rates to deep neural networks, artificial “brains” that can learn features of voice, text and image data. IBM’s Watson also has deep learning in its arsenal of artificial-intelligence techniques, and Google recently paid $400 million for DeepMind, a British company working in the same area.
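
To make the “learn features” idea concrete, here is a minimal toy sketch, nothing like the scale of the networks Microsoft or Google actually train: a single hidden layer learned by backpropagation on the XOR problem, a task that a network with no hidden layer cannot solve. The data, layer sizes and learning rate are all made up for illustration.

```python
# Toy illustration of a neural network learning intermediate features.
# Everything here (data, layer sizes, learning rate) is illustrative;
# production speech systems stack many more layers over far more data.
import numpy as np

rng = np.random.default_rng(0)

# XOR has no linear solution, so the hidden layer must learn useful
# intermediate features of the input to get it right.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)   # input -> hidden features
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)   # hidden features -> output

lr = 1.0
for _ in range(20000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backward pass: squared-error gradients, chained layer by layer
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0)

print(np.round(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2), 2))  # ~ [0, 1, 1, 0]
```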

Now Microsoft’s research is about to pay off in the form of Skype Translate. At Code Conference 2014 on Tuesday, CEO Satya Nadella and Gurdeep Pall, the head of Microsoft’s Skype and Lync division, showed off a similar technique embedded in the popular videoconferencing service.

Pall, speaking English, held a conversation with a German colleague speaking in her native tongue; as each speaker finished a sentence, Skype read out the translation in the other speaker’s language. It wasn’t perfect and, unlike in the 2012 demo, the translation wasn’t rendered in the speaker’s own voice, but it was certainly accurate enough to be useful.
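
Under the hood, the demo implies a three-stage chain: recognize the finished sentence, translate the text, then synthesize speech for the listener. The sketch below is a guess at that shape only; the helper functions are toy placeholders, not Skype’s actual API.

```python
# A guessed-at outline of a sentence-by-sentence speech-translation chain:
# speech recognition -> text translation -> speech synthesis. The helpers
# are toy placeholders for illustration, not Microsoft's real components.
from dataclasses import dataclass

@dataclass
class Utterance:
    text: str
    language: str  # e.g. "en", "de"

# Toy phrase table standing in for a real translation model.
PHRASES = {("en", "de"): {"How was your week?": "Wie war deine Woche?"},
           ("de", "en"): {"Wie war deine Woche?": "How was your week?"}}

def recognize_speech(audio: bytes, language: str) -> Utterance:
    """Pretend speech-to-text: treat the audio bytes as a transcript."""
    return Utterance(audio.decode("utf-8"), language)

def translate(utt: Utterance, target_language: str) -> Utterance:
    """Pretend machine translation via the toy phrase table."""
    table = PHRASES.get((utt.language, target_language), {})
    return Utterance(table.get(utt.text, utt.text), target_language)

def synthesize_speech(utt: Utterance) -> bytes:
    """Pretend text-to-speech: tag the text with the output language."""
    return f"[{utt.language} audio] {utt.text}".encode("utf-8")

def relay_sentence(audio: bytes, speaker_lang: str, listener_lang: str) -> bytes:
    """Run one finished sentence through the whole chain."""
    heard = recognize_speech(audio, speaker_lang)
    translated = translate(heard, listener_lang)
    return synthesize_speech(translated)

print(relay_sentence(b"How was your week?", "en", "de"))
# b'[de audio] Wie war deine Woche?'
```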

Nadella expressed slight bemusement at certain capabilities of the new technology, particularly its capacity for “transfer learning”:

“You teach in English, it learns English. Then you teach it Mandarin — it learns Mandarin, but it becomes better at English. And then you teach it Spanish and it gets good at Spanish, but it gets great at both Mandarin and English. And quite frankly none of us know exactly why.”
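
One common explanation for that effect, though not necessarily Microsoft’s, is that the languages share most of their parameters, so every Mandarin or Spanish example also refines the features the English model relies on. The numpy sketch below shows that shared-weights mechanism with made-up data and sizes; it illustrates the idea only, not the real system.

```python
# Minimal sketch of cross-language transfer via shared weights: one shared
# "encoder" plus a small output head per language. Data, sizes and learning
# rate are invented for illustration; this is not Microsoft's architecture.
import numpy as np

rng = np.random.default_rng(0)
DIM_IN, DIM_HIDDEN, DIM_OUT = 20, 16, 5

# One shared encoder...
W_shared = rng.normal(scale=0.1, size=(DIM_IN, DIM_HIDDEN))
# ...and one output head per language.
heads = {lang: rng.normal(scale=0.1, size=(DIM_HIDDEN, DIM_OUT))
         for lang in ("english", "mandarin", "spanish")}

def forward(x, lang):
    h = np.tanh(x @ W_shared)      # shared features
    return h @ heads[lang], h      # language-specific prediction

def train_step(x, target, lang, lr=0.05):
    """One squared-error gradient step. It updates BOTH the language head
    and the shared encoder; the shared update is where transfer lives."""
    global W_shared
    pred, h = forward(x, lang)
    err = pred - target
    grad_head = np.outer(h, err)                     # d loss / d head
    grad_h = heads[lang] @ err                       # backprop into features
    grad_shared = np.outer(x, grad_h * (1 - h**2))   # tanh derivative
    heads[lang] -= lr * grad_head
    W_shared -= lr * grad_shared

# Training on any language touches W_shared, so (in principle) the English
# head benefits from Mandarin and Spanish examples too.
for lang in ("english", "mandarin", "spanish"):
    for _ in range(100):
        x = rng.normal(size=DIM_IN)       # made-up input features
        target = rng.normal(size=DIM_OUT) # made-up training target
        train_step(x, target, lang)
```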

In a Microsoft Research blog post accompanying Nadella’s announcement, the company lays out a timeline of major advances in speech recognition and machine translation. Among the milestones it cites are the arrival of deep learning in 2006 (thanks to the work of University of Toronto professor and Google distinguished researcher Geoffrey Hinton) and Microsoft’s adoption of the technique in 2009. Other companies actively pursuing deep-learning research include Facebook and Baidu.

Read the full story at gigaom.com
