Researchers at Microsoft have made a major breakthrough in speech recognition, creating a technology that recognizes the words in a conversation as well as a person does.
According to a paper submitted by the team of Microsoft researchers and engineers, the system makes the same number of errors as professional transcriptionists, or fewer.
The 5.9% error rate is about equal to that of people who were asked to transcribe the same conversation. It is the lowest rate ever recorded on the industry-standard Switchboard speech recognition task.
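The article does not spell out how the error rate is calculated. Benchmarks of this kind are usually scored by word error rate: the number of word substitutions, deletions, and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The sketch below illustrates that general idea only. The function name and the sample sentences are made up for illustration, and this is not the scoring pipeline the Microsoft team used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative helper (hypothetical name), assuming whitespace-separated
    words and no text normalization.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]  # i deletions to reach an empty hypothesis
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref_words)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    hypothesis = "the cat sat on mat"
    print(f"{word_error_rate(reference, hypothesis):.1%}")
```

Under this measure, a 5.9% rate corresponds to roughly six word errors for every hundred words in the reference transcript.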
“We’ve reached human parity,” said Xuedong Huang, the company’s chief speech scientist. “This is an historic achievement.”
Harry Shum, the executive vice president of the Microsoft Artificial Intelligence and Research group, said he wouldn’t have thought the team could achieve this, or that it would even be possible.
Geoffrey Zweig, who manages the Speech & Dialog research group, said the achievement is the culmination of 20 years of effort.
The major milestone is that, for the first time, a computer can identify the words in a conversation as well as a human would.
To reach this milestone, the team used Microsoft’s Computational Network Toolkit (CNTK), a homegrown system for deep learning that the research team has made available on GitHub under an open source license.