DeepMind Technologies: Google Teaching Computers to Speak, Produce Human-Sounding Voice

By Abdul Muqeet | September 11, 2016

Google DeepMind

Google's artificial intelligence unit DeepMind has made a significant advance in generating realistic-sounding synthetic speech. Using a technique that learns from samples of real human speech and models the audio waveform directly, Google researchers have been able to produce human-like voices.

The system is called WaveNet. In tests conducted by Google, listeners rated WaveNet's speech as more human-sounding than the output of other voice assistants and text-to-speech applications. However, it still fell short of real human speech.

Many virtual assistants are already on the market, including Apple's Siri, Microsoft's Cortana and Amazon's Alexa, but their speech is made by stringing together fragments of words and speech, which results in a halting, emotionless, computer-like sound.

In a blog post published on Friday, DeepMind, which was bought by Google back in 2014, said the artificial intelligence used in WaveNet could mimic human speech by learning how to form the individual sound waves a human voice creates.

Not only does that allow for more natural-sounding speech, but it also allows the computer to mimic virtually any sound, potentially including realistic reproduction of music.

As impressive as that sounds, and despite the many applications that come to mind, WaveNet has its limitations.

DeepMind said the technology is not yet ready for commercial use because of how much computation it requires. The audio it is trained on is sampled 16,000 times per second or more, and to produce output the system has to predict what each new point on the sound wave should be, based on the samples that came before it.
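To get a feel for why predicting audio one sample at a time is so costly, here is a minimal Python sketch of the idea. It is not DeepMind's code, and it does not use a neural network: the trained model is replaced by a toy predictor that can only continue a pure tone from the two previous samples. The names `make_predictor`, `generate` and `predictions_needed` are made up for this illustration; only the 16,000 samples-per-second figure comes from the article.

```python
import math

SAMPLE_RATE = 16_000  # samples per second, the figure quoted in the article


def predictions_needed(seconds, sample_rate=SAMPLE_RATE):
    """Number of one-sample predictions needed for a clip of this length."""
    return round(seconds * sample_rate)


def make_predictor(freq_hz, sample_rate=SAMPLE_RATE):
    """Return a toy 'model' that predicts the next sample of a pure tone.

    Assumption for illustration only: a sine wave obeys
    x[n] = 2*cos(w)*x[n-1] - x[n-2], so two past samples are enough.
    A real system would use a trained network and a much longer history.
    """
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate)

    def predict_next(history):
        return coeff * history[-1] - history[-2]

    return predict_next


def generate(predict_next, seed, seconds, sample_rate=SAMPLE_RATE):
    """Build the waveform one sample at a time from the samples before it."""
    samples = list(seed)
    total = predictions_needed(seconds, sample_rate)
    while len(samples) < total:
        # Each new sample depends on earlier output, so the loop cannot skip ahead.
        samples.append(predict_next(samples))
    return samples


# Half a second of a 440 Hz tone, seeded with its first two samples.
step = 2.0 * math.pi * 440.0 / SAMPLE_RATE
audio = generate(make_predictor(440.0), [0.0, math.sin(step)], seconds=0.5)
print(len(audio), "samples generated one at a time")
print(predictions_needed(2.0), "predictions for two seconds of audio")
```

Even in this toy version, every sample has to wait for the ones before it. With a large neural network in place of the one-line predictor, that loop runs thousands of times for each second of sound, which gives a rough sense of the cost DeepMind describes.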

WaveNet may not be integrated into Google Assistant soon, but it is fair to assume that it will be ready for the market in a few years.

©2024 Telegiz All rights reserved. Do not reproduce without permission