HANNOVER MESSE 2019, 01 - 05 April
Machine Learning

Google is teaching its voice services to be more communicative

The Internet giant is expanding its speech synthesis and speech recognition services. The idea is to convert text into speech, and speech into text, in the cloud.

12 Sep. 2018
Google is teaching its voice services to be more communicative (Photo: Google)

In a blog post, Google announced the general availability of Cloud Text-to-Speech. The cloud-based speech synthesis service has been expanded to cover 14 languages, with Google counting American, British and Australian English as separate languages. The choice of voices has been extended to 24 using the WaveNet neural network. This technology, developed by the London-based company DeepMind, analyzes audio recordings of real human speakers to make the synthesized speech sound more natural.
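To give a rough idea of what such a synthesis request looks like, the following sketch assembles a request body in Python. It assumes the JSON field names of the service's REST interface as we understand them (`input`, `voice`, `audioConfig`); the helper `build_synthesis_request` and the sample text are hypothetical and not taken from Google's announcement. The sketch only builds the body and does not contact the service.

```python
import json


def build_synthesis_request(text, language_code="en-GB", voice_name=None):
    """Return a JSON-serializable body for a text-to-speech request.

    Assumption: field names follow the service's REST interface;
    check them against the current API reference before use.
    """
    voice = {"languageCode": language_code}
    if voice_name is not None:
        # Optional: ask for one specific voice, e.g. a WaveNet voice.
        voice["name"] = voice_name
    return {
        "input": {"text": text},
        "voice": voice,
        "audioConfig": {"audioEncoding": "MP3"},
    }


if __name__ == "__main__":
    # Australian English is requested here as its own language variant.
    body = build_synthesis_request("Welcome to the trade fair.", "en-AU")
    print(json.dumps(body, indent=2))
```

Because each English variant counts as a separate language, the variant is selected through the language code rather than through a separate accent setting.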

Google is also expanding its Cloud Speech-to-Text offering. To transcribe a recording of two people talking to each other by phone, the service simply uses the separate audio channels to assign the text to the respective speaker. For recordings of conferences, for example, users can tell the system the number of participants via the programming interface (API). Cloud Speech-to-Text can then distinguish the voices more and more reliably over the course of the conversation and update the assignments. Google has also added automatic detection of the spoken language to its range of features.
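The three options described above (one transcript per channel, a stated number of speakers, and language detection from a list of candidates) might be expressed in a request configuration roughly as follows. This is a sketch under assumptions: the field names mirror the REST interface as we understand it and may differ between API versions, and the helpers `build_recognition_config` and `transcripts_by_channel` as well as the sample results are hypothetical.

```python
from collections import defaultdict


def build_recognition_config(language_code, channels=1, speakers=None,
                             alternative_languages=()):
    """Return a JSON-serializable recognition config.

    Assumption: field names follow the REST interface and should be
    verified against the API version actually in use.
    """
    config = {"languageCode": language_code}
    if channels > 1:
        # Phone call: each party is recorded on its own audio channel.
        config["audioChannelCount"] = channels
        config["enableSeparateRecognitionPerChannel"] = True
    if speakers is not None:
        # Conference: tell the service how many voices to expect.
        config["diarizationConfig"] = {
            "enableSpeakerDiarization": True,
            "minSpeakerCount": speakers,
            "maxSpeakerCount": speakers,
        }
    if alternative_languages:
        # Candidate languages for automatic language detection.
        config["alternativeLanguageCodes"] = list(alternative_languages)
    return config


def transcripts_by_channel(results):
    """Group the top transcript of each result by its channel tag."""
    grouped = defaultdict(list)
    for result in results:
        top = result["alternatives"][0]["transcript"]
        grouped[result["channelTag"]].append(top)
    return dict(grouped)


if __name__ == "__main__":
    print(build_recognition_config("en-US", channels=2))
    # Hypothetical results, shaped like a per-channel response.
    sample = [
        {"channelTag": 1, "alternatives": [{"transcript": "Hello?"}]},
        {"channelTag": 2, "alternatives": [{"transcript": "Hi, it's me."}]},
        {"channelTag": 1, "alternatives": [{"transcript": "Good to hear you."}]},
    ]
    print(transcripts_by_channel(sample))
```

For a two-party phone call the channel tag alone identifies the speaker, which is why no voice analysis is needed in that case; the speaker count only matters when several voices share one channel.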