Google Cloud Improves Text-to-speech and Speech-to-text

Bhaswati Sarkar

On Thursday, Google Cloud announced several improvements to its AI-powered speech tools.

Google Cloud updated its Text-to-Speech product with additional voices and languages. The update adds beta support for seven new languages or variants (Danish, Norwegian Bokmål, Polish, Portuguese (Portugal), Russian, Slovak, and Ukrainian), bringing the total number of supported languages to 21.

The product now supports 106 voices in total, after the addition of 31 new WaveNet voices and 24 new standard voices. By comparison, Amazon Web Services’ Polly, the primary competitor to Google’s Text-to-Speech service, supports 58 voices.

“Thanks to unique access to WaveNet technology powered by Google Cloud TPUs, we can build new voices and languages faster and easier than is typical in the industry,” said Dan Aharon, a Google product manager.

To help users optimize audio playback for different hardware, such as headphones for podcasts, the update also brings Google’s Text-to-Speech Device Profiles feature to general availability.
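As a rough illustration of how a device profile is selected, the sketch below builds the JSON body for a Text-to-Speech synthesis request. It assumes the field names used in the public REST reference (`audioConfig.effectsProfileId` and the `headphone-class-device` profile ID); the helper name `build_synthesis_request` and the sample text are hypothetical, and no network call is made.

```python
import json

# Profile ID for headphone playback, as listed in the Text-to-Speech
# audio profile documentation (assumed here; check the current docs).
HEADPHONE_PROFILE = "headphone-class-device"


def build_synthesis_request(text, language_code="en-US", profile=HEADPHONE_PROFILE):
    """Return a JSON-serializable request body for a synthesis call.

    Hypothetical helper: it only assembles the payload and does not
    contact the API.
    """
    return {
        "input": {"text": text},
        "voice": {"languageCode": language_code},
        "audioConfig": {
            "audioEncoding": "MP3",
            # The profile is passed as a list; the service applies the
            # listed effects on top of the synthesized audio.
            "effectsProfileId": [profile],
        },
    }


request_body = build_synthesis_request("Welcome back to the show.")
print(json.dumps(request_body, indent=2))
```

Swapping the profile ID, for example to one intended for phone speakers, is the only change needed to target different playback hardware.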

Google Cloud also expanded the general availability and improved the quality of its Speech-to-Text transcription tools.

It announced the general availability of multi-channel recognition, which lets the Speech-to-Text API distinguish between multiple audio channels. This is useful for recordings that involve several speakers.
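A minimal sketch of what multi-channel recognition looks like from a developer's side is shown here. It assumes the REST field names `audioChannelCount`, `enableSeparateRecognitionPerChannel`, and `channelTag` from the public Speech-to-Text reference; the helper names and the sample response are invented for illustration and are not real API output.

```python
from collections import defaultdict


def build_multichannel_config(channel_count, language_code="en-US"):
    """Assemble a recognition config that asks for per-channel transcripts.

    Encoding and sample rate are left out on the assumption that the
    audio is a WAV or FLAC file whose header carries them.
    """
    return {
        "languageCode": language_code,
        "audioChannelCount": channel_count,
        "enableSeparateRecognitionPerChannel": True,
    }


def transcripts_by_channel(response):
    """Group the top transcript of each result by its channel tag."""
    grouped = defaultdict(list)
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            grouped[result.get("channelTag", 0)].append(alternatives[0]["transcript"])
    return dict(grouped)


# Hypothetical response for a two-channel call recording (made-up data).
sample_response = {
    "results": [
        {"channelTag": 1, "alternatives": [{"transcript": "Hi, how can I help?"}]},
        {"channelTag": 2, "alternatives": [{"transcript": "My order is late."}]},
        {"channelTag": 1, "alternatives": [{"transcript": "Let me check that."}]},
    ]
}

config = build_multichannel_config(2)
by_channel = transcripts_by_channel(sample_response)
for channel, lines in sorted(by_channel.items()):
    print(channel, lines)
```

Keeping each channel's transcript separate is what makes this useful for recordings such as two-party phone calls, where each speaker is on a separate channel.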

Last year, Google released premium models for video and enhanced phone transcription in beta; both are now generally available. Google improved the video and phone models using usage data shared by premium-service customers who opted in to data logging.
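For readers wondering how one of these models is requested, the sketch below picks a recognition config based on the kind of audio. It assumes the `model` values `video` and `phone_call` and the `useEnhanced` flag from the current public Speech-to-Text reference, which may differ from the API as it stood at the time of the announcement; the helper name and the `audio_kind` labels are hypothetical.

```python
def build_recognition_config(audio_kind, language_code="en-US"):
    """Return a recognition config for the given kind of audio.

    audio_kind is a hypothetical label ("video" or "phone") used only
    by this sketch; it is not an API field.
    """
    if audio_kind == "video":
        # Assumed model name for audio extracted from video.
        return {"languageCode": language_code, "model": "video"}
    if audio_kind == "phone":
        # Assumed model name plus the flag that requests the enhanced variant.
        return {
            "languageCode": language_code,
            "model": "phone_call",
            "useEnhanced": True,
        }
    raise ValueError(f"unsupported audio kind: {audio_kind!r}")


video_config = build_recognition_config("video")
phone_config = build_recognition_config("phone")
print(video_config)
print(phone_config)
```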

Google said the improved video model makes 64 percent fewer transcription errors, and the improved phone model 62 percent fewer.

Google has also cut prices for the premium phone and video models. Customers can use the upgraded models without opting in to data logging, but those who do opt in pay less.

With these updates, developers can build intelligent voice applications that reach a wider audience and offer greater efficiency and functionality.

