NetEase Youdao Open-Sources EmotiVoice: A Powerful and Modern Text-to-Speech Engine


NetEase Youdao has announced the official release of “Yi Mo Sheng” (EmotiVoice), an open-source text-to-speech (TTS) engine that is available on GitHub. It offers both a web interface and a scripting interface for generating results in batches, which makes it well suited to applications that need emotional speech synthesis across many timbres.

Youdao developed the engine in-house. It currently offers more than 2,000 timbres and supports both Chinese and English. It also includes an emotion synthesis feature that can produce speech conveying joy, excitement, sadness, or anger, along with a wide range of other expressive vocalizations.

Among open-source text-to-speech engines, EmotiVoice stands out mainly for its emotional synthesis. Rather than producing a single neutral reading, it lets you generate speech in a wide range of emotions, including happiness, excitement, sadness, and anger.

A user-friendly web interface is available, and results can be generated in bulk through a scripting interface. The easiest way to try EmotiVoice is with its Docker image. A machine with an NVIDIA GPU is required, and the NVIDIA Container Toolkit must be installed on Linux or on Windows with WSL2 if it is not already present.
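As a rough illustration of the Docker route, the sketch below assembles a `docker run` command that starts a container in the background with GPU access and a web port bound to localhost. The flags (`--detach`, `--gpus all`, `--publish`) are standard Docker options, but the image name `syq163/emoti-voice:latest` and port 8501 are assumptions that should be checked against the EmotiVoice README, and the helper names `build_docker_command` and `launch` are made up for this example.

```python
import shutil
import subprocess


def build_docker_command(image, host_port=8501, container_port=8501, use_gpu=True):
    """Return the argv list for starting a container in the background."""
    cmd = ["docker", "run", "--detach"]
    if use_gpu:
        # "--gpus all" exposes every NVIDIA GPU to the container; it only
        # works when the NVIDIA Container Toolkit is installed on the host.
        cmd += ["--gpus", "all"]
    # Bind to 127.0.0.1 so the web interface is reachable only from this machine.
    cmd += ["--publish", f"127.0.0.1:{host_port}:{container_port}"]
    cmd.append(image)
    return cmd


def launch(image, **kwargs):
    """Run the command, failing early if Docker is not installed."""
    if shutil.which("docker") is None:
        raise RuntimeError("docker executable not found on PATH")
    return subprocess.run(build_docker_command(image, **kwargs), check=True)


if __name__ == "__main__":
    # Assumed image name and port; confirm both in the project's README.
    print(" ".join(build_docker_command("syq163/emoti-voice:latest")))
```

Separating command construction from execution makes it possible to inspect or log the exact command before anything is started. If the assumed port is correct, the web interface would then be reachable at http://localhost:8501 once the container is running.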

In the current implementation, text prompts control the emotion and speaking style of the generated voice. The style is defined by pitch, speed, energy, and emotion, and gender is not used as a style factor. Adding a combined style/timbre controller, as in the original closed-source design, would be relatively straightforward.
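To make the idea of prompt-controlled batch generation more concrete, here is a minimal sketch of how a batch of requests might be organized, with each request pairing a speaker, an emotion prompt, and the text to speak. Everything in it is hypothetical: the `SynthesisRequest` class, the `build_batch` helper, the speaker ID, and the pipe-separated line layout are illustrative only and are not EmotiVoice's actual input format, which should be taken from the project's documentation.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class SynthesisRequest:
    """One utterance to synthesize (hypothetical structure)."""
    speaker_id: str
    emotion_prompt: str
    text: str

    def to_line(self, sep="|"):
        """Serialize to one line, rejecting values that would break the layout."""
        fields = (self.speaker_id, self.emotion_prompt, self.text)
        for value in fields:
            if sep in value or "\n" in value:
                raise ValueError(f"field contains separator or newline: {value!r}")
        return sep.join(fields)


def build_batch(speaker_ids, emotion_prompts, texts):
    """Create one request per (speaker, prompt, text) combination."""
    return [
        SynthesisRequest(speaker, prompt, text)
        for speaker, prompt, text in product(speaker_ids, emotion_prompts, texts)
    ]


if __name__ == "__main__":
    batch = build_batch(["8051"], ["happy", "sad"], ["Hello there."])
    for request in batch:
        print(request.to_line())
```

Keeping the emotion prompt as a separate field means the same sentence can be rendered in several emotions simply by varying one value, which is the kind of workload a batch interface is meant for.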

Dhanshree Shenwai is a computer science engineer with experience at FinTech companies spanning the financial, cards and payments, and banking domains, and she has a keen interest in applications of AI. She is enthusiastic about exploring new technologies and advancements that make everyone’s life easier.
