
NetEase Youdao Open-Sources EmotiVoice: A Powerful and Modern Text-to-Speech Engine


NetEase Youdao has announced the formal release of “Yi Mo Sheng” (EmotiVoice), an open-source text-to-speech (TTS) engine now available on GitHub. Alongside a web interface, it offers a scripting interface for generating results in batches, making it well suited to applications that require emotionally expressive synthesized voices.

Created by Youdao, the engine currently offers more than 2,000 timbres and supports both Chinese and English. It also includes a distinctive emotion-synthesis feature that can render joy, excitement, sadness, anger, and a range of other expressive vocal styles.

Among open-source text-to-speech engines, EmotiVoice stands at the top of its class. Its most noticeable capability is emotional synthesis, which lets you generate speech across a wide spectrum of emotions, including happiness, eagerness, sadness, and anger.

A user-friendly web interface is available, and results can be generated in bulk through a scripting interface. Docker images make it simple to try EmotiVoice: a machine with an NVIDIA GPU is required, and on Linux or Windows WSL2 you should first install the NVIDIA container toolkit if you have not already done so.
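As a rough sketch of what the batch scripting workflow might look like, the Python snippet below sends a few sentences to an EmotiVoice container assumed to be running locally; the endpoint URL, port, JSON fields, and speaker ID are illustrative placeholders rather than the project's documented API.

# Hypothetical batch-synthesis sketch: the endpoint, port, JSON fields, and
# speaker ID below are illustrative placeholders, not EmotiVoice's documented API.
import requests

ENDPOINT = "http://localhost:8501/synthesize"  # assumed local container address

sentences = [
    "Welcome to EmotiVoice.",
    "Batch synthesis keeps every clip consistent.",
]

for i, text in enumerate(sentences):
    # Each request carries the text plus a speaker/timbre identifier.
    resp = requests.post(ENDPOINT, json={"text": text, "speaker": "8051"})
    resp.raise_for_status()
    with open(f"clip_{i}.wav", "wb") as f:
        f.write(resp.content)  # server is assumed to return raw WAV bytes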

In the current system, prompts control how the generated voice sounds and feels. Gender is not modeled; instead, the prompt emphasizes tone, tempo, intensity, and emotion. A style/timbre controller, like the one in the original closed-source design, could be added fairly easily.
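To illustrate the idea of prompt-controlled emotion, the sketch below synthesizes the same sentence several times while varying only an assumed prompt field; as before, the endpoint and field names are placeholders, not the project's documented interface.

# Hypothetical illustration of prompt-controlled emotion: the same text is
# synthesized repeatedly, varying only the "prompt" field. Endpoint and field
# names are assumptions, not EmotiVoice's documented interface.
import requests

ENDPOINT = "http://localhost:8501/synthesize"  # assumed local container address
TEXT = "The results are in, and they are remarkable."

for emotion in ["happy", "excited", "sad", "angry"]:
    resp = requests.post(
        ENDPOINT,
        json={"text": TEXT, "speaker": "8051", "prompt": emotion},
    )
    resp.raise_for_status()
    with open(f"remarkable_{emotion}.wav", "wb") as f:
        f.write(resp.content)  # one clip per emotion prompt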


Dhanshree Shenwai is a Computer Science Engineer with solid experience in FinTech companies covering the Financial, Cards & Payments, and Banking domains, and a keen interest in applications of AI. She is enthusiastic about exploring new technologies and advancements that make everyone's life easier in today's evolving world.



