sideload: How and why to generate a low-cost voice clone

Friday, 10 May 2024


Voice cloning has become commonplace in our daily lives. For sideloading, it is necessary to clone the voice in order to give our avatar realism and a more natural ability to communicate. Nowadays, the best programs need only twenty seconds of recorded speech to clone any human voice realistically, including intonation and accent, and use it as the basis for a text-to-speech (TTS) engine. This matters because the avatar's verbal output is text, which has to be read aloud to communicate with its interlocutor.

But there is a catch: most computers on which the avatar would be simulated have limited processing capacity, and neural networks are very expensive programs in this respect. This calls for other, less demanding strategies.

And sometimes what is old but good is still valid. Early TTS used sampling and mixing: it extracted samples, each pairing a word with its associated sound, and then recombined them to match the text given to the system. The result is a monotonous, robotic, but recognizably human voice.
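
The sampling-and-mixing idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code of any particular project: the names `synthesize`, `write_wav` and `vocabulary`, the 16 kHz sample rate and the 80 ms pause between words are all hypothetical choices.

```python
import re
import struct
import wave

SAMPLE_RATE = 16000  # assumed sample rate (Hz) shared by every recording


def tokenize(text):
    """Lowercase the text and return its runs of letters as words."""
    return re.findall(r"[^\W\d_]+", text.lower())


def synthesize(text, vocabulary, gap_ms=80):
    """Concatenate the recorded samples of each known word in `text`.

    `vocabulary` maps a word to a list of 16-bit PCM sample values.
    Unknown words are skipped and returned so the caller can report them.
    """
    gap = [0] * (SAMPLE_RATE * gap_ms // 1000)  # short silence between words
    output, missing = [], []
    for word in tokenize(text):
        samples = vocabulary.get(word)
        if samples is None:
            missing.append(word)
            continue
        if output:
            output.extend(gap)
        output.extend(samples)
    return output, missing


def write_wav(path, samples):
    """Save mono 16-bit samples as a WAV file using the standard library."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(SAMPLE_RATE)
        wav_file.writeframes(struct.pack(f"<{len(samples)}h", *samples))
```

Calling `synthesize("Hello, world!", vocabulary)` returns the joined samples together with the list of words that have no recording yet. Because every word is played back exactly as it was recorded, with no adjustment of pitch or rhythm, the output has the flat, robotic quality described above.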

Low-Cost-TTS is of that nature: the user will have to add vocabulary day by day until it meets their needs as closely as possible.
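
One way to organize that day-by-day growth is to check which words the avatar needs most often but cannot yet say. The sketch below is a hypothetical helper, not part of the project: `words_to_record` and its parameters are assumptions made for illustration.

```python
import re
from collections import Counter


def words_to_record(texts, vocabulary, top=10):
    """Return the most frequent words in `texts` that have no recording yet.

    `vocabulary` is any collection of already recorded words (set or dict).
    """
    counts = Counter(
        word
        for text in texts
        for word in re.findall(r"[^\W\d_]+", text.lower())
        if word not in vocabulary
    )
    return counts.most_common(top)
```

Feeding it the avatar's recent text output gives a short list of words to record next, most needed first.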

Meanwhile, we continue to investigate realistic, low-cost voice cloning in search of a satisfactory solution.

Source: https://github.com/marcobaturan/Low-Cost-TTS
