I dabbled in voice cloning using Resemble.AI, an AI voice generator. It claims to clone your voice with as little as 3 minutes of training audio.
How it works
Resemble.AI collects about 25 short audio samples of 3-5 seconds each. You record them by reading aloud the short phrases it presents.
After this, you wait until your AI voice is ready. This took about 12 hours for me.
Once your AI voice is ready, it will speak any text you give it, in your voice.
So what was the result?
This is the one generated by Resemble.
This is me saying the same thing.
Observations:
The voice cloning is good. The voice sounds a bit robotic, though. Hopefully this can be improved with more training (available in the paid plans).
Pauses and voice modulation can be improved
Does not recognize (and auto-correct) spelling mistakes
The last two points could be addressed by calling the voice cloning service through its API and adding the following features around it:
Embed pauses and voice modulation with SSML
Auto-correct spelling errors
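Here is a rough sketch of what that preprocessing could look like in Python, before the text is handed to a voice cloning API. Everything in it is my own illustration: the function names (correct_spelling, to_ssml), the tiny word list, and the pause and rate values are assumptions, and the actual API call is left out. The `<speak>`, `<break>` and `<prosody>` tags are standard SSML, but which tags a given service honours is something to check in its docs.

```python
import difflib
import re
from xml.sax.saxutils import escape

# Hypothetical word list for illustration only; a real setup would use
# a full dictionary or a dedicated spell-checking library.
KNOWN_WORDS = ["voice", "cloning", "hello", "welcome", "course", "today"]


def correct_spelling(text, vocabulary=KNOWN_WORDS, cutoff=0.8):
    """Replace each word with its closest vocabulary match, if one is close enough."""
    def fix(match):
        word = match.group(0)
        lower = word.lower()
        if lower in vocabulary:
            return word
        candidates = difflib.get_close_matches(lower, vocabulary, n=1, cutoff=cutoff)
        if not candidates:
            return word  # no confident match: leave the word untouched
        fixed = candidates[0]
        return fixed.capitalize() if word[0].isupper() else fixed

    return re.sub(r"[A-Za-z]+", fix, text)


def to_ssml(sentences, pause_ms=400, rate="slow"):
    """Wrap sentences in SSML with a fixed pause between them and one speaking rate."""
    pause = f'<break time="{pause_ms}ms"/>'
    # Escape &, < and > so the text cannot break the SSML markup.
    body = pause.join(escape(s.strip()) for s in sentences)
    return f'<speak><prosody rate="{rate}">{body}</prosody></speak>'


if __name__ == "__main__":
    raw = "Welcom to voice clonning. Today we talk about the course."
    cleaned = correct_spelling(raw)
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", cleaned)
    print(to_ssml(sentences))  # this string would be sent as the text to synthesize
```

Fuzzy matching against a small word list is fragile: with this cutoff, a valid word that is missing from the list but resembles one in it (say "custom" next to "customer") would get "corrected" wrongly, so a proper dictionary matters here.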
Voice Cloning use cases
Creating online courses
Creating custom messages from your CEO for every customer
Let me know what you think about this in the comments!