Is this “the world’s most flexible sound machine”? Nvidia unveils its voice and music AI
It’s just a matter of time before AI comes for my job. I’m sure I’m already losing voiceover gigs to AI programs. Why pay someone to read the script for a dry corporate video when an audio engineer can just call up a text-to-speech AI to do it for free?
But at least I’m safe when a client wants me because I have a voice like, well, mine. They hire me because I do whatever it is I do.
Not so fast, sunshine. Nvidia, one of the world’s most valuable companies and a leader in AI, has just announced Fugatto, which is short for Foundational Generative Audio Transformer Opus 1. It can generate or transform any combination of voices, sounds, and music according to whatever prompt you give it. For example:
- It can create music based on a text prompt.
- It can add or remove instruments or other audio elements from an audio sample.
- For sampled/cloned voices, it can give them an accent (handy for ad agencies looking to target different regions with the same ad). I’m sure it has the ability to change the language spoken by that voice.
- For those same voice clones, it can add or subtract emotion.
On the plus side, Fugatto can:
- Help create prototype edits of new songs quickly.
- Add or subtract elements in the mix and arrangement.
- Add special effects to existing songs.
The technology is pretty magical and will no doubt become a valuable tool in the world of audio. But I’ll say it again: We don’t need AI to create art or remove human artists from the equation. We need AI to do the sh*t we don’t want to do so humans have more time to create art.