Artificial intelligence (AI) is finding uses in more and more fields. Generating images from natural-language text descriptions has been very popular in recent months, and now the focus is shifting from images to sound, specifically the human voice.
AudioLM’s AI is capable of reproducing a speaker’s pitch, timbre, articulation, and even pauses for breathing
One of the latest projects presented by Google’s research division is AudioLM, an AI capable of generating high-quality audio from a recording of a human voice only a few seconds long. One of its distinctive characteristics is that it does not require prior training on transcriptions, and the audio it generates preserves the syntactic and semantic character of the speaker’s original recording, on which it builds its new “discourse”.
Beyond faithfully reproducing the pitch, timbre, intensity, and articulation of the source voice, it can also add the speaker’s breathing sounds and, of course, form meaningful sentences. AudioLM achieves this by analyzing semantic and acoustic tokens, the former acting as conditioning factors for the latter.
Based on these capabilities, AudioLM is also capable of converting text into speech and of letting computer systems or intelligent assistants generate synthetic voices. For now, Google has not opened AudioLM to the public, but it is not the only AI specialized in this kind of work.
One of the best-known uses of this technology is the series “Obi-Wan Kenobi”, from the streaming platform Disney+. In it, the voice of Darth Vader was not recorded by James Earl Jones, the actor who has traditionally voiced the character on the big screen; instead, his voice was recreated and “cloned” by the Ukrainian company Respeecher.
Jones signed a contract that allowed the archive of recordings of his voice to be processed by AI, so that the studio could have new lines of dialogue in the voice of the Dark Lord of the Sith without the actor, 91, having to go to the dubbing studio.
Another Disney production, this one also from its Lucasfilm division, used this technology to synthetically recreate the voice of Luke Skywalker in the series “The Book of Boba Fett”.