OpenAI’s New Voice Engine Can Clone a Voice in 15 Seconds

First published: Apr 3, 2024

Written by Shipra Sanganeria, Cybersecurity & Tech Writer
Fact-Checked by Justyn Newman, Lead Cybersecurity Editor

On March 29, 2024, OpenAI announced a new offering called Voice Engine in a blog post. The software is currently available only to a select group of users because of concerns that it could be misused to create deceptive fake audio.

The text-to-speech software, under development since late 2022, allows a user to create a synthetic version of anybody's voice based on a 15-second audio clip. The application, which is already part of ChatGPT’s Read Aloud feature, was initially trained using a mix of licensed and publicly available information, said Jeff Harris, a member of OpenAI’s product team for Voice Engine, in an interview with TechCrunch.

In 2023, the company started re-testing the tool with a small group of trusted partners in the education and healthcare sectors.

“These small-scale deployments are helping to inform our approach, safeguards, and thinking about how Voice Engine could be used for good across various industries,” said OpenAI in its blog post.

The few organizations that OpenAI partnered with include education technology company Age of Learning; AI visual storytelling platform HeyGen; for-profit social enterprise Dimagi; AI alternative communication app Livox; and the Norman Prince Neurosciences Institute at Lifespan, a not-for-profit health system.

The company said that these real-world applications of Voice Engine generated impressive results, including restoring the voices of patients with speech impairments, translating content into local languages, and providing reading assistance.

Despite these successful outcomes, OpenAI has yet to decide whether to make Voice Engine available for public use. The company has, however, started putting safeguards in place ahead of a potential public debut.

OpenAI requires its partners to obtain “explicit and informed consent” from the original speaker, and to have a legal right to use that speaker’s voice, before generating any synthetic audio. Users of the technology would also be required to disclose to listeners that the audio clip is AI-generated. In addition, OpenAI will watermark audio generated by Voice Engine to trace its origin and monitor its use.
