April 2nd, 2024: OpenAI, the company behind the popular ChatGPT, has announced Voice Engine, a new text-to-speech AI model that can create synthetic voices from a 15-second sample of recorded audio.
The technology, first developed in late 2022, has the potential to offer numerous benefits, such as reading assistance, global reach for creators, and personalized speech options for non-verbal individuals.
However, despite these potential benefits, OpenAI has decided to preview the technology but not widely release it for now because of concerns about potential misuse.
The company initially planned to launch a pilot program for developers to sign up for the Voice Engine API earlier this month but scaled back its ambitions after considering the ethical implications.
In a statement, OpenAI said, “We are choosing to preview but not widely release this technology at this time. We hope this preview of Voice Engine both underscores its potential and also motivates the need to bolster societal resilience against the challenges brought by ever more convincing generative models.”
The company has been testing the technology with select partner companies since last year, requiring them to agree to terms of use that prohibit impersonation without consent and mandate informed consent from the individuals whose voices are being cloned.
OpenAI has also implemented watermarking in every voice sample to help trace the origin of any audio generated by its Voice Engine model.
To address the potential risks associated with voice-cloning technology, OpenAI has offered three recommendations for how society can adapt: phasing out voice-based authentication for bank accounts, educating the public about the possibility of deceptive AI content, and accelerating the development of techniques to track the origin of audio content.
The company emphasizes the need for a cautious and informed approach to any broader release of synthetic voice technology.
“We hope to start a dialogue on the responsible deployment of synthetic voices and how society can adapt to these new capabilities,” OpenAI stated. “Based on these conversations and the results of these small scale tests, we will make a more informed decision about whether and how to deploy this technology at scale.”
As voice-cloning technology continues to advance, it is crucial for companies like OpenAI to weigh the potential risks and ethical implications while working to harness the benefits for society.

