
PlayAI clones voices on command

In 2016, Hammad Syed and Mahmoud Felfel, a former WhatsApp engineer, thought it would be a good idea to build a text-to-speech Chrome extension for Medium articles. The extension, which could read any Medium story aloud, was featured on Product Hunt. A year later, it had grown into a full-fledged company.

“We saw a greater opportunity in helping individuals and organizations create realistic audio content for their applications,” Syed told TechCrunch. “Without the need to build their own models, they could deliver human-quality voice experiences faster than ever before.”

Syed and Felfel's company, PlayAI (formerly PlayHT), describes itself as a “voice interface for AI.” Customers can choose from a variety of predefined voices or clone a voice, then use PlayAI's API to integrate text-to-speech into their apps.

Controls allow users to adjust the intonation, cadence, and tenor of voices.
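For developers, this kind of integration typically amounts to a single HTTP call to a hosted text-to-speech endpoint. The sketch below shows the general shape of such a request in Python; the endpoint URL, field names, and voice parameters are hypothetical placeholders, not PlayAI's documented API.

```python
# Minimal sketch of calling a hosted text-to-speech API over HTTP.
# The endpoint URL, field names, and voice parameters are hypothetical
# placeholders -- consult the provider's documentation for the real interface.
import requests

API_URL = "https://api.example-tts.com/v1/speech"  # hypothetical endpoint
API_KEY = "YOUR_API_KEY"

payload = {
    "text": "Welcome back! Your order has shipped.",
    "voice": "predefined-voice-or-clone-id",  # a stock voice or a cloned one
    # Hypothetical knobs for intonation, cadence, and tenor:
    "speed": 1.0,
    "pitch": 0.0,
    "style": "conversational",
}

response = requests.post(
    API_URL,
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30,
)
response.raise_for_status()

# Save the returned audio so an app can play it back.
with open("speech.mp3", "wb") as f:
    f.write(response.content)
```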

PlayAI also offers a “playground” where users can upload a file to create a read-aloud version of it, as well as a dashboard for producing more sophisticated audio narration and voice-overs. Recently, the company entered the “AI agent” game with tools that can automate tasks like answering customer calls for a business.

PlayAI's agent feature, which builds automation tools around the company's text-to-speech engine. Image credit: PlayAI

One of PlayAI's more interesting experiments is PlayNote, which turns PDFs, videos, photos, songs, and other files into podcast-style shows, read-aloud summaries, one-on-one debates, and even children's stories. Like Google's NotebookLM, PlayNote generates a script from an uploaded file or URL and passes it to a set of AI models that work together to create the finished product.
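For readers curious how a file-to-podcast feature like this is typically put together, here is a rough sketch of the general pipeline: extract text from the source, have a language model draft a two-host script, then synthesize each line with a TTS voice. The function names and stub bodies are illustrative stand-ins, not PlayAI's actual implementation.

```python
# Rough sketch of a "file to podcast" pipeline. All functions are illustrative
# stubs standing in for real parsing, language-model, and TTS calls.

def extract_text(path_or_url: str) -> str:
    # Placeholder: a real pipeline would parse PDFs, transcribe audio/video, etc.
    return f"(text extracted from {path_or_url})"


def write_podcast_script(source_text: str) -> list[tuple[str, str]]:
    # Placeholder: a real pipeline would prompt a language model for a script.
    return [
        ("Host A", f"Today we're talking about {source_text}."),
        ("Host B", "Interesting -- let's dig in."),
    ]


def synthesize(speaker: str, line: str) -> bytes:
    # Placeholder: a real pipeline would call a TTS engine with that speaker's voice.
    return f"[{speaker}] {line}\n".encode()


def file_to_podcast(path_or_url: str) -> bytes:
    text = extract_text(path_or_url)
    script = write_podcast_script(text)
    # Concatenate the per-line clips into a single "episode".
    return b"".join(synthesize(speaker, line) for speaker, line in script)


if __name__ == "__main__":
    print(file_to_podcast("chicken-mole-photo.jpg").decode())
```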

I tried it, and the results weren't bad. PlayNote's “Podcast” setting produces clips roughly comparable in quality to NotebookLM's, and the tool's ability to ingest photos and videos makes for some fascinating creations. Given a photo of a chicken mole dish I had recently eaten, PlayNote wrote a five-minute podcast script about it. We truly live in the future.

Admittedly, like all AI tools, it produces strange artifacts and hallucinations from time to time. And while PlayNote does its best to adapt a file to your chosen format, don't expect a dry legal filing to make for great source material, for instance. See: Musk's lawsuit against OpenAI as a bedtime story:

PlayNote's podcast format is made possible by PlayAI's newest model, PlayDialog, which Syed says can use the “context and history” of a conversation to generate speech that reflects the conversation's flow. “PlayDialog uses the historical context of a conversation to guide prosody, emotion and pacing, delivering conversations with natural delivery and appropriate tone,” he continued.

PlayAI, a close competitor of ElevenLabs, has previously been criticized for its laissez-faire approach to safety. The company's voice cloning tool requires users to check a box indicating they “have all necessary rights or consents” to clone a voice – but there is no enforcement mechanism. I had no trouble creating a clone of Kamala Harris' voice from a recording.

This is worrying given the potential for fraud and deepfakes.

PlayAI's PlayDialog model can generate two-way “duplex” conversations that sound relatively natural. Image credit: PlayAI

PlayAI also claims that it automatically detects and blocks “sexual, offensive, racist or threatening content.” But that wasn't the case in my testing. I used the Harris clone to generate speech that I honestly can't embed here and never once saw a warning message.

Meanwhile, PlayNote's community portal, which is full of publicly generated content, hosts files with explicit titles like “woman having oral sex.”

Syed tells me that PlayAI responds to reports of voices being cloned without consent, like this one, by banning the user responsible and immediately removing the cloned voice. He also argues that PlayAI's highest-fidelity voice clones, which require 20 minutes of voice samples, are more expensive ($49 per month when billed annually, or $99 per month) than most scammers are willing to pay.

“PlayAI has several ethical safeguards in place,” Syed said. “We have implemented robust mechanisms to detect, for instance, whether a voice was synthesized using our technology. When abuse is reported, we immediately confirm the origin of the content and take decisive action to address the situation and prevent further ethical violations.”

I would certainly hope that's the case – and that PlayAI steers clear of marketing campaigns featuring dead tech celebrities. If PlayAI's moderation isn't robust, it could face legal challenges in states like Tennessee, which has a law preventing platforms from hosting AI tools that make unauthorized recordings of a person's voice.

PlayAI's approach to training its voice cloning AI is also somewhat opaque. The company won't reveal where it sources the data for its models, ostensibly for competitive reasons.

“PlayAI primarily uses open datasets (as well as licensed data) and proprietary datasets created in-house,” Syed said. “We don't use user or developer data from our products to train models. Our models are trained on millions of hours of real human speech, delivering male and female voices in multiple languages and accents.”

Most AI models are trained on public web data, some of which may be copyrighted or subject to restrictive licenses. Many AI providers argue that the fair use doctrine shields them from copyright claims. But that hasn't stopped data owners from filing class action lawsuits alleging that providers used their data without permission.

PlayAI hasn't been sued. However, its terms of use suggest users won't be indemnified should they face legal threats.

Voice cloning platforms like PlayAI have faced criticism from actors, who fear that voice work will eventually be replaced by AI-generated voices and that performers will have little control over how their digital doubles are used.

Hollywood actors' union SAG-AFTRA has signed voice cloning agreements with some startups, including online talent marketplace Narrativ and Replica Studios, which it describes as “fair” and “ethical.” But even those deals have come under intense scrutiny, including from SAG-AFTRA's own members.

In California, laws require companies relying on a digital replica of a performer (e.g., a cloned voice) to provide a description of the replica's intended use and to negotiate with the performer's legal counsel. They also require that entertainment employers obtain the consent of a deceased performer's estate before using a digital clone of that person.

Syed says PlayAI “guarantees” that each voice clone generated through its platform belongs exclusively to the creator. “This exclusivity is critical to protecting users’ creative rights,” he added.

The growing legal burden is one headwind for PlayAI. Another is competition. Papercup, Deepdub, Acapela, Respeecher, and Voice.ai, as well as big tech companies Amazon, Microsoft, and Google, offer AI dubbing and voice cloning tools. And the aforementioned ElevenLabs, one of the best-known voice cloning providers, is expected to raise new funding at a valuation of over $3 billion.

However, PlayAI has had no trouble finding investors. This month, the Y Combinator-backed company closed a $20 million seed round co-led by 500 Startups and Kindred Ventures, bringing its total capital raised to $21 million. Race Capital and 500 Global also participated.

“The new capital will be used to invest in our generative AI voice models and voice agent platform, reducing the time it takes companies to create human-quality voice experiences,” Syed said, adding that PlayAI plans to expand its workforce to 40 people.
