Resemble AI's next-generation AI audio detection model, Detect-2B, has an accuracy of 94%

Voice cloning company Resemble AI has released the next generation of its deepfake detection model, which has an accuracy of about 94%.

Detect-2B uses a series of pre-trained submodels and fine-tuning to look at an audio clip and determine whether it was generated using AI.

“Building on the strong foundation of our original Detect model, DETECT-2B represents a significant advancement in terms of model architecture, training data and overall performance. The result is a highly robust and accurate deepfake detection model that achieves remarkable levels of performance when evaluated on a large dataset of real and fake audio clips,” the company said in a blog post.

According to Resemble, Detect-2B's submodels “consist of a frozen audio representation model with an adaptation module inserted into key layers.” The adaptation module directs the models' focus to artifacts (the random noises left in a recording) that usually distinguish real audio from fake; most AI-generated audio clips can sound “too clean.” Detect-2B can predict how much of an audio clip is AI-generated without the model having to be retrained each time it hears a new clip. The submodels are also trained on large datasets.
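The "frozen model plus adaptation module" idea can be illustrated with a toy example. This is only a minimal sketch and not Resemble's implementation: the class names `FrozenLayer` and `Adapter`, the bottleneck design, the residual connection, and all weights are assumptions chosen for illustration. It shows the general pattern in which the pre-trained weights stay fixed and only a small added module changes.

```python
def matvec(matrix, vector):
    """Multiply a matrix (sequence of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class FrozenLayer:
    """Hypothetical stand-in for one layer of a pre-trained model; weights never change."""

    def __init__(self, weights):
        self.weights = tuple(tuple(row) for row in weights)  # immutable copy

    def __call__(self, x):
        return matvec(self.weights, x)


class Adapter:
    """Hypothetical small trainable bottleneck applied to a frozen layer's output."""

    def __init__(self, dim, bottleneck):
        self.down = [[0.5] * dim for _ in range(bottleneck)]  # dim -> bottleneck
        self.up = [[0.0] * bottleneck for _ in range(dim)]    # bottleneck -> dim

    def __call__(self, hidden):
        squeezed = [max(0.0, v) for v in matvec(self.down, hidden)]  # ReLU
        delta = matvec(self.up, squeezed)
        return [h + d for h, d in zip(hidden, delta)]  # residual connection


layer = FrozenLayer([[1.0, 0.0], [0.0, 1.0]])
adapter = Adapter(dim=2, bottleneck=1)
features = [2.0, 3.0]

before = adapter(layer(features))  # zero-initialised `up`: adapter changes nothing yet
adapter.up = [[1.0], [0.0]]        # pretend a training step updated only the adapter
after = adapter(layer(features))
print(before, after)
```

In this sketch, fine-tuning would touch only `adapter.down` and `adapter.up`, which is one reason adapter-style designs tend to be cheaper to train than updating a whole model.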

Detect-2B aggregates its prediction scores and compares them to a “carefully tuned threshold” before deciding whether a recording is real or fake. Resemble said the way the researchers structured Detect-2B allows for rapid training and means the model does not require as much computing power to deploy.
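The aggregate-then-threshold step described above might look roughly like the following. This is a sketch under stated assumptions: the article does not say how Detect-2B combines its scores or what the threshold is, so the simple average, the function name `classify_clip`, the 0.5 threshold, and the example scores are all hypothetical.

```python
from statistics import mean


def classify_clip(submodel_scores, threshold=0.5):
    """Average per-submodel 'fake' scores and compare the result to a threshold.

    Both the averaging rule and the default threshold are illustrative assumptions.
    """
    if not submodel_scores:
        raise ValueError("need at least one submodel score")
    aggregate = mean(submodel_scores)
    label = "fake" if aggregate >= threshold else "real"
    return aggregate, label


# Made-up scores from three hypothetical submodels for one clip.
aggregate, label = classify_clip([0.75, 0.5, 1.0], threshold=0.5)
print(label, aggregate)
```

Moving the threshold trades false alarms against missed fakes, which is presumably why a detector's threshold needs tuning on held-out data.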

Stochastic architectures make working with audio signals easier

The model's architecture is based on Mamba-SSM, or state space models, which do not rely on static data or recurring patterns. Instead, it uses a stochastic, or random, probability model that responds better to different variables. According to Resemble, this type of architecture works well in audio detection because it captures different dynamics in an audio clip, adapts between states of an audio signal, and continues to work even when the recording is of poor quality.
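For readers unfamiliar with state space models, the core recurrence can be sketched in a few lines. This is a plain one-dimensional linear state space model, not Mamba itself and not Detect-2B's architecture; the function name `run_ssm` and the parameter values `a`, `b`, and `c` are illustrative assumptions. The hidden state carries a summary of past inputs forward as each new audio sample arrives.

```python
def run_ssm(inputs, a=0.5, b=1.0, c=1.0):
    """Run a scalar discrete state space model over a sequence.

    state_t  = a * state_{t-1} + b * x_t
    output_t = c * state_t
    """
    state = 0.0
    outputs = []
    for x in inputs:
        state = a * state + b * x   # update the hidden state with the new sample
        outputs.append(c * state)   # read the output off the hidden state
    return outputs


# A single impulse decays geometrically through the hidden state.
print(run_ssm([1.0, 0.0, 0.0]))  # [1.0, 0.5, 0.25]
```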

To evaluate the model, Resemble subjected Detect-2B to a test that included unseen speakers, deepfake-generated audio, and various languages. The company said the model accurately detected deepfake audio in six different languages with at least 93% accuracy.
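A per-language accuracy figure of this kind is typically computed by grouping test clips by language and counting correct predictions. The sketch below shows that calculation only; the function name `accuracy_by_language`, the record layout, and the four toy records are invented for illustration and are not Resemble's evaluation data or results.

```python
from collections import defaultdict


def accuracy_by_language(records):
    """Return {language: fraction of clips whose predicted label matches the true label}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for language, true_label, predicted_label in records:
        total[language] += 1
        if true_label == predicted_label:
            correct[language] += 1
    return {language: correct[language] / total[language] for language in total}


# Toy records: (language, true label, predicted label). Purely illustrative.
toy_records = [
    ("en", "fake", "fake"),
    ("en", "real", "real"),
    ("es", "fake", "real"),
    ("es", "real", "real"),
]
print(accuracy_by_language(toy_records))
```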

Resemble introduced its AI speech platform Rapid Voice Cloning in April. Detect-2B will be available via an API and can be integrated into various applications.

Detecting deepfakes has become more important

In the run-up to the 2024 US presidential election, identifying AI-generated voices or videos is gaining importance. AI voices could make it easier to mislead voters and spread misinformation. Concerns about AI deepfakes, whether it's faking a politician's voice, spoofing a celebrity in a song, or just using AI to illustrate something, have undermined trust in brands.

Tools like Detect-2B could go a long way toward identifying and detecting deepfakes before they reach the general public. Of course, Resemble is not the only company working on detecting AI clones. McAfee launched Project Mockingbird in January to detect AI audio. Meta, on the other hand, is developing a method to add watermarks to AI-generated audio.

“But our work is far from over. As the capabilities of generative AI continue to advance, our detection capabilities must also evolve. We have several exciting research directions planned to further improve DETECT-2B, focusing on areas such as representation learning, advanced model architectures, and data augmentation,” said Resemble.
