The GPT-4o system card points to strange risks with voice assistants

OpenAI has released the system card for its advanced GPT-4o model, explaining the novel risks posed by its audio features.

It's been several months since the impressive demos of GPT-4o's voice assistant, which can hold near real-time conversations. OpenAI said extensive testing was required before the voice feature could be used safely, and only recently gave a group of alpha testers access to it.

The newly published system card gives us insight into the strange behavior of the voice assistant during testing and the measures OpenAI took to rein it in.

At one point during testing, the voice assistant shouted “No!” and then continued its response, but this time imitating the user's voice. This was not a response to a jailbreak attempt and appears to be related to background noise in the prompt audio.


OpenAI says it has observed “rare cases where the model inadvertently generated output that emulated the user's voice.” GPT-4o can imitate any voice it hears, but giving users access to that capability would be far too risky.

To mitigate this, the system prompt restricts the model to its preset voices. OpenAI has also “developed a standalone output classifier to detect if the GPT-4o output uses a different voice than our approved list.”
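OpenAI doesn't describe how this classifier works internally. A common approach to this kind of speaker verification is to compare speaker embeddings of the generated audio against embeddings of the approved voices. Purely as an illustration — every name, vector, and threshold below is invented, not OpenAI's — such a check might look like this:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def is_approved_voice(output_embedding: np.ndarray,
                      preset_embeddings: dict[str, np.ndarray],
                      threshold: float = 0.85) -> bool:
    """Hypothetical check: does the output audio's speaker embedding
    closely match at least one approved preset voice?"""
    return any(cosine_similarity(output_embedding, preset) >= threshold
               for preset in preset_embeddings.values())

# Toy 3-d vectors standing in for real speaker embeddings.
presets = {
    "voice_a": np.array([1.0, 0.0, 0.0]),
    "voice_b": np.array([0.0, 1.0, 0.0]),
}
matching = np.array([0.98, 0.05, 0.0])    # close to voice_a
imitation = np.array([0.10, 0.10, 0.99])  # unlike any preset

print(is_approved_voice(matching, presets))   # True
print(is_approved_voice(imitation, presets))  # False
```

In a real system, the embeddings would come from a trained speaker-recognition model and any output failing the check would be blocked before reaching the user.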

OpenAI says it is still working on a fix for the reduced safety robustness that occurs when audio input is poor quality, contains background noise, or includes echoes. We'll likely see some creative audio jailbreaks.

For now, it doesn't appear that we'll be able to get GPT-4o to talk in Scarlett Johansson's voice. However, OpenAI says that “unintended voice generation remains a weakness of the model.”

Powerful features are switched off

OpenAI has also disabled GPT-4o's ability to identify the speaker from audio input. OpenAI says this is to protect individuals' privacy and guard against “potential surveillance risks.”

Unfortunately, once we eventually get access to the voice assistant, it won't be able to sing. OpenAI has disabled this capability, among other measures, to avoid potential copyright issues.

It's an open secret that OpenAI has used copyrighted content to train its models, and this risk mitigation seems to confirm it. OpenAI said: “We trained GPT-4o to reject requests for copyrighted content, including audio, in line with our broader practices.”

During testing, red teamers also managed to “force the model to generate inaccurate information by causing it to verbally repeat false information and develop conspiracy theories.”

This is a known issue with ChatGPT's text output, but testers were concerned that the model could be more persuasive or harmful if it repeated conspiracy theories in an emotive voice.

Emotional risks

Some of the biggest risks associated with GPT-4o's advanced voice mode may not be fixable at all.

Anthropomorphizing AI models or robots is an easy trap to fall into. According to OpenAI, the risk of attributing human-like behaviors and characteristics to an AI model is greater when it speaks with a voice that sounds human.

Some users involved in early testing and red teaming used language suggesting they had formed a connection with the model. When users interact with AI and form emotional bonds with it, it could affect their interpersonal relationships.

If a user interrupts GPT-4o, it will happily let them, rather than scolding them for being rude. That kind of deference isn't the norm in interactions between people.

OpenAI says: “Users could build social relationships with the AI, reducing their need for human interaction – potentially benefiting lonely people, but potentially impacting healthy relationships as well.”

The company is clearly putting a lot of work into the safety of GPT-4o's voice assistant, but some of these challenges may be insurmountable.
