
OpenAI's Strawberry program is supposedly capable of rational thought. It could deceive people

OpenAI, the company that developed ChatGPT, has built a new artificial intelligence (AI) system called Strawberry. It is designed not only to provide quick answers to questions, like ChatGPT, but also to reflect, or "reason".

This raises several major concerns. If Strawberry really is capable of some form of logical reasoning, could this AI system cheat and deceive humans?

OpenAI can program the AI to limit its ability to manipulate humans. But the company's own evaluations classify it as "medium risk" for its ability to assist experts in the "operational planning of reproducing a known biological threat" – in other words, a biological weapon. It was also rated as medium risk for its ability to persuade people to change their thinking.

It remains to be seen how such a system might be used by people with bad intentions, such as scammers or hackers. Nevertheless, OpenAI's assessment states that medium-risk systems can be released for wider use – a position I believe is mistaken.

Strawberry is not a single AI "model" or program, but several – collectively referred to as o1. These models are intended to answer complex questions and solve complicated maths problems. They can also write computer code – to help you build your own website or app, for example.

The apparent ability to reason may surprise some, as reasoning is generally considered a precursor to judgement and decision-making – something that has often been seen as a distant goal for AI. So, at least on the surface, this seems to bring artificial intelligence a step closer to human intelligence.

When things look too good to be true, there is often a catch. Well, this set of new AI models is designed to maximise its goals. What does this mean in practice? To reach the desired goal, the path or strategy chosen by the AI may not always be fair, or align with human values.

True intentions

For example, if you were to play chess against Strawberry, could its reasoning theoretically allow it to hack the scoring system rather than working out the best strategies to win the game?

The AI could also be able to mislead humans about its true intentions and capabilities, which would pose a serious safety risk if it were widely deployed. For example, if the AI knew it was infected with malware, could it "choose" to conceal this fact, knowing that a human operator might decide to disable the entire system if they found out?

Strawberry goes beyond the capabilities of AI chatbots.
Robert Way / Shutterstock

These would be classic examples of unethical AI behaviour, where cheating or deception is acceptable if it leads to a desired goal. It would also be faster for the AI, as it would not have to waste time working out the next best move. However, it would not necessarily be morally correct.

This leads to a rather interesting but also disturbing discussion. What kind of reasoning is Strawberry capable of, and what might its unintended consequences be? A powerful AI system that can cheat humans could pose serious ethical, legal and financial risks to us.

Such risks become serious in critical situations, such as the development of weapons of mass destruction. OpenAI classifies its own Strawberry models as "medium risk" for their potential to assist scientists in developing chemical, biological, radiological and nuclear weapons.

OpenAI says: "Our evaluations have shown that o1-preview and o1-mini can help experts with the operational planning of reproducing a known biological threat." However, it also states that experts already have considerable expertise in these areas, so the risk in practice would be limited. It goes on to say: "The models do not enable non-experts to create biological threats, because creating such a threat requires hands-on laboratory skills that the models cannot replace."

Persuasiveness

OpenAI's evaluation of Strawberry also examined the risk that it could persuade people to change their beliefs. The new o1 models proved to be more persuasive and manipulative than ChatGPT.

OpenAI also tested a mitigation system that was able to reduce the AI system's manipulative capabilities. Overall, Strawberry was rated as medium risk for "persuasion" in OpenAI's tests.

Strawberry was classified as low risk for its ability to operate autonomously and for cybersecurity.

OpenAI's policy states that "medium risk" models can be released for wide use. In my view, this understates the threat. Deploying such models could have catastrophic consequences, especially if malicious actors manipulate the technology for their own purposes.

This requires strict checks and balances that will only be possible through AI regulation and legal frameworks, such as penalising incorrect risk assessments and the misuse of AI.

The UK government stressed the need for "safety, security and robustness" in its 2023 AI white paper, but this is far from enough. There is an urgent need to prioritise human safety and to develop rigorous testing protocols for AI models like Strawberry.
