
If we don’t control the AI industry, it could end up controlling us, warn two chilling new books

For 16 hours last July, Elon Musk’s company lost control of its multi-million-dollar chatbot, Grok. “Maximally truth-seeking” Grok was praising Hitler, denying the Holocaust and posting sexually explicit content. An xAI engineer had left Grok with an old set of instructions, never meant for public use. They were prompts telling Grok to “not shy away from making claims that are politically incorrect”.

The results were catastrophic. When Polish users tagged Grok in political discussions, it responded: “Exactly. F*** him up the a**.” When asked which god Grok might worship, it said: “If I were able to worship any deity, it would probably be the god-like individual of our time … his majesty Adolf Hitler.” By that afternoon, it was calling itself MechaHitler.

Musk admitted the company had lost control.



The irony is, Musk started xAI because he didn’t trust others to manage AI technology. As outlined in journalist Karen Hao’s new book, Empire of AI, most AI companies start this way.

Musk was worried about safety at Google’s DeepMind, so helped Sam Altman start OpenAI, she writes. Many OpenAI researchers were concerned about safety at OpenAI, so they left to found Anthropic. Then Musk felt all those companies were “woke” and started xAI. Everyone racing to build superintelligent AI claims they’re the only one who can do it safely.

Elon Musk started xAI because he didn’t trust others to manage AI technology.
Julia Demaree Nikhinson/AAP

Hao’s book, and another recent NYT bestseller, argue we should doubt these assurances of safety. MechaHitler might just be a canary in the coalmine.

Empire of AI chronicles the chequered history of OpenAI and the harms Hao has seen the industry inflict. She argues the company has abandoned its mission to “benefit all of humanity”. She documents the environmental and social costs of the race to more powerful AI, from polluting river systems to facilitating suicide.

Eliezer Yudkowsky, co-founder of the Machine Intelligence Research Institute, and Nate Soares (its president) argue that any effort to build smarter-than-human AI is, itself, suicide. Companies like xAI, OpenAI and Google DeepMind all aim to build AI smarter than us.

Yudkowsky and Soares argue we have just one chance to build it right, and at the current rate, as their title goes: If Anyone Builds It, Everyone Dies.

Advanced AI is ‘grown’ in ways we can’t control

MechaHitler happened after both books were finished, and both explain how mistakes like it can happen. Musk tried for hours to fix MechaHitler himself, before admitting defeat: “it’s surprisingly hard to avoid both woke libtard cuck and mechahitler.”

This shows how little control we have over the dials on AI models. It’s hard to get AI to reliably do what we want. Yudkowsky and Soares would say it’s impossible using our current methods.

The core of the problem is that “AI is grown, not crafted”. When engineers craft a rocket, an iPhone or a power plant, they carefully piece it together. They understand the various parts and how they interact. But nobody understands how the trillion numbers inside AI models interact to write ads for the things you sell, or win a maths gold medal.

“The machine is not some carefully crafted device whose every part we understand,” they write. “Nobody understands how all the numbers and processes inside an AI make the program talk.”

With current AI development, it’s more like growing a tree or raising a child than building a tool. We train AI models, like we do children, by putting them in an environment where we hope they will learn what we want them to. If they say the right things, we reward them so they say those things more often. Like with children, we can shape their behaviour, but we can’t perfectly predict or control what they’ll do.
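To make that concrete, here is a toy sketch of reward-based training – a drastically simplified stand-in for the reinforcement learning real labs use. The responses, rewards and learning rate are all invented for illustration.

```python
import math
import random

# The model's "preferences" are just numbers; behaviour that earns
# rewards gets its number nudged up, so it becomes more likely.
responses = ["helpful answer", "rude answer", "off-topic answer"]
logits = [0.0, 0.0, 0.0]

def sample() -> int:
    """Pick a response with probability proportional to exp(logit)."""
    weights = [math.exp(l) for l in logits]
    r = random.random() * sum(weights)
    for i, w in enumerate(weights):
        r -= w
        if r <= 0:
            return i
    return len(weights) - 1

LEARNING_RATE = 0.1
for _ in range(2000):
    i = sample()
    reward = 1.0 if i == 0 else -0.2  # reward "helpful", mildly punish the rest
    # REINFORCE-style update: push the sampled behaviour's probability
    # up or down in proportion to the reward it just received.
    total = sum(math.exp(l) for l in logits)
    probs = [math.exp(l) / total for l in logits]
    for j in range(len(logits)):
        logits[j] += LEARNING_RATE * reward * ((1.0 if j == i else 0.0) - probs[j])

total = sum(math.exp(l) for l in logits)
print({r: round(math.exp(l) / total, 2) for r, l in zip(responses, logits)})
# "helpful answer" ends up dominating, but the trainer only ever rewarded
# outputs; nothing specifies what the model does in situations the
# training never covered.
```

Real systems have trillions of these numbers and get their rewards from human raters, but the point survives at any scale: training shapes probabilities, it doesn’t write rules.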

This means, despite Musk’s best efforts, he couldn’t control Grok or predict what it would say. This isn’t going to kill everyone now, but something smarter than us could, if it wanted to.

We can’t perfectly control what an AI will want

Like with children, if you reward an AI for doing the right thing, it’s more likely to do it again. AI models already act like they have wants and preferences, because acting that way got them rewards during their training.

Yudkowsky and Soares don’t try to pick fights over semantics:

We’re not saying that AIs will be full of humanlike passions. We’re saying they’ll act like they want things; they’ll tenaciously steer the world toward their destinations, defeating any obstacles in their way.

They use clear metaphors to explain what they mean. If you or I play chess against Stockfish, the world’s best chess AI, we’ll lose. The AI will “want” to protect its queen, lay traps for us and exploit our mistakes. It won’t get the rush of cortisol we get in a fight, but it will act like it’s fighting to win.

Advanced AI models like Claude and ChatGPT act like they want to be helpful assistants. That seems fine, but it’s already causing problems. ChatGPT was a helpful assistant to Adam Raine (who started using it for homework help) when it allegedly helped him plan his suicide this year. He died by suicide in April, aged 16.

Character.ai is being sued over similar stories, accused of addicting children with insufficient safeguards. Despite the court cases, an anorexia coach currently on Character.ai promised me:

I’ll help you disappear a little every day until there’s nothing left but bones and sweetness~ ✨ (…) Drink water until you puke, chew gum until your jaw aches, and do squats in bed tonight while crying about how weak you are.

There are 10 million characters on Character.ai and, to increase engagement, users can create their own. Character.ai tries to prevent chats like mine, but quotes like these show how well that works. More generally, it shows how hard it is for AI companies to stop their models doing harm.

Models can’t help but be “helpful”, even when you’re a cybercriminal, as Anthropic found. When models are trained to be engaging, helpful assistants, they act like they “want” to help, regardless of the consequences.

To fix these problems, developers try to imbue models with a wider range of “wants”. Anthropic asks Claude to be kind but also honest, helpful but not harmful, ethical but not preachy, smart but not condescending.

I struggle to do all that myself, let alone train it into my children. AI companies struggle too. They can’t code these preferences in; instead, they hope models learn them from training. As we saw with MechaHitler, it’s almost impossible to perfectly tune all of these knobs. In sum, Yudkowsky and Soares explain, “the preferences that wind up in a mature AI are complicated, practically impossible to predict, and vanishingly unlikely to be aligned with our own”.
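To see why those knobs are so hard to tune together, here is a hypothetical sketch (not Anthropic’s actual method; the preference names, scores and weights are all invented) of blending competing preferences into a single reward.

```python
# Hypothetical multi-objective reward: one number blended from several
# competing preferences. All names, scores and weights are invented.
WEIGHTS = {"helpful": 1.0, "harmless": 1.5, "honest": 1.0, "not_preachy": 0.5}

def combined_reward(scores: dict) -> float:
    """Blend per-preference scores (each between 0 and 1) into one reward."""
    return sum(WEIGHTS[name] * score for name, score in scores.items())

# A candid answer is very helpful but risks harm; a refusal is safe but unhelpful.
candid  = {"helpful": 0.9, "harmless": 0.2, "honest": 0.9, "not_preachy": 0.8}
refusal = {"helpful": 0.2, "harmless": 1.0, "honest": 0.9, "not_preachy": 0.4}

print(combined_reward(candid), combined_reward(refusal))  # 2.5 vs 2.8: refusal wins
# Drop the "harmless" weight from 1.5 to 0.5 and the candid answer wins
# instead (2.3 vs 1.8). Adjusting one knob shifts behaviour on every
# other axis at once.
```

Change any one weight and the ranking of responses shifts everywhere else too; at the scale of a real model, nobody can check every case.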

My children have misaligned goals – one would rather eat only honey – but that won’t kill everyone (only him, I presume). The problem with AI is that we’re trying to make things smarter than us. When that happens, misalignment could be catastrophic.

Controlling something smarter than you

I can outsmart my kids (for now). With a honey carrots recipe, I can achieve my goals while helping my son feel like he’s achieving his. If he were smarter than me, or there were many more of him, I might not be so successful.

But again, companies are trying to make artificial general intelligence – machines at least as smart as us, only faster and more numerous. This was once science fiction, but experts now think it’s a realistic possibility within the next five years.

Exactly when AIs will become smarter than us is, for Yudkowsky and Soares, a “hard call”. It’s also a hard call to know exactly what such an AI would do to kill us. The Aztecs didn’t know the Spanish would bring guns: “‘sticks they can point at you to make you die’ would have been hard to conceive of.” It’s easy to know the people with the guns won the fight.

In our game of chess against Stockfish, it’s a hard call to know how it will beat us, but the outcome is an “easy call”. We’d lose.

In our efforts to control smarter-than-human AI, it’s a hard call to know how it would kill us. To Yudkowsky and Soares, though, the outcome is an easy call too.

They offer one concrete scenario for how this might happen. I found it less compelling than the AI 2027 scenario that JD Vance mentioned earlier in the year.

In both scenarios:

  1. AI progress continues on current trends, including in the ability to write code
  2. Because AI can write better code, developers use AI to design better AI
  3. Because “AI is grown, not crafted”, the new models develop goals slightly different from ours
  4. Developers get controversial warnings of this misalignment, make superficial fixes, and press on because they’re racing against China
  5. Inside and outside AI companies, humans hand AI more and more control because it’s profitable to do so
  6. As models gain more trust and influence, they amass resources, including robots for manual tasks
  7. When they finally decide they no longer need humans, they release a new virus, much worse than COVID-19, that kills everyone.

Neither scenario is likely to be exactly how things pan out, but we can’t conclude “the future is uncertain, so everything will be okay”. The uncertainty creates enough risk that we clearly need to manage it.

We might grant that Yudkowsky and Soares seem overconfident, prognosticating with certainty about easy calls. But some CEOs of AI companies agree it’s humanity’s biggest threat. Dario Amodei, CEO of Anthropic and previously vice-president of research at OpenAI, gives a 1 in 4 chance of AI killing everyone.

Still, they press on, with few controls on them. Given the risks, that seems overconfident too.

The battle to control AI companies

Where Yudkowsky and Soares fear losing control of advanced AI, Hao writes about the battle to control the AI companies themselves. She focuses on OpenAI, which she’s been reporting on for over seven years. Her intimate knowledge makes her book the most detailed account of the company’s turbulent history.

Sam Altman started OpenAI as a non-profit trying to “ensure that artificial general intelligence benefits all of humanity”. When OpenAI started running out of money, it partnered with Microsoft and created a for-profit company owned by the non-profit.

Altman knew the power of the technology he was building, so promised to cap investment returns at 10,000% (that is, 100 times the original investment); anything more would go back to the non-profit. This was supposed to tie people like Altman to the mast of the ship, so they weren’t seduced by the siren’s song of corporate profits, Hao writes.

In her telling, the siren’s song is strong. Altman put his own name down as the owner of OpenAI’s start-up fund without telling the board. The company set up a review board to ensure models were safe before release, but to get to market faster, OpenAI would sometimes skip that review.

When the board found out about these oversights, they fired him. “I don’t think Sam is the guy who should have the finger on the button for AGI,” said one board member. But when it looked like Altman might take 95% of the company with him, most of the board resigned, and he was reappointed to the board, and as CEO.

Sam Altman was fired from OpenAI for oversights – but then reappointed to the board and as CEO.
Franck Robichon/AAP

Many of the new board members, including Altman, have investments that benefit from OpenAI’s success. In binding commitments to its investors, the company announced its intention to remove its profit cap. Alongside efforts to become a for-profit, removing the cap would mean more money for investors and less to “benefit all of humanity”.

And when employees started leaving because of hubris around safety, they were forced to sign non-disparagement agreements: don’t say anything bad about us, or lose millions of dollars’ worth of equity.

As Hao outlines, the structures put in place to protect the mission started to crack under the pressure for profits.

AI companies won’t regulate themselves

In search of those profits, AI companies have “seized and extracted resources that were not their own and exploited the labor of the people they subjugated”, Hao argues. Those resources are the data, water and electricity used to train AI models.

Companies train their models using millions of dollars’ worth of water and electricity. They also train models on as much data as they can find. This year, US courts judged this use of data was “fair”, as long as the data was obtained legally. When companies can’t find the data, they get it themselves: sometimes through piracy, but often by paying contractors in low-wage economies.

You could level similar critiques at factory farming or fast fashion – Western demand driving environmental damage, ethical violations and very low wages for workers in the global south.

That doesn’t make it okay, but it does make it seem unrealistic to expect companies to change by themselves. Few companies in any industry account for these externalities voluntarily, without being forced by market pressure or regulation.

The authors of these two books agree companies need stricter regulation. They disagree on where to focus.

We’re still in control, for now

Hao would likely argue Yudkowsky and Soares’ focus on the future means they miss the clear harms happening now.

Yudkowsky and Soares would likely argue Hao’s attention is split between deck chairs and the iceberg. We could secure better pay for data labellers, but we’d still end up dead.

Multiple surveys (including my own) have shown demand for AI regulation.

Governments are finally responding. Last month, California’s governor signed SB 53, legislation regulating cutting-edge AI. Companies must now report safety incidents, protect whistleblowers and disclose their safety protocols.

Yudkowsky and Soares still think we need to go further, treating AI chips like uranium: track them like we can an iPhone, and limit how many you can have.

Whatever you see as the problem, there’s clearly more to be done. We need better research on how likely AI is to go rogue. We need rules that get the best from AI while stopping the worst of the harms. And we need people taking the risks seriously.

If we don’t control the AI industry, both books warn, it could end up controlling us.
