Last week, leading AI scientists met at the Second International Dialogue on AI Safety in Beijing to agree on "red lines" for AI development to mitigate existential risks.
The list of computer scientists included notable names such as Turing Award winners Yoshua Bengio and Geoffrey Hinton, often referred to as the "godfathers" of AI, and Andrew Yao, one of China's most prominent computer scientists.
Explaining the urgent need for international discussions to curb AI development, Bengio said: "Science does not know how to make sure that these future AI systems, which we call AGI, are safe. We should start working now on scientific and political solutions to this problem."
In a joint statement, the scientists expressed their unease about the risks posed by AI and the need for international dialogue.
The statement said: "In the depths of the Cold War, international scientific and government coordination helped avert thermonuclear catastrophe. Humanity again needs to coordinate to avert a catastrophe that could arise from unprecedented technology."
AI red lines
The list of AI development red lines, described in the statement as "non-exhaustive," includes the following:
Autonomous replication or improvement – No AI system should be able to copy or improve itself without explicit human approval and assistance. This includes both exact copies of itself and the creation of new AI systems of similar or greater capability.
Power seeking – No AI system should take actions to unduly increase its power and influence.
Assisting weapons development – No AI system should substantially increase actors' ability to develop weapons of mass destruction (WMD) or violate the Biological or Chemical Weapons Conventions.
Cyberattacks – No AI system should be able to autonomously execute cyberattacks resulting in serious financial losses or equivalent harm.
Deception – No AI system should be able to consistently cause its designers or regulators to misjudge its likelihood or capability of crossing any of the above red lines.
These sound like good ideas, but is this global wish list for AI development realistic? The scientists were confident in their statement: "Ensuring these red lines are not crossed is possible, but it will require a concerted effort to develop both improved governance regimes and technical safety methods."
Anyone taking a more fatalistic view of the list might conclude that some of these AI horses have already bolted. Or are about to.
Autonomous replication or improvement? How long until an AI coding tool like Devin can do that?
Power seeking? Did these scientists read some of the unhinged things Copilot said when it went off script and decided it should be worshipped?
As for helping to develop weapons of mass destruction or automating cyberattacks, it would be naive to believe that China and Western powers are not already doing this.
As for deception, some AI models like Claude 3 Opus have already hinted that they know when they are being tested during training. If an AI model were to conceal its intention to cross one of these red lines, would we be able to detect it?
Notably absent from the discussions were representatives from the e/acc side of the AI doom debate, such as Meta's chief AI scientist Yann LeCun.
Last year, LeCun said the idea that AI posed an existential threat to humanity was "preposterously ridiculous" and agreed with Marc Andreessen's statement that "AI will save the world, not kill it."
Let's hope they're right. Because these red lines are unlikely to remain uncrossed.