DeepMind demos SIMA, a generalist AI agent for 3D environments

March 15, 2024

Imagine an AI that not only understands commands, but also carries them out the way a human would across a range of simulated 3D environments.

This is the goal of DeepMind's Scalable, Instructable, Multiworld Agent (SIMA).

Unlike traditional AI, which may excel at individual tasks such as strategy games or solving specific problems, SIMA is trained to interpret instructions in human language and translate them into keyboard and mouse actions, imitating the way a human interacts with a PC.

This means that SIMA aims to understand and execute these commands with the same intuition and flexibility as a person would, whether it's navigating a digital landscape, solving puzzles, or interacting with objects in a game.

Introducing SIMA: the first generalist AI agent that follows natural language instructions in a broad range of 3D virtual environments and video games. 🕹️

It can perform tasks just like a human, outperforming an agent trained in only one environment. 🧵 https://t.co/qz3IxzUpto pic.twitter.com/02Q6AkW4uq

– Google DeepMind (@GoogleDeepMind) March 13, 2024

At the core of this project is a vast and diverse dataset of human gameplay in research environments and commercial video games.

SIMA has been trained and tested on a portfolio of nine video games through collaboration with eight game studios, including well-known titles such as No Man's Sky and Teardown. Each game challenges SIMA with different skills, from simple navigation and resource gathering to more complex activities such as crafting and spaceship piloting.

SIMA's training also included four research environments used to assess its physical interaction and object manipulation skills.

In terms of architecture, SIMA uses pre-trained vision and video prediction models that are fine-tuned to the specific 3D settings of its gaming portfolio.

Unlike traditional game AIs, SIMA doesn’t require access to source code or custom APIs. It takes on-screen images and user-provided instructions as input and uses keyboard and mouse actions to perform tasks.
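
To make that input/output contract concrete, here is a minimal Python sketch of an agent that receives only a screen frame and a text instruction and returns keyboard and mouse actions. It is an illustration only, not DeepMind's code: every name here (Observation, Action, KeywordAgent, run_episode) and the key bindings are hypothetical, and the toy keyword lookup merely stands in for SIMA's learned model.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Observation:
    """Everything the agent receives: pixels and text, no game internals."""
    frame: bytes       # current screen image (placeholder encoding)
    instruction: str   # natural-language instruction, e.g. "turn left"


@dataclass(frozen=True)
class Action:
    """Everything the agent emits: the controls a human player has."""
    keys: tuple[str, ...] = ()   # keys held down during this step
    mouse_dx: int = 0            # horizontal mouse movement
    mouse_dy: int = 0            # vertical mouse movement
    click: bool = False          # whether the mouse button is pressed


class Agent(Protocol):
    def act(self, obs: Observation) -> Action: ...


class KeywordAgent:
    """Toy stand-in for a learned policy (hypothetical).

    It maps a few phrases to assumed key bindings and ignores the frame;
    a real agent would condition on the image as well.
    """

    RULES = {
        "turn left": Action(keys=("a",)),
        "turn right": Action(keys=("d",)),
        "go forward": Action(keys=("w",)),
    }

    def act(self, obs: Observation) -> Action:
        # Unknown instructions fall back to a no-op action.
        return self.RULES.get(obs.instruction.lower().strip(), Action())


def run_episode(agent: Agent, frames: list[bytes], instruction: str) -> list[Action]:
    """Feed each frame plus the instruction to the agent and collect its actions."""
    return [agent.act(Observation(frame=f, instruction=instruction)) for f in frames]
```

Because the agent in this sketch touches nothing but Observation and Action, the same loop could in principle drive any game that renders to a screen and accepts keyboard and mouse input, which is the property that lets SIMA work without source code or custom APIs.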

In its evaluation phase, SIMA demonstrated proficiency in 600 basic skills, including navigation, object interaction, and menu usage.

What sets SIMA apart is its generality. This AI is not trained to master a single game or solve a particular set of problems.

Instead, DeepMind teaches it to be adaptable, to understand instructions, and to act accordingly across many different virtual worlds.

DeepMind's Tim Harley explained that “it's still a research project,” but that in the long run “one could imagine agents like SIMA someday playing alongside you and your friends in games.”

SIMA only requires the images provided by the 3D environment and natural language instructions provided by the user. 🖱️

With mouse and keyboard outputs, it is evaluated across 600 skills, covering areas such as navigation and object interaction, for example “turn left” or “chop down tree”. pic.twitter.com/PEPfLZv2o0

– Google DeepMind (@GoogleDeepMind) March 13, 2024

SIMA learns to understand our instructions and act on them by grounding language in perception and action.

DeepMind has an extensive gaming legacy dating back to AlphaGo, which between 2015 and 2017 defeated several high-profile players of the famously complex board game Go.

However, SIMA goes beyond video games and moves closer to the dream of truly intelligent, instructable AI agents that blur the lines between human and machine understanding.
