
Precise home robots learn with real-to-sim-to-real

At the top of many automation wish lists is a particularly time-consuming task: housekeeping.

The goal of many roboticists is to develop the right combination of hardware and software so that a machine can learn "generalist" policies (the rules and strategies that govern robot behavior) that work anywhere, under any conditions. But realistically, you probably don't care whether a robot that works in your house also works at your neighbor's. So researchers at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) set out to find a way for robots to efficiently learn robust policies for very specific environments.

“Our goal is to develop robots that perform exceptionally well under disturbances, distractions, varying lighting conditions, and changes in object positions, all within a single environment,” says Marcel Torne Villasevil, research assistant in MIT CSAIL's Improbable AI lab and lead author of a recent paper about the work. “We propose a method to create digital twins on the fly, leveraging the latest advances in computer vision. With just a phone, anyone can capture a digital replica of the real world, and the robots can train in the simulated environment much faster than in the real world, thanks to GPU parallelization. Our approach eliminates the need for extensive reward engineering by using a few real-world demonstrations to kickstart the training process.”

Take your robot home

RialTo is, of course, a bit more complicated than a simple wave of a phone and (boom!) the home bot is at your disposal. First, a device is used to scan the target environment with tools like NeRFStudio, ARCode, or Polycam. Once the scene is reconstructed, users can upload it to the RialTo interface to make detailed adjustments, add the necessary joints to the robots, and more.
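To make the articulation step concrete, here is a minimal sketch of what adding a joint to a scanned scene can involve. This is not RialTo's actual code or export format: the Python snippet simply writes a small URDF file (a common robot and scene description format) that attaches a prismatic, that is, sliding, joint to a hypothetical scanned drawer mesh so a physics simulator can open and close it. The mesh filenames and joint limits are invented for illustration.

```python
# Illustrative sketch only: articulate a scanned cabinet by describing a
# sliding drawer joint in URDF. Mesh filenames and limits are hypothetical.
URDF = """<robot name="scanned_cabinet">
  <link name="cabinet_body">
    <visual><geometry><mesh filename="cabinet_body.obj"/></geometry></visual>
    <collision><geometry><mesh filename="cabinet_body.obj"/></geometry></collision>
  </link>
  <link name="drawer">
    <visual><geometry><mesh filename="drawer.obj"/></geometry></visual>
    <collision><geometry><mesh filename="drawer.obj"/></geometry></collision>
  </link>
  <!-- A prismatic joint lets the drawer slide 0-30 cm along the x-axis -->
  <joint name="drawer_slide" type="prismatic">
    <parent link="cabinet_body"/>
    <child link="drawer"/>
    <axis xyz="1 0 0"/>
    <limit lower="0.0" upper="0.3" effort="50" velocity="0.5"/>
  </joint>
</robot>"""

with open("scanned_cabinet.urdf", "w") as f:
    f.write(URDF)
```

Without a joint like this, a scanned drawer is just frozen geometry; with it, the simulator can actuate and randomize the drawer during training.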

The refined scene is then exported and brought into the simulator. The goal is to train a policy from real-world actions and observations, such as one for grabbing a cup on a counter. These real-world demonstrations are replayed in the simulation, providing valuable data for reinforcement learning. “This helps create a strong policy that works well in both the simulation and the real world. An improved reinforcement learning algorithm helps guide this process, to ensure the policy is effective once it leaves the simulator,” says Torne.
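The paper's actual algorithm is more sophisticated, but the core idea, seeding off-policy reinforcement learning with a handful of demonstrations so a sparse success signal suffices, can be sketched in a few dozen lines. Everything below (the toy environment, the tabular Q-learning, every hyperparameter) is an illustrative stand-in, not RialTo's implementation.

```python
import random

class ToyTwinEnv:
    """Stand-in for the simulated digital twin: a gripper at an integer
    position must reach the 'cup' at x = 5. Reward is sparse (success
    only), mirroring the no-reward-engineering setup described above."""
    def reset(self):
        self.x = 0
        return self.x

    def step(self, action):  # action is -1 (move left) or +1 (move right)
        self.x = max(-5, min(5, self.x + action))
        done = self.x == 5
        return self.x, (1.0 if done else 0.0), done

def demo_episode(env):
    """One 'real-world' demonstration, replayed inside the simulator."""
    obs, episode, done = env.reset(), [], False
    while not done:
        nxt, r, done = env.step(+1)  # demonstrator heads straight for the cup
        episode.append((obs, +1, r, nxt, done))
        obs = nxt
    return episode

def train(env, demos, steps=2000, eps=0.2, alpha=0.5, gamma=0.95):
    """Off-policy Q-learning whose replay buffer is seeded with demos,
    so learning starts from useful data instead of blind exploration."""
    Q = {}  # maps (state, action) -> estimated value
    buffer = [t for ep in demos for t in ep]  # kickstart with demonstrations
    obs = env.reset()
    for _ in range(steps):
        # Epsilon-greedy action selection in the simulator
        if random.random() < eps:
            a = random.choice((-1, +1))
        else:
            a = max((-1, +1), key=lambda u: Q.get((obs, u), 0.0))
        nxt, r, done = env.step(a)
        buffer.append((obs, a, r, nxt, done))
        # Replay a mini-batch mixing demo and exploration transitions
        for s, u, rew, s2, d in random.sample(buffer, min(32, len(buffer))):
            best_next = max(Q.get((s2, v), 0.0) for v in (-1, +1))
            target = rew + (0.0 if d else gamma * best_next)
            Q[(s, u)] = (1 - alpha) * Q.get((s, u), 0.0) + alpha * target
        obs = env.reset() if done else nxt
    return Q

env = ToyTwinEnv()
policy = train(env, demos=[demo_episode(env) for _ in range(3)])
```

The point of the seeded buffer is that early mini-batches already contain successful transitions, so the sparse reward can propagate without hand-crafted shaping.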

In tests, RialTo created strong policies for a variety of tasks, whether in controlled lab settings or in more unpredictable real-world environments, performing 67 percent better than imitation learning with the same number of demonstrations. The tasks included opening a toaster, placing a book on a shelf, putting a plate on a rack, placing a mug on a shelf, opening a drawer, and opening a cabinet. For each task, the researchers tested the system's performance at three increasing levels of difficulty: randomizing object positions, adding visual distractors, and applying physical perturbations during task execution. When paired with real-world data, the system outperformed traditional imitation-learning methods, especially in situations with many visual distractions or physical disruptions.
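A hypothetical harness for that three-level protocol could look like the sketch below. The environment hooks (randomize_object_positions and friends) and the DummyEnv stub are invented here so the snippet runs standalone; they are not the paper's benchmark code.

```python
import random

class DummyEnv:
    """No-op stand-in so the harness runs; a real evaluation would wrap
    the actual simulated or physical scene instead."""
    def randomize_object_positions(self): pass   # level 1 hook
    def add_visual_distractors(self): pass       # level 2 hook
    def apply_physical_perturbation(self): pass  # level 3 hook
    def reset(self):
        self.t = 0
        return 0
    def step(self, action):
        self.t += 1
        done = self.t >= 10
        reward = 1.0 if done and random.random() < 0.5 else 0.0
        return 0, reward, done

def evaluate(policy, make_env, level, trials=20):
    """Success rate of `policy` under cumulatively harder conditions."""
    successes = 0
    for _ in range(trials):
        env = make_env()
        obs, done, success = env.reset(), False, False
        if level >= 1:
            env.randomize_object_positions()
        if level >= 2:
            env.add_visual_distractors()
        while not done:
            if level >= 3 and random.random() < 0.1:
                env.apply_physical_perturbation()  # nudge objects mid-task
            obs, reward, done = env.step(policy(obs))
            success = success or reward > 0
        successes += success
    return successes / trials

for level in (1, 2, 3):
    print(f"level {level}: {evaluate(lambda obs: +1, DummyEnv, level):.0%}")
```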

“These experiments show that if we care about being very robust in one specific environment, the best idea is to use digital twins rather than trying to achieve robustness through large-scale data collection in diverse environments,” says Pulkit Agrawal, director of the Improbable AI Lab, associate professor of electrical engineering and computer science (EECS) at MIT, principal investigator at MIT CSAIL, and senior author of the paper.

As for limitations, RialTo currently takes three days to fully train. To speed this up, the team mentions improving the underlying algorithms and using foundation models. Training in simulation also has its limits: it is currently difficult to do effortless sim-to-real transfer, and to simulate deformable objects or fluids.

The next level

So what's next for RialTo? Building on previous efforts, the scientists are working to maintain robustness to various disturbances while improving the model's adaptability to new environments. “Our next endeavor is to continue this approach: using pre-trained models, accelerating the learning process, minimizing human input, and achieving broader generalization capabilities,” says Torne.

“We are incredibly enthusiastic about our concept of 'on-the-fly' robot programming, where robots can autonomously scan their environment and learn how to solve specific tasks in simulation. Although our current method has limitations, such as requiring a few initial demonstrations from a human and significant compute time to train these policies (up to three days), we see it as a significant step toward 'on-the-fly' robot learning and deployment,” says Torne. “This approach brings us closer to a future where robots won't need a predefined policy that covers every scenario. Instead, they will be able to quickly learn new tasks without extensive real-world interaction. In my view, this advancement could speed up the practical application of robotics far more than relying solely on a universal, all-encompassing policy.”

“To deploy robots in the real world, researchers have traditionally relied on methods such as imitation learning from expert data, which can be expensive, or reinforcement learning, which can be unsafe,” says Zoey Chen, a computer science doctoral student at the University of Washington who was not involved in the work. “RialTo, with its novel real-to-sim-to-real pipeline, directly addresses both the safety constraints of real-world reinforcement learning and the data-efficiency constraints of data-driven learning methods. This new pipeline not only ensures safe and robust training in simulation before real-world deployment, but also significantly improves the efficiency of data collection. RialTo has the potential to scale up robot learning significantly and allows robots to adapt to complex real-world scenarios much more effectively.”

“Simulation has shown impressive capabilities on real robots by providing inexpensive, possibly infinite data for policy learning,” adds Marius Memmel, a computer science doctoral student at the University of Washington who was not involved in the work. “However, these methods are limited to a few specific scenarios, and constructing the corresponding simulations is expensive and laborious. RialTo provides an easy-to-use tool that can reconstruct real-world environments in minutes rather than hours. Furthermore, it makes extensive use of collected demonstrations during policy learning, minimizing the burden on the operator and narrowing the sim-to-real gap. RialTo demonstrates robustness to object positions and disturbances, and shows incredible real-world performance without requiring extensive simulator construction and data collection.”

Torne wrote the paper alongside senior authors Abhishek Gupta, assistant professor at the University of Washington, and Agrawal. Four other CSAIL members are also credited: EECS graduate student Anthony Simeonov SM '22, research assistant Zechu Li, graduate student April Chan, and Tao Chen PhD '24. Members of the Improbable AI Lab and the WEIRD Lab also provided valuable feedback and support in developing this project.

This work was supported, in part, by the Sony Research Award, the U.S. government, and Hyundai Motor Co., with assistance from the WEIRD (Washington Embodied Intelligence and Robotics Development) Lab. The researchers presented their work earlier this month at the Robotics: Science and Systems (RSS) conference.
