Imagine paying an enormous amount for a robot to perform household tasks, only to find that it cannot carry them out. How would you feel? For instance, if you ask it to pick up a mug from your kitchen table, it might not recognize your mug, because the robot was trained in a factory on a certain set of tasks and has never seen the items in your home before.
To address this, Peng and her colleagues at the Massachusetts Institute of Technology (MIT), together with researchers from New York University and the University of California at Berkeley, created a framework that enables humans to quickly teach a robot what they want it to do with minimal effort.
Joining Peng as co-researchers are Aviv Netanyahu, an EECS graduate student; Mark Ho, an assistant professor at the Stevens Institute of Technology; Tianmin Shu, an MIT postdoc; Andreea Bobu, a graduate student at UC Berkeley; and senior authors Julie Shah, an MIT professor of aeronautics and astronautics and the director of the Interactive Robotics Group in the Computer Science and Artificial Intelligence Laboratory (CSAIL), and Pulkit Agrawal, a professor in CSAIL.
Data Augmentation: A New Technique to Train a Robot That Failed to Complete a Task
The researchers developed a new technique, called data augmentation, that involves tweaking a machine-learning model that has already been trained to perform one task so that it can perform a second, similar task. They tested the technique in simulations and found that it could teach a robot more efficiently than other methods.
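As a rough illustration of what tweaking an already-trained model means in practice, the sketch below freezes most of a small pretrained policy network and retrains only its output layer on a handful of new-task examples. The names (Policy, fine_tune) and the random stand-in data are illustrative assumptions, not the researchers' actual system.

```python
# A minimal fine-tuning sketch in PyTorch, assuming a hypothetical pretrained
# policy network and a small batch of demonstrations for the new, similar task.
import torch
import torch.nn as nn

class Policy(nn.Module):
    """Toy policy: maps a flat observation to an action vector."""
    def __init__(self, obs_dim: int = 64, act_dim: int = 4):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU())
        self.head = nn.Linear(128, act_dim)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.head(self.backbone(obs))

def fine_tune(policy: Policy, obs: torch.Tensor, actions: torch.Tensor,
              epochs: int = 50, lr: float = 1e-3) -> None:
    """Adapt a pretrained policy to a new task using a handful of examples."""
    # Freeze the features learned on the original task; only the head adapts.
    for p in policy.backbone.parameters():
        p.requires_grad = False
    optimizer = torch.optim.Adam(policy.head.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(policy(obs), actions)
        loss.backward()
        optimizer.step()

# Usage with random stand-in data (a real system would use new-task demos).
policy = Policy()                 # pretend this was pretrained in the factory
new_obs = torch.randn(16, 64)     # 16 observations from the user's home
new_actions = torch.randn(16, 4)  # the corresponding demonstrated actions
fine_tune(policy, new_obs, new_actions)
```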
Without requiring a user to possess technical knowledge, this approach could help robots learn more quickly in unfamiliar environments. It could eventually pave the way for general-purpose robots to efficiently carry out daily tasks for the elderly or people with disabilities in a variety of settings.
Imitation learning is one method for retraining a robot to perform a certain task: the user demonstrates the correct behavior to show the robot what to do. But if a user teaches a robot to pick up a mug by demonstrating only with a white mug, the robot might conclude that all mugs are white. It might then fail to pick up a red, blue, or "Tim-the-Beaver-brown" mug.
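The sketch below shows imitation learning in its simplest form, behavioral cloning: a policy is trained to reproduce the user's demonstrated actions. The feature encoding, including a final "mug color" feature that is zero in every demonstration, is an illustrative assumption meant to show how a single-color demo can bake a spurious cue into the policy.

```python
# A minimal behavioral-cloning sketch, not the researchers' code. The feature
# layout (last feature = mug color, 0.0 = white) is purely illustrative.
import torch
import torch.nn as nn

# Each demonstration pairs an observation with the action the user took.
# Because every demo uses a white mug, the color feature is always 0.0, and
# the learned policy is free to latch onto color as if it mattered.
demo_obs = torch.cat([torch.randn(32, 7), torch.zeros(32, 1)], dim=1)
demo_actions = torch.randint(0, 3, (32,))   # e.g., 0=reach, 1=grasp, 2=lift

policy = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 3))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Behavioral cloning: supervised learning that imitates the demonstrated actions.
for _ in range(200):
    optimizer.zero_grad()
    loss = loss_fn(policy(demo_obs), demo_actions)
    loss.backward()
    optimizer.step()
```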
To address this, the researchers' system determines which particular object the user cares about (a mug) and which visual concepts are not essential to the task (for example, the mug's color may not be relevant). It uses this information to generate new, synthetic data by altering these "unimportant" visual concepts, a method referred to as data augmentation.
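A minimal sketch of that augmentation step, under the same assumed observation layout (last feature encodes mug color), might look like this: the demonstrations are replicated while only the color feature is resampled, and everything task-relevant is left untouched.

```python
# A minimal data-augmentation sketch, not the researchers' code. The assumption
# that the last observation feature encodes mug color is illustrative only.
import torch

# One set of demonstrations, all with a white mug (color feature fixed at 0.0).
demo_obs = torch.cat([torch.randn(32, 7), torch.zeros(32, 1)], dim=1)
demo_actions = torch.randint(0, 3, (32,))

def augment_color(obs: torch.Tensor, actions: torch.Tensor, copies: int = 100):
    """Replicate the demos while resampling only the 'unimportant' color feature."""
    aug_obs = obs.repeat(copies, 1)            # (32 * copies, 8)
    aug_actions = actions.repeat(copies)       # (32 * copies,)
    aug_obs[:, -1] = torch.rand(aug_obs.shape[0])   # new random mug colors
    return aug_obs, aug_actions

aug_obs, aug_actions = augment_color(demo_obs, demo_actions)
print(aug_obs.shape, aug_actions.shape)        # torch.Size([3200, 8]) torch.Size([3200])
```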
Three Steps to Fine-Tune a Robot That Failed to Complete a Desired Task
The framework has three steps. First, it shows the robot the task it failed to complete. Then, using the user's demonstration of the desired actions, it generates counterfactuals by searching over the space of features to determine what would have needed to change for the robot to succeed.
After displaying these counterfactuals, the system asks the user for feedback to identify which visual concepts do not affect the desired behavior. It then uses this human feedback to generate a large number of new, augmented examples.
In this way, a user might demonstrate picking up a single mug, and by changing the color, the system would generate thousands of examples showing the desired action with distinct mugs. It then uses this data to fine-tune the robot.
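Putting the three steps together, a highly simplified, hypothetical version of the loop might look like the following. The feature set, the stand-in success check, and the hard-coded user answer are illustrative assumptions rather than the authors' implementation.

```python
# A toy walk-through of the three-step loop: show the failure, generate
# counterfactuals and ask for feedback, then augment and fine-tune.
import random

FEATURE_VALUES = {
    "mug_color": ["white", "red", "blue", "brown"],
    "table_texture": ["wood", "marble"],
    "lighting": ["bright", "dim"],
}

def policy_succeeds(scene: dict) -> bool:
    """Stand-in for the pretrained policy, which only handles white mugs."""
    return scene["mug_color"] == "white"

# Step 1: show the scene in which the robot failed.
failed_scene = {"mug_color": "red", "table_texture": "wood", "lighting": "bright"}

# Step 2: search over the feature space for counterfactual scenes in which the
# robot would have succeeded, then show them to the user.
counterfactuals = []
for feature, values in FEATURE_VALUES.items():
    for value in values:
        candidate = {**failed_scene, feature: value}
        if candidate != failed_scene and policy_succeeds(candidate):
            counterfactuals.append(candidate)

# The user reviews the counterfactuals and reports which concepts should not
# matter for the task (hard-coded here in place of real feedback).
irrelevant_features = {"mug_color"}

# Step 3: vary only the irrelevant concepts in the user's demonstration to
# synthesize many augmented examples for fine-tuning.
demo_scene = {"mug_color": "white", "table_texture": "wood", "lighting": "bright"}
augmented = []
for _ in range(1000):
    scene = dict(demo_scene)
    for feature in irrelevant_features:
        scene[feature] = random.choice(FEATURE_VALUES[feature])
    augmented.append((scene, "pick_up_mug"))

print(len(counterfactuals), len(augmented))   # 1 counterfactual, 1000 examples
```

In a real system, the counterfactuals would be rendered scenes shown to the user, and the augmented examples would be used to fine-tune the pretrained policy along the lines sketched earlier.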