Robots Learning Household Chores Through Videos
Researchers at Carnegie Mellon University (CMU) have made significant progress in enabling robots to learn household chores simply by watching videos of people performing everyday tasks in their homes. The development could greatly improve the utility of robots in the home, allowing them to assist people with tasks such as cooking and cleaning.
The researchers trained two robots to perform 12 household tasks by watching videos. These tasks included opening a drawer, an oven door, and a lid; taking a pot off the stove; and picking up items such as a phone, a vegetable, or a can of soup. By observing people in videos, the robots learned where and how humans interact with different objects, which enabled them to complete similar tasks in various environments.
Traditional robot training methods require humans to manually demonstrate tasks or require extensive training in simulated environments; both approaches are time-consuming and prone to failure. Previous research from CMU's Robotics Institute introduced WHIRL (In-the-Wild Human Imitating Robot Learning), a method in which robots learn by watching humans complete tasks. However, WHIRL still required the human to perform the task in the same environment as the robot.
Introducing the Vision-Robotics Bridge (VRB)
Building on the success of WHIRL, CMU’s Deepak Pathak and his team developed the Vision-Robotics Bridge (VRB). VRB eliminates the need for human demonstrations and the requirement for the robot to operate in an identical environment. The robots can now learn tasks by watching videos of humans in various settings and adapt their knowledge to complete similar tasks in different environments.
While the robots still require practice to master a task, the team’s research shows that they can learn a new task in as little as 25 minutes. This remarkable progress opens up a world of possibilities for robots to navigate and perform tasks autonomously.
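To give a concrete sense of what "practice" can mean here, the toy sketch below refines a predicted pull direction for a drawer by trying small variations and keeping the best-scoring attempt. Everything in it is an illustrative assumption: the success signal is simulated as alignment with the drawer's true opening direction, and none of it comes from the CMU system.

```python
import numpy as np


def practice_open_drawer(predicted_direction, true_direction, attempts=20, noise=0.2, seed=0):
    """Toy practice loop: perturb a predicted motion direction and keep the best attempt.

    The 'success' signal here is simulated as alignment with the drawer's true
    opening direction; on a real robot it would come from feedback such as
    whether the drawer actually moved.
    """
    rng = np.random.default_rng(seed)
    predicted = np.asarray(predicted_direction, dtype=float)
    true = np.asarray(true_direction, dtype=float)
    true = true / np.linalg.norm(true)

    best, best_score = predicted / np.linalg.norm(predicted), -1.0
    for _ in range(attempts):
        tried = predicted + rng.normal(0.0, noise, size=2)   # explore motions near the prediction
        tried = tried / np.linalg.norm(tried)
        score = float(tried @ true)                          # how well the attempt matched reality
        if score > best_score:                               # keep the most successful attempt
            best, best_score = tried, score
    return best, best_score


# The affordance model suggests pulling mostly outward but slightly off-axis;
# a short practice session nudges the motion toward the drawer's true direction.
refined, score = practice_open_drawer(predicted_direction=[0.9, 0.3], true_direction=[1.0, 0.0])
print(refined, round(score, 3))
```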
Affordances: Teaching Robots to Interact
To teach the robots how to interact with objects, CMU's research team applied the concept of affordances. Affordances, a concept with roots in psychology, refer to what an environment offers an individual. In the context of VRB, affordances define where and how a robot can interact with an object based on observed human behavior.
For example, when a robot watches a video of a human opening a drawer, it can identify the contact points, such as the handle, and the direction of movement required to open the drawer. By observing several videos of humans opening drawers, the robot can develop an understanding of how to open any type of drawer.
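As a rough, hypothetical illustration of what "contact point plus direction of movement" can look like as data, the snippet below derives such a pair from a 2-D hand trajectory, assuming an upstream hand detector has already produced per-frame hand positions and the frame of first contact. It is a sketch of the general idea, not the code used in the CMU work.

```python
import numpy as np


def affordance_from_hand_track(hand_track, contact_idx, horizon=10):
    """Derive a (contact point, motion direction) pair from a hand trajectory.

    `hand_track` is an (N, 2) array of per-frame hand positions in image
    coordinates and `contact_idx` is the frame where the hand first touches
    the object; both are assumed outputs of an upstream hand detector.
    """
    hand_track = np.asarray(hand_track, dtype=float)
    contact_point = hand_track[contact_idx]                           # e.g. the drawer handle
    later_point = hand_track[min(contact_idx + horizon, len(hand_track) - 1)]
    direction = later_point - contact_point                           # how the hand moved after contact
    direction = direction / (np.linalg.norm(direction) + 1e-8)
    return contact_point, direction


# Toy track: the hand reaches the handle at frame 3, then pulls to the right.
track = [[50, 80], [52, 80], [55, 81], [60, 81], [70, 81], [82, 82], [95, 82]]
print(affordance_from_hand_track(track, contact_idx=3, horizon=3))
```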
To gather video data for training, the team utilized large datasets like Ego4D and Epic Kitchens. Ego4D contains nearly 4000 hours of egocentric videos from around the world, showcasing daily activities. CMU researchers were involved in compiling some of these videos. Epic Kitchens features similar videos capturing cooking, cleaning, and other kitchen-related chores. These datasets are instrumental in training computer vision models.
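In broad terms, datasets like these are used to fit a vision model that maps an image of a scene to interaction cues such as the contact locations and motion directions described above. The toy PyTorch network below shows only that input-output shape; the architecture is invented for illustration and is not the model from the paper.

```python
import torch
import torch.nn as nn


class ToyAffordanceNet(nn.Module):
    """Toy model: scene image -> (contact-point heatmap, 2-D motion direction).

    The layers are placeholders meant to show the kind of outputs that
    human-video datasets can supervise, not a real affordance architecture.
    """

    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        self.heatmap_head = nn.Conv2d(32, 1, kernel_size=1)      # where to make contact
        self.direction_head = nn.Sequential(                     # how to move after contact
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 2),
        )

    def forward(self, image):
        features = self.backbone(image)
        return self.heatmap_head(features), self.direction_head(features)


# Smoke test on a random batch standing in for video frames.
model = ToyAffordanceNet()
heatmap, direction = model(torch.randn(4, 3, 128, 128))
print(heatmap.shape, direction.shape)   # (4, 1, 32, 32) and (4, 2)
```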
The team’s innovative use of these datasets allows robots to learn from the vast amount of videos available on the internet and platforms like YouTube. This breakthrough has the potential to revolutionize the way robots acquire knowledge and enhance their interaction capabilities.
Implications and Future Possibilities
The research conducted by CMU’s Robotics Institute holds significant implications for the future of robotics in domestic settings. With the ability to learn through videos, robots can now navigate and assist with household tasks more effectively.
Imagine a robot that can learn how to cook a new recipe simply by watching a video of a human chef preparing the dish. Or a robot that can clean different types of surfaces by observing videos of people cleaning various environments. These advancements have the potential to significantly enhance the role of robots in our daily lives, making them more intuitive and adaptable to different contexts.
Looking ahead, this research paves the way for further developments in the field of robot learning and human-robot interaction. Improved capabilities in understanding and imitating human behavior will open doors to more advanced and autonomous robots that can seamlessly integrate into our homes and assist us with a wide range of tasks.
—————————————————-
New work from Carnegie Mellon University has enabled robots to learn household chores by watching videos of people performing everyday tasks in their homes.
The research could help improve the utility of robots in the home, allowing them to help people with tasks like cooking and cleaning. Two robots successfully learned 12 tasks, including opening a drawer, an oven door, and a lid; taking a pot off the stove; and picking up a phone, a vegetable, or a can of soup.
“The robot can learn where and how humans interact with different objects by watching videos,” said Deepak Pathak, an assistant professor in the Robotics Institute in CMU’s School of Computer Science. “From this knowledge, we can train a model that allows two robots to complete similar tasks in varied environments.”
Current robot training methods require manual demonstration of tasks by humans or extensive training in a simulated environment. Both are time consuming and prone to failure. Previous research by Pathak and his students demonstrated a novel method in which robots learn by watching humans complete tasks. However, WHIRL, short for In-the-Wild Human Imitating Robot Learning, required the human to complete the task in the same environment as the robot.
Pathak’s latest work, the Vision-Robotics Bridge, or VRB for short, builds on and improves on WHIRL. The new model eliminates the need for human demonstrations, as well as the need for the robot to operate in an identical environment. Like WHIRL, the robot still requires practice to master a task. The team’s research showed that it can learn a new task in as little as 25 minutes.
“We were able to take robots around campus and do all sorts of tasks,” said Shikhar Bahl, a Ph.D. student in robotics. “Robots can use this model to curiously explore the world around them. Instead of just waving its arms around, a robot can be more direct in how it interacts.”
To teach the robot how to interact with an object, the team applied the concept of affordances. Affordances have their roots in psychology and refer to what an environment offers an individual. The concept has since been extended to design and human-computer interaction, where it refers to the potential actions an individual perceives.
For VRB, affordances define where and how a robot can interact with an object based on human behavior. For example, when a robot watches a human open a drawer, it identifies the contact points (the handle) and the direction of the drawer's movement (straight out from its starting position). After watching several videos of humans opening drawers, the robot can determine how to open any drawer.
The team used videos from large data sets like Ego4D and Epic Kitchens. Ego4D has almost 4000 hours of egocentric videos of daily activities from around the world. CMU researchers helped compile some of these videos. Epic Kitchens features similar videos that capture cooking, cleaning, and other kitchen chores. Both data sets are intended to help train computer vision models.
“We are using these data sets in a new and different way,” Bahl said. “This work could allow robots to learn from the vast amount of videos available on the Internet and YouTube.”
More information is available on the project website and in a paper presented in June at the Conference on Computer Vision and Pattern Recognition (CVPR).
https://www.sciencedaily.com/releases/2023/06/230620113807.htm
—————————————————-