Computer Science Fifth Year Master's Thesis Presentation

Time:
2:00pm ET

Location:
In Person and Virtual - Gates Hillman 7501 and Zoom

Speaker:
ADITYA KANNAN, Master's Student, Computer Science Department, Carnegie Mellon University
https://adityak77.github.io/

Learning from Human Videos for Robotic Manipulation

In recent years, many works in Computer Vision and NLP have made remarkable strides toward generalization by collecting and using diverse datasets. However, collecting large-scale robot datasets is often difficult for many reasons, including cost, reliance on human supervision, and safety. An alternative is to take advantage of the accessibility and wide variety of human videos available on the internet. In this thesis, we investigate two approaches that use human videos for robotic control without relying on robot demonstrations.

In our first work, we use human videos as a prior for dexterous manipulation. Humans can perform a host of skills with their hands, from making food to operating tools, but teaching robots such dexterity is challenging, especially for soft, deformable objects and complex, relatively long-horizon tasks. Moreover, learning such behaviors from scratch is data inefficient. To circumvent this, we propose a novel approach, DEFT (DExterous Fine-Tuning for Hand Policies), that leverages human-driven priors, which are executed directly in the real world. To improve upon these priors, DEFT applies an efficient online optimization procedure. By integrating human-based priors with online fine-tuning, coupled with a soft robotic hand, DEFT succeeds across a variety of tasks, establishing a robust, data-efficient pathway toward general dexterous manipulation.
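The thesis presents DEFT in full; purely as a rough illustration of the "human prior plus online fine-tuning" recipe described above, the sketch below refines human-derived grasp parameters with a generic cross-entropy-method loop. Every name here (rollout_reward, the toy target) is a hypothetical stand-in, not DEFT's actual implementation.

```python
import numpy as np

# Toy reward stub standing in for a real robot rollout; an actual system
# would execute the parameters on hardware and score task success.
TARGET = np.array([0.5, -0.2, 0.8])  # hypothetical "good grasp" parameters

def rollout_reward(params):
    """Hypothetical stand-in: negative distance to an (unknown) good grasp."""
    return -np.sum((params - TARGET) ** 2)

def finetune_prior(prior_mean, n_iters=10, pop_size=16, elite_frac=0.25, init_std=0.1):
    """Refine human-derived parameters with a CEM-style sampling loop."""
    mean = np.asarray(prior_mean, dtype=float)
    std = np.full_like(mean, init_std)
    n_elite = max(1, int(pop_size * elite_frac))
    for _ in range(n_iters):
        # Sample candidate parameters around the current prior estimate.
        samples = np.random.normal(mean, std, size=(pop_size, mean.size))
        rewards = np.array([rollout_reward(s) for s in samples])
        # Refit the sampling distribution to the best-performing candidates.
        elites = samples[np.argsort(rewards)[-n_elite:]]
        mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-6
    return mean

refined = finetune_prior(prior_mean=[0.4, -0.1, 0.7])  # prior from a human video
```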

In our second work, we introduce a method for learning a domain- and agent-agnostic reward function from large-scale egocentric human data. Prior approaches that use human data for reward learning either require a small sample of in-domain robot data during training or need a goal image specified in the robot's environment. In this work, we focus on the setting where only human data is available at training and test time. Our approach trains a multi-task reward function that learns to discriminate between tasks by observing changes in the environment. We show that our method performs strongly on three simulation tasks without robot demonstrations in training or in-domain goals.
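As a rough sketch of that recipe, assuming precomputed visual features and a small MLP head (both illustrative assumptions, not the thesis's architecture), a multi-task discriminator over environment changes might look like:

```python
import torch
import torch.nn as nn

class ChangeDiscriminator(nn.Module):
    """Classify which task a transition corresponds to from its feature change."""
    def __init__(self, feat_dim, n_tasks, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_tasks),
        )

    def forward(self, start_feat, current_feat):
        # Represent the "change in the environment" as a feature difference.
        return self.net(current_feat - start_feat)

def reward(model, start_feat, current_feat, task_id):
    """Reward = predicted probability that the observed change matches task_id."""
    probs = torch.softmax(model(start_feat, current_feat), dim=-1)
    return probs[..., task_id]

# Training on human videos: standard cross-entropy against the task label.
model = ChangeDiscriminator(feat_dim=512, n_tasks=10)
start = torch.randn(8, 512)    # placeholder features of initial frames
current = torch.randn(8, 512)  # placeholder features of later frames
labels = torch.randint(0, 10, (8,))
loss = nn.CrossEntropyLoss()(model(start, current), labels)
loss.backward()
```

At test time, the softmax probability of the intended task serves as a dense reward for the robot, without in-domain demonstrations or goal images.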

Thesis Committee:

Deepak Pathak (Chair)
Abhinav Gupta

Additional Information

In Person and Zoom Participation. See announcement.

