Robot See, Robot Do: On the Development of Robust and Adaptive Imitation Learning for Robots
Abstract
As robots transition from isolated industrial settings to working in close proximity to humans in dynamic environments, their ability to learn and adapt to human feedback and unseen circumstances becomes crucial. Imitation learning offers a promising paradigm for robots to learn complex tasks by mimicking human behavior. However, traditional imitation learning approaches face key challenges in integrating diverse feedback types, managing noisy and inconsistent inputs, and maintaining stability in learning. In this thesis, we develop imitation learning approaches that advance the capabilities of robots and enable them to efficiently learn from humans and adapt to unseen data in diverse environments.
This research is structured around four key contributions. First, we consider the scenario where a human is readily available to provide high-quality feedback to the robot. We develop a learning algorithm that enables robots to learn from diverse sources of optimal human feedback: demonstrations, corrections, and preferences. Demonstrations provide high-level task overviews, corrections fine-tune specific motions, and preferences rank robot behaviors for task improvement. By incorporating these active and passive feedback sources under a unified reward learning framework, we enable robots to infer task objectives more effectively and optimize their trajectories using constrained optimization techniques.
Second, we explore scenarios where the human feedback is noisy or biased due to task complexity or physical constraints. We model the robot's learning rule as a dynamical system and apply Lyapunov stability analysis to derive conditions for convergence. Leveraging these conditions, we modify the robot's learning rule to expand the basins of attraction around the possible tasks (equilibrium points) in the environment. This approach enables the robot to infer the correct task representations from a wider range of human inputs, making the learning robust to suboptimal feedback without destabilizing the robot's behavior.
Next, we consider imitation learning settings where a human is not available to provide additional feedback. In such scenarios, imitation learning algorithms are often prone to covariate shift when they encounter data not seen during training. To tackle this challenge, we develop Stable Behavior Cloning (Stable-BC), a stability-driven imitation learning algorithm. This algorithm ensures that robots maintain reliable performance by encouraging policy stability around demonstrated behaviors without the need for additional training data or complex reinforcement learning methods.
Finally, we look at the problem of imitation learning from the users' perspective and aim to reduce the time and effort required to teach the robot. We propose L2D2, a sketching interface and imitation learning algorithm with which humans can provide demonstrations by drawing the task. L2D2 leverages vision-language segmentation to autonomously vary object locations and generate synthetic images of the environment for the human to draw upon. By collecting a few physical demonstrations from the users, L2D2 then grounds these diverse 2D drawings in the real world. This approach reduces the time and effort required to teach robots by enabling users to rapidly provide a large set of diverse demonstrations.
The findings from this research highlight the importance of adaptability and stability when robots and autonomous agents work around and interact with humans in diverse environments. This research contributes to the broader field of robot learning by offering scalable, adaptable, and user-friendly solutions for imitation learning and human-robot interaction, paving the way for more intuitive and robust robotic systems in human environments.