Browsing by Author "Losey, Dylan P."
- Communicating Inferred Goals With Passive Augmented Reality and Active Haptic Feedback
  Mullen, James F.; Mosier, Josh; Chakrabarti, Sounak; Chen, Anqi; White, Tyler; Losey, Dylan P. (IEEE, 2021-10-01)
  Robots learn as they interact with humans. Consider a human teleoperating an assistive robot arm: as the human guides and corrects the arm's motion, the robot gathers information about the human's desired task. But how does the human know what their robot has inferred? Today's approaches often focus on conveying intent: for instance, using legible motions or gestures to indicate what the robot is planning. However, closing the loop on robot inference requires more than just revealing the robot's current policy: the robot should also display the alternatives it thinks are likely, and prompt the human teacher when additional guidance is necessary. In this letter we propose a multimodal approach for communicating robot inference that combines both passive and active feedback. Specifically, we leverage information-rich augmented reality to passively visualize what the robot has inferred, and attention-grabbing haptic wristbands to actively prompt and direct the human's teaching. We apply our system to shared autonomy tasks where the robot must infer the human's goal in real-time. Within this context, we integrate passive and active modalities into a single algorithmic framework that determines when and which type of feedback to provide. Combining both passive and active feedback experimentally outperforms single modality baselines; during an in-person user study, we demonstrate that our integrated approach increases how efficiently humans teach the robot while simultaneously decreasing the amount of time humans spend interacting with the robot. Videos here: https://youtu.be/swq_u4iIP-g
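For a rough sense of how an arbitration rule between passive (AR) and active (haptic) feedback might look, here is a minimal sketch assuming the robot maintains a discrete belief over candidate goals. The entropy threshold, top-k display, and function name are illustrative choices for this sketch, not details taken from the paper.

```python
import numpy as np

def feedback_decision(belief, entropy_threshold=1.0, top_k=3):
    """Illustrative arbitration between passive (AR) and active (haptic) feedback.

    belief: 1-D array of probabilities over candidate goals (sums to 1).
    Returns the goals to visualize in AR and whether to trigger a haptic prompt.
    """
    belief = np.asarray(belief, dtype=float)
    # Passive channel: always display the robot's most likely goals in AR.
    ar_goals = np.argsort(belief)[::-1][:top_k]
    # Active channel: prompt the human (haptic cue) only when the robot
    # remains uncertain, i.e., the belief entropy is still high.
    entropy = -np.sum(belief * np.log(belief + 1e-12))
    haptic_prompt = entropy > entropy_threshold
    return ar_goals, haptic_prompt

# Example: three candidate goals, robot still unsure between the first two.
goals, prompt = feedback_decision([0.45, 0.45, 0.10])
print(goals, prompt)  # show the AR overlay for the top goals and vibrate the wristband
```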
- Here's What I've Learned: Asking Questions that Reveal Reward Learning
  Habibian, Soheil; Jonnavittula, Ananth; Losey, Dylan P. (ACM, 2022-09-08)
  Robots can learn from humans by asking questions. In these questions the robot demonstrates a few different behaviors and asks the human for their favorite. But how should robots choose which questions to ask? Today's robots optimize for informative questions that actively probe the human's preferences as efficiently as possible. But while informative questions make sense from the robot's perspective, human onlookers may find them arbitrary and misleading. In this paper we formalize active preference-based learning from the human's perspective. We hypothesize that, from the human's point-of-view, the robot's questions reveal what the robot has and has not learned. Our insight enables robots to use questions to make their learning process transparent to the human operator. We develop and test a model that robots can leverage to relate the questions they ask to the information these questions reveal. We then introduce a trade-off between informative and revealing questions that considers both human and robot perspectives: a robot that optimizes for this trade-off actively gathers information from the human while simultaneously keeping the human up to date with what it has learned. We evaluate our approach across simulations, online surveys, and in-person user studies.
- Here’s What I’ve Learned: Asking Questions that Reveal Reward Learning
  Habibian, Soheil; Jonnavittula, Ananth; Losey, Dylan P. (Virginia Tech, 2021-07-02)
  Robots can learn from humans by asking questions. In these questions the robot demonstrates a few different behaviors and asks the human for their favorite. But how should robots choose which questions to ask? Today’s robots optimize for informative questions that actively probe the human’s preferences as efficiently as possible. But while informative questions make sense from the robot’s perspective, human onlookers often find them arbitrary and misleading. For example, consider an assistive robot learning to put away the dishes. Based on your answers to previous questions this robot knows where it should stack each dish; however, the robot is unsure about the right height to carry these dishes. A robot optimizing only for informative questions focuses purely on this height: it shows trajectories that carry the plates near or far from the table, regardless of whether or not they stack the dishes correctly. As a result, when we see this question, we mistakenly think that the robot is still confused about where to stack the dishes! In this paper we formalize active preference-based learning from the human’s perspective. We hypothesize that — from the human’s point-of-view — the robot’s questions reveal what the robot has and has not learned. Our insight enables robots to use questions to make their learning process transparent to the human operator. We develop and test a model that robots can leverage to relate the questions they ask to the information these questions reveal. We then introduce a trade-off between informative and revealing questions that considers both human and robot perspectives: a robot that optimizes for this trade-off actively gathers information from the human while simultaneously keeping the human up to date with what it has learned. We evaluate our approach across simulations, online surveys, and in-person user studies. We find that robots which consider the human’s point of view learn just as quickly as state-of-the-art baselines while also communicating what they have learned to the human operator. Videos of our user studies and results are available here: https://youtu.be/tC6y_jHN7Vw.
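To illustrate the kind of trade-off described above, here is a hedged sketch in which candidate questions are scored by a weighted sum of an information-gain term and a "revealing" term. The Boltzmann answer model and the particular reveal_score heuristic are assumptions made for this sketch, not the paper's exact formulation.

```python
import numpy as np

def info_gain(question_features, weight_samples, beta=1.0):
    """Rough mutual-information proxy: how much do sampled reward weights disagree
    about which option in the question the human would pick?"""
    rewards = weight_samples @ question_features.T          # (n_samples, n_options)
    p = np.exp(beta * rewards)
    p /= p.sum(axis=1, keepdims=True)
    p_mean = p.mean(axis=0)
    entropy_of_mean = -np.sum(p_mean * np.log(p_mean + 1e-12))
    mean_entropy = -np.mean(np.sum(p * np.log(p + 1e-12), axis=1))
    return entropy_of_mean - mean_entropy

def reveal_score(question_features, weight_samples):
    """Illustrative 'revealing' term: questions whose options score well under the
    robot's current reward estimate show the human what has already been learned."""
    mean_w = weight_samples.mean(axis=0)
    return (question_features @ mean_w).max()

def pick_question(candidate_questions, weight_samples, lam=0.5):
    """Trade off informative vs. revealing questions with weight lam (illustrative)."""
    scores = [lam * info_gain(Q, weight_samples) + (1 - lam) * reveal_score(Q, weight_samples)
              for Q in candidate_questions]
    return int(np.argmax(scores))
```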
- I Know What You Meant: Learning Human Objectives by (Under)estimating Their Choice Set
  Jonnavittula, Ananth; Losey, Dylan P. (Virginia Tech, 2021-04-05)
  Assistive robots have the potential to help people perform everyday tasks. However, these robots first need to learn what it is their user wants them to do. Teaching assistive robots is hard for inexperienced users, elderly users, and users living with physical disabilities, since often these individuals are unable to show the robot their desired behavior. We know that inclusive learners should give human teachers credit for what they cannot demonstrate. But today’s robots do the opposite: they assume every user is capable of providing any demonstration. As a result, these robots learn to mimic the demonstrated behavior, even when that behavior is not what the human really meant! Here we propose a different approach to reward learning: robots that reason about the user’s demonstrations in the context of similar or simpler alternatives. Unlike prior works — which err towards overestimating the human’s capabilities — here we err towards underestimating what the human can input (i.e., their choice set). Our theoretical analysis proves that underestimating the human’s choice set is risk-averse, with better worst-case performance than overestimating. We formalize three properties to generate similar and simpler alternatives. Across simulations and a user study, our resulting algorithm better extrapolates the human’s objective. See the user study here: https://youtu.be/RgbH2YULVRo.
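A minimal sketch of the underlying idea, assuming a Boltzmann-rational observation model: the likelihood of the human's demonstration is evaluated against a choice set, and underestimating that choice set means keeping only alternatives that are similar to or simpler than the demonstration. The effort-based filter below is an illustrative stand-in for the paper's three properties, not their actual construction.

```python
import numpy as np

def demo_likelihood(demo_reward, choice_set_rewards, beta=1.0):
    """Boltzmann-rational probability that the human gives this demonstration,
    evaluated against a choice set of alternatives they could have given instead."""
    logits = beta * np.append(np.asarray(choice_set_rewards, dtype=float), demo_reward)
    logits -= logits.max()                      # numerical stability
    p = np.exp(logits)
    return p[-1] / p.sum()

# Underestimated choice set: keep only alternatives that are similar to or simpler
# than the demonstration (here, "simpler" = requires no more effort than the demo).
demo_effort, alt_efforts = 1.0, np.array([0.5, 0.9, 2.0])
alt_rewards = np.array([0.2, 0.6, 0.9])
restricted = alt_rewards[alt_efforts <= demo_effort]
print(demo_likelihood(0.7, restricted))   # likelihood under the conservative choice set
```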
- Learning Human Objectives from Sequences of Physical Corrections
  Li, Mengxi; Canberk, Alper; Losey, Dylan P.; Sadigh, Dorsa (IEEE, 2021-05-30)
  When personal, assistive, and interactive robots make mistakes, humans naturally and intuitively correct those mistakes through physical interaction. In simple situations, one correction is sufficient to convey what the human wants. But when humans are working with multiple robots or the robot is performing an intricate task, the human often must make several corrections to fix the robot’s behavior. Prior research assumes each of these physical corrections is an independent event, and learns from them one-at-a-time. However, this misses out on crucial information: these interactions are interconnected, and may only make sense if viewed together. Alternatively, other work reasons over the final trajectory produced by all of the human’s corrections. But this method must wait until the end of the task to learn from corrections, as opposed to inferring from the corrections in an online fashion. In this paper we formalize an approach for learning from sequences of physical corrections during the current task. To do this we introduce an auxiliary reward that captures the human’s trade-off between making corrections which improve the robot’s immediate reward and long-term performance. We evaluate the resulting algorithm in remote and in-person human-robot experiments, and compare to both independent and final baselines. Our results indicate that users are best able to convey their objective when the robot reasons over their sequence of corrections.
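One way to picture learning from a sequence of corrections rather than from independent events is sketched below. The split of each correction into an immediate and a long-term feature change, and the trade-off weight gamma, are illustrative assumptions rather than the paper's exact auxiliary reward.

```python
import numpy as np

def sequence_posterior(theta_samples, corrections, beta=1.0, gamma=0.5):
    """Score sampled reward weights against a whole sequence of corrections,
    instead of treating each correction as an independent event.

    corrections: list of (delta_now, delta_future) feature-change pairs, where
    delta_now is the immediate improvement a correction makes and delta_future
    is its effect on the remainder of the task (an illustrative decomposition).
    """
    log_post = np.zeros(len(theta_samples))
    for delta_now, delta_future in corrections:
        # Auxiliary trade-off: the human corrects with both the robot's immediate
        # reward and its long-term performance in mind.
        log_post += beta * (theta_samples @ (gamma * delta_now + (1 - gamma) * delta_future))
    log_post -= log_post.max()
    post = np.exp(log_post)
    return post / post.sum()

theta_samples = np.random.default_rng(0).normal(size=(100, 3))   # candidate reward weights
corrections = [(np.array([0.2, 0.0, 0.1]), np.array([0.5, -0.1, 0.0]))]
belief = sequence_posterior(theta_samples, corrections)
```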
- Learning latent actions to control assistive robots
  Losey, Dylan P.; Jeon, Hong Jun; Li, Mengxi; Srinivasan, Krishnan; Mandlekar, Ajay; Garg, Animesh; Bohg, Jeannette; Sadigh, Dorsa (Springer, 2021-08-04)
  Assistive robot arms enable people with disabilities to conduct everyday tasks on their own. These arms are dexterous and high-dimensional; however, the interfaces people must use to control their robots are low-dimensional. Consider teleoperating a 7-DoF robot arm with a 2-DoF joystick. The robot is helping you eat dinner, and currently you want to cut a piece of tofu. Today’s robots assume a pre-defined mapping between joystick inputs and robot actions: in one mode the joystick controls the robot’s motion in the x–y plane, in another mode the joystick controls the robot’s z–yaw motion, and so on. But this mapping misses out on the task you are trying to perform! Ideally, one joystick axis should control how the robot stabs the tofu, and the other axis should control different cutting motions. Our insight is that we can achieve intuitive, user-friendly control of assistive robots by embedding the robot’s high-dimensional actions into low-dimensional and human-controllable latent actions. We divide this process into three parts. First, we explore models for learning latent actions from offline task demonstrations, and formalize the properties that latent actions should satisfy. Next, we combine learned latent actions with autonomous robot assistance to help the user reach and maintain their high-level goals. Finally, we learn a personalized alignment model between joystick inputs and latent actions. We evaluate our resulting approach in four user studies where non-disabled participants reach marshmallows, cook apple pie, cut tofu, and assemble dessert. We then test our approach with two disabled adults who leverage assistive devices on a daily basis.
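As a toy illustration of the latent-action interface, the sketch below decodes a 2-DoF joystick input and the robot's current state into a 7-DoF action. In practice the decoder would be trained from offline demonstrations (e.g., as a conditional autoencoder); the random weights, layer sizes, and class name here are placeholders.

```python
import numpy as np

class LatentActionDecoder:
    """Illustrative decoder g(z, s) -> high-DoF action. The weights would normally be
    learned from offline task demonstrations; random weights here are placeholders."""
    def __init__(self, latent_dim=2, state_dim=7, action_dim=7, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (hidden, latent_dim + state_dim))
        self.W2 = rng.normal(0.0, 0.1, (action_dim, hidden))

    def decode(self, z, state):
        x = np.concatenate([z, state])
        h = np.tanh(self.W1 @ x)
        return self.W2 @ h          # e.g., a 7-DoF joint velocity command

decoder = LatentActionDecoder()
joystick = np.array([0.3, -0.1])    # 2-DoF human input
robot_state = np.zeros(7)           # current joint configuration
action = decoder.decode(joystick, robot_state)
```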
- Learning reward functions from diverse sources of human feedback: Optimally integrating demonstrations and preferences
  Bıyık, Erdem; Losey, Dylan P.; Palan, Malayandi; Landolfi, Nicholas C.; Shevchuk, Gleb; Sadigh, Dorsa (SAGE, 2022-01)
  Reward functions are a common way to specify the objective of a robot. As designing reward functions can be extremely challenging, a more promising approach is to directly learn reward functions from human teachers. Importantly, data from human teachers can be collected either passively or actively in a variety of forms: passive data sources include demonstrations (e.g., kinesthetic guidance), whereas preferences (e.g., comparative rankings) are actively elicited. Prior research has independently applied reward learning to these different data sources. However, there exist many domains where multiple sources are complementary and expressive. Motivated by this general problem, we present a framework to integrate multiple sources of information, which are either passively or actively collected from human users. In particular, we present an algorithm that first utilizes user demonstrations to initialize a belief about the reward function, and then actively probes the user with preference queries to zero in on their true reward. This algorithm not only enables us to combine multiple data sources, but it also informs the robot when it should leverage each type of information. Further, our approach accounts for the human’s ability to provide data: yielding user-friendly preference queries which are also theoretically optimal. Our extensive simulated experiments and user studies on a Fetch mobile manipulator demonstrate the superiority and the usability of our integrated framework.
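A minimal sketch of the two-stage idea, assuming a sample-based belief over linear reward weights: demonstrations first reweight the samples, and preference queries are then chosen where the current belief is least certain about the answer. The specific scoring rule is an illustrative heuristic, not the paper's optimal query-selection procedure.

```python
import numpy as np

def initialize_belief(weight_samples, demo_features, beta=1.0):
    """Weight sampled reward parameters by how well they explain the demonstration
    (a simple Boltzmann / MaxEnt-style score; illustrative)."""
    log_w = beta * (weight_samples @ demo_features)
    log_w -= log_w.max()
    p = np.exp(log_w)
    return p / p.sum()

def preference_query_score(option_a, option_b, weight_samples, belief):
    """Prefer queries whose answer the current belief cannot predict: a predicted
    probability of 'A preferred' near 0.5 means high expected information."""
    p_a = belief @ (1.0 / (1.0 + np.exp(-(weight_samples @ (option_a - option_b)))))
    return -abs(p_a - 0.5)

weight_samples = np.random.default_rng(1).normal(size=(200, 2))
belief = initialize_belief(weight_samples, demo_features=np.array([1.0, 0.2]))
score = preference_query_score(np.array([0.8, 0.1]), np.array([0.1, 0.9]), weight_samples, belief)
```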
- Learning to Share Autonomy Across Repeated Interaction
  Jonnavittula, Ananth; Losey, Dylan P. (Virginia Tech, 2021-07-20)
  Wheelchair-mounted robotic arms (and other assistive robots) should help their users perform everyday tasks. One way robots can provide this assistance is shared autonomy. Within shared autonomy, both the human and robot maintain control over the robot’s motion: as the robot becomes confident it understands what the human wants, it increasingly intervenes to automate the task. But how does the robot know what tasks the human may want to perform in the first place? Today’s shared autonomy approaches often rely on prior knowledge: for example, the robot must know the set of possible human goals a priori. In the long-term, however, this prior knowledge will inevitably break down — sooner or later the human will reach for a goal that the robot did not expect. In this paper we propose a learning approach to shared autonomy that takes advantage of repeated interactions. Learning to assist humans would be impossible if they performed completely different tasks at every interaction: but our insight is that users living with physical disabilities repeat important tasks on a daily basis (e.g., opening the fridge, making coffee, and having dinner). We introduce an algorithm that exploits these repeated interactions to recognize the human’s task, replicate similar demonstrations, and return control when unsure. As the human repeatedly works with this robot, our approach continually learns to assist tasks that were never specified beforehand: these tasks include both discrete goals (e.g., reaching a cup) and continuous skills (e.g., opening a drawer). Across simulations and an in-person user study, we demonstrate that robots leveraging our approach match existing shared autonomy methods for known goals, and outperform imitation learning baselines on new tasks. See videos here: https://youtu.be/NazeLVbQ2og.
- LIMIT: Learning Interfaces to Maximize Information Transfer
  Christie, Benjamin; Losey, Dylan P. (ACM, 2024-08)
  Robots can use auditory, visual, or haptic interfaces to convey information to human users. The way these interfaces select signals is typically pre-defined by the designer: for instance, a haptic wristband might vibrate when the robot is moving and squeeze when the robot stops. But different people interpret the same signals in different ways, so that what makes sense to one person is confusing to another. In this paper we introduce a unified algorithmic formalism for learning co-adaptive interfaces from scratch. Our insight is that interpretable interfaces should select signals that maximize correlation between the human's actions and the information the interface is trying to convey. Applying this insight we develop LIMIT: Learning Interfaces to Maximize Information Transfer. LIMIT optimizes a tractable, real-time proxy of information gain in continuous spaces. The first time a person works with our system the signals may appear random; but over repeated interactions the interface learns a one-to-one mapping between signals and human responses. Our resulting approach is both personalized to the current user and not tied to any specific interface modality. We compare LIMIT to state-of-the-art baselines across controlled simulations, an online survey, and an in-person user study with auditory, visual, and haptic interfaces.
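As a loose illustration (not LIMIT's actual objective), the sketch below picks the signal whose logged human responses best match the response the interface is trying to induce; this is one crude proxy for maximizing the correlation between signals and the human's actions. The function name, data layout, and scoring rule are all assumptions made for this sketch.

```python
import numpy as np

def choose_signal(history, candidate_signals, target_response):
    """Illustrative signal selection: from logged (signal, human response) pairs,
    pick the signal whose past responses lie closest to the response the interface
    is trying to induce (a stand-in for an information-gain proxy)."""
    best, best_score = None, -np.inf
    for s in candidate_signals:
        responses = [r for (sig, r) in history if np.allclose(sig, s)]
        if not responses:
            continue  # an exploration bonus could instead favor untried signals
        mean_r = np.mean(responses, axis=0)
        score = -np.linalg.norm(mean_r - np.asarray(target_response, dtype=float))
        if score > best_score:
            best, best_score = s, score
    return best

# Two logged interactions: each pairs a signal with the human's observed response.
history = [((1, 0), (0.9, 0.1)), ((0, 1), (0.2, 0.8))]
signal = choose_signal(history, [(1, 0), (0, 1)], target_response=(1.0, 0.0))
```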
- Physical interaction as communication: Learning robot objectives online from human corrections
  Losey, Dylan P.; Bajcsy, Andrea; O'Malley, Marcia K.; Dragan, Anca D. (SAGE, 2021-10-25)
  When a robot performs a task next to a human, physical interaction is inevitable: the human might push, pull, twist, or guide the robot. The state of the art treats these interactions as disturbances that the robot should reject or avoid. At best, these robots respond safely while the human interacts; but after the human lets go, these robots simply return to their original behavior. We recognize that physical human–robot interaction (pHRI) is often intentional: the human intervenes on purpose because the robot is not doing the task correctly. In this article, we argue that when pHRI is intentional it is also informative: the robot can leverage interactions to learn how it should complete the rest of its current task even after the person lets go. We formalize pHRI as a dynamical system, where the human has in mind an objective function they want the robot to optimize, but the robot does not get direct access to the parameters of this objective: they are internal to the human. Within our proposed framework human interactions become observations about the true objective. We introduce approximations to learn from and respond to pHRI in real-time. We recognize that not all human corrections are perfect: often users interact with the robot noisily, and so we improve the efficiency of robot learning from pHRI by reducing unintended learning. Finally, we conduct simulations and user studies on a robotic manipulator to compare our proposed approach with the state of the art. Our results indicate that learning from pHRI leads to better task performance and improved human satisfaction.
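A common online view of this setup is that each physical correction nudges the robot's estimate of the reward weights toward the features of the human-deformed trajectory and away from the robot's planned one. The sketch below shows that style of update; the feature names and step size are chosen purely for illustration, not taken from the article.

```python
import numpy as np

def update_from_correction(theta, phi_deformed, phi_planned, alpha=0.1):
    """Online update of reward weights from one physical correction: shift theta
    toward the features of the human-deformed trajectory (step size illustrative)."""
    return theta + alpha * (phi_deformed - phi_planned)

theta = np.zeros(3)                         # weights on, e.g., [speed, height, distance-to-human]
phi_planned = np.array([1.0, 0.2, 0.1])     # feature counts of the robot's current plan
phi_deformed = np.array([0.8, 0.6, 0.1])    # feature counts after the human pushes the arm
theta = update_from_correction(theta, phi_deformed, phi_planned)
```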
- SARI: Shared Autonomy across Repeated Interaction
  Jonnavittula, Ananth; Mehta, Shaunak; Losey, Dylan P. (ACM, 2024)
  Assistive robot arms try to help their users perform everyday tasks. One way robots can provide this assistance is shared autonomy. Within shared autonomy, both the human and robot maintain control over the robot’s motion: as the robot becomes confident it understands what the human wants, it intervenes to automate the task. But how does the robot know these tasks in the first place? State-of-the-art approaches to shared autonomy often rely on prior knowledge. For instance, the robot may need to know the human’s potential goals beforehand. During long-term interaction these methods will inevitably break down: sooner or later the human will attempt to perform a task that the robot does not expect. Accordingly, in this paper we formulate an alternate approach to shared autonomy that learns assistance from scratch. Our insight is that operators repeat important tasks on a daily basis (e.g., opening the fridge, making coffee). Instead of relying on prior knowledge, we therefore take advantage of these repeated interactions to learn assistive policies. We introduce SARI, an algorithm that recognizes the human’s task, replicates similar demonstrations, and returns control when unsure. We then combine learning with control to demonstrate that the error of our approach is uniformly ultimately bounded. We perform simulations to support this error bound, compare our approach to imitation learning baselines, and explore its capacity to assist for an increasing number of tasks. Finally, we conduct three user studies with industry-standard methods and shared autonomy baselines, including a pilot test with a disabled user. Our results indicate that learning shared autonomy across repeated interactions matches existing approaches for known tasks and outperforms baselines on new tasks. See videos of our user studies here: https://youtu.be/3vE4omSvLvc
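To give a flavor of confidence-based shared autonomy across repeated interactions, here is a minimal sketch that imitates the closest stored demonstration and blends it with the human's input, returning control when the current state looks unfamiliar. The nearest-neighbor policy and Gaussian confidence are simplifying assumptions for this sketch, not SARI's actual recognition and learning components.

```python
import numpy as np

def blended_action(human_action, state, demos, sigma=0.5):
    """Illustrative shared-autonomy blend: imitate the closest stored demonstration,
    weighted by a confidence that falls off when the current state looks unlike
    anything seen before, so control returns to the human on new tasks."""
    states = np.array([s for s, a in demos])
    actions = np.array([a for s, a in demos])
    dists = np.linalg.norm(states - state, axis=1)
    nearest = int(np.argmin(dists))
    confidence = np.exp(-dists[nearest] ** 2 / (2 * sigma ** 2))
    robot_action = actions[nearest]
    return (1 - confidence) * human_action + confidence * robot_action

demos = [(np.array([0.0, 0.0]), np.array([1.0, 0.0]))]   # one stored (state, action) pair
a = blended_action(human_action=np.array([0.0, 1.0]), state=np.array([0.1, 0.0]), demos=demos)
```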
- Underactuated Exoskeletons for Lifting, Carrying, and Walking Assistance
  Folta, Nathan Allen (Virginia Tech, 2023-07-24)
  Exoskeletons are rapidly emerging from the realm of science fiction into practical reality in everyday life. Various designs have provided viable means for individuals to regain lost capabilities or perform tasks not previously possible by their own ability. In this research, I propose two novel exoskeletons for walking assistance and heavy load carriage. The first exoskeleton provides walking assistance in applications such as industrial productivity, rehabilitation, and military or space training. We introduce a design for a lower-body wearable device that supports up to 80% of the user's body weight (667 N peak force) with a single actuator on each leg. Its underactuated design directs force through the user's center of mass with a single sprocket-chain driven prismatic actuator on each leg, allowing for natural gait and mobility. The device is optimized for simplicity, ease of assembly, low cost, and weight. The second design aims to counteract one of the leading causes of workplace injury: repetitive and heavy lifting. The Heavy Lift and Carry Exoskeleton (HeavyLC Exo) is capable of safely lifting and carrying loads up to 36 kg (80 lbs) while minimizing the number of actuators to reduce weight and complexity. The HeavyLC Exo allows the user to direct the object, pause and hold the object steady mid-lift, and follow the natural kinematics of lifting. It is secured to the user with shoulder, chest, and dual thigh straps, along with an adjustable waist belt and overshoe attachment. Powered by two 14.8 V batteries and an off-board air compressor, the HeavyLC Exo has a total of 20 DOF, with 6 actuated DOF and 14 free DOF. The arms use only two actuators each, providing powered lifting and arm retraction/extension, and allowing a wide range of body postures; each leg is powered by a single pneumatic actuator connected to the foot, accompanied by a passive spring element to prevent excessive pelvic tilt and leg abduction during swing. The control system requires average directional forces of 19 N (4.3 lbf) from the user at the tool handle. Current design limitations require the user to provide up to 280 N (62.9 lbf) at the hip under worst-case load conditions, and future design optimization is proposed. A fully functional prototype of the HeavyLC Exo is built, tested, and analyzed for improvement.
- Unified Learning from Demonstrations, Corrections, and Preferences during Physical Human-Robot Interaction
  Mehta, Shaunak A.; Losey, Dylan P. (ACM, 2023)
  Humans can leverage physical interaction to teach robot arms. This physical interaction takes multiple forms depending on the task, the user, and what the robot has learned so far. State-of-the-art approaches focus on learning from a single modality, or combine multiple interaction types by assuming that the robot has prior information about the human's intended task. By contrast, in this paper we introduce an algorithmic formalism that unites learning from demonstrations, corrections, and preferences. Our approach makes no assumptions about the tasks the human wants to teach the robot; instead, we learn a reward model from scratch by comparing the human's inputs to nearby alternatives. We first derive a loss function that trains an ensemble of reward models to match the human's demonstrations, corrections, and preferences. The type and order of feedback is up to the human teacher: we enable the robot to collect this feedback passively or actively. We then apply constrained optimization to convert our learned reward into a desired robot trajectory. Through simulations and a user study we demonstrate that our proposed approach more accurately learns manipulation tasks from physical human interaction than existing baselines, particularly when the robot is faced with new or unexpected objectives.
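A hedged sketch of the unifying idea: regardless of whether the feedback is a demonstration, correction, or preference, the human's input is treated as preferred over nearby alternatives and the reward model is trained to assign it higher probability. The linear reward and Boltzmann likelihood below are illustrative placeholders for the paper's reward-model ensemble and loss.

```python
import numpy as np

def feedback_loss(reward_fn, human_features, alternative_features, beta=1.0):
    """Unified loss sketch: whatever the feedback type (demonstration, correction,
    or preference), treat the human's input as preferred over nearby alternatives
    and maximize its Boltzmann probability (negative log-likelihood returned)."""
    r_human = reward_fn(human_features)
    r_alts = np.array([reward_fn(f) for f in alternative_features])
    logits = beta * np.append(r_alts, r_human)
    logits -= logits.max()                     # numerical stability
    return -(logits[-1] - np.log(np.exp(logits).sum()))

# Example with a linear reward model on two features (weights are placeholders).
theta = np.array([0.5, -0.2])
reward = lambda phi: theta @ phi
loss = feedback_loss(reward, np.array([1.0, 0.3]),
                     [np.array([0.9, 0.8]), np.array([0.2, 0.1])])
```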