Multi-Agent Hierarchical Distributed NMPC With Learned Locomotion via Reinforcement Learning

TR Number

Date

2026-06-22

Journal Title

Journal ISSN

Volume Title

Publisher

Virginia Tech

Abstract

This thesis investigates the development of a distributed, hierarchical control architecture for the coordinated navigation of multi-agent robot teams. Due to the high-dimensional, nonlinear dynamics inherent in legged locomotion and the coupled nature of multi-robot spatial planning, centralized control approaches face significant computational bottlenecks. To mitigate these challenges, the proposed framework decomposes the navigation problem across two timescales. At the high level, a Distributed Nonlinear Model Predictive Controller (DNMPC) operating at 5 Hz calculates velocity commands utilizing the Alternating Direction Method of Multipliers (ADMM). Collision avoidance is addressed through a consensus projection on the shared ADMM variable for inter-agent separation, combined with smooth exponential barriers for static obstacles, providing empirically verified safe navigation. At the low level, an end-to-end deep reinforcement learning (RL) policy operates at 250 Hz to track the planner's velocity commands. Synthesized via Proximal Policy Optimization (PPO) with targeted domain randomization, the actor-critic policy directly maps proprioceptive observations to joint position targets. The mathematical formulation of the ADMM-NMPC planner and the underlying RL reward structures are detailed, and the integrated hierarchy is validated through MuJoCo simulations on various legged robots and experimental deployments on the Unitree A1 quadruped, demonstrating a scalable, real-time framework for robust multi-agent navigation on resource-constrained embedded hardware.
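
The two-timescale structure described above can be illustrated with a minimal Python sketch. Only the 5 Hz and 250 Hz rates come from the abstract; the functions `plan_velocity` and `track_velocity`, the one-dimensional point model, and all gains are hypothetical stand-ins chosen for illustration, not the thesis's DNMPC planner or RL policy.

```python
PLANNER_HZ = 5    # high-level planner rate (from the abstract)
POLICY_HZ = 250   # low-level policy rate (from the abstract)
STEPS_PER_PLAN = POLICY_HZ // PLANNER_HZ  # 50 policy steps per planner update


def plan_velocity(position, goal, v_max=0.5):
    """Hypothetical planner stand-in: saturated proportional velocity command."""
    error = goal - position
    return max(-v_max, min(v_max, error))


def track_velocity(velocity, command, gain=0.1):
    """Hypothetical policy stand-in: first-order tracking of the commanded velocity."""
    return velocity + gain * (command - velocity)


def simulate(goal=1.0, seconds=10.0):
    """Run the two-rate loop on a 1-D point and return (final position, planner calls)."""
    dt = 1.0 / POLICY_HZ
    position, velocity, command = 0.0, 0.0, 0.0
    planner_calls = 0
    for step in range(int(seconds * POLICY_HZ)):
        # The planner runs once every STEPS_PER_PLAN low-level steps;
        # its command is held constant in between.
        if step % STEPS_PER_PLAN == 0:
            command = plan_velocity(position, goal)
            planner_calls += 1
        # The low-level tracker runs at every step.
        velocity = track_velocity(velocity, command)
        position += velocity * dt
    return position, planner_calls


if __name__ == "__main__":
    final_position, calls = simulate()
    print(f"final position: {final_position:.3f}, planner calls: {calls}")
```

The point of the sketch is the rate decoupling: the expensive planning step is called once per 50 tracking steps, so its per-call compute budget is 200 ms while the tracker must finish within 4 ms.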

Description

Keywords

Multi-Agent Systems, Distributed NMPC, Reinforcement Learning, Legged Locomotion

Citation

Collections