Browsing by Author "Arcak, Murat"
- Imitation Learning with Stability and Safety Guarantees
  Yin, He; Seiler, Peter; Jin, Ming; Arcak, Murat (IEEE, 2022-01-01)
  A method is presented to learn neural network (NN) controllers with stability and safety guarantees through imitation learning (IL). Convex stability and safety conditions are derived for linear time-invariant systems with NN controllers by merging Lyapunov theory with local quadratic constraints that bound the activation functions in the NN. These conditions are incorporated into the IL process, which simultaneously minimizes the IL loss and maximizes the volume of the region of attraction associated with the NN controller. An alternating direction method of multipliers (ADMM) based algorithm is proposed to solve the IL problem. The method is illustrated on a vehicle lateral control example.
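The core idea of the abstract, learning a controller from expert data while enforcing a stability constraint, can be illustrated with a toy sketch. This is not the paper's ADMM/Lyapunov method: the linear gain `K`, the double-integrator plant, and the shrink-to-stability heuristic are all illustrative stand-ins for the NN controller and its convex stability conditions.

```python
import numpy as np

# Toy stand-in for stability-constrained imitation learning: fit a linear
# "controller" u = K x to expert demonstrations while keeping the discrete-time
# closed loop A + B K stable (spectral radius < 1). The spectral-radius check
# is a crude surrogate for the paper's Lyapunov/LMI conditions.
rng = np.random.default_rng(0)
A = np.array([[1.0, 0.1], [0.0, 1.0]])   # double-integrator plant (hypothetical)
B = np.array([[0.0], [0.1]])
K_expert = np.array([[-1.0, -1.5]])      # expert's stabilizing gain

X = rng.normal(size=(200, 2))            # sampled states
U = X @ K_expert.T                       # expert actions at those states

K = np.zeros((1, 2))
lr = 0.05
for _ in range(500):
    # gradient step on the imitation loss ||X K^T - U||^2 / n
    grad = 2 * (X @ K.T - U).T @ X / len(X)
    K = K - lr * grad
    # heuristic "projection": shrink K until the closed loop is stable again
    for _ in range(50):
        if max(abs(np.linalg.eigvals(A + B @ K))) < 1.0:
            break
        K = 0.9 * K

rho = max(abs(np.linalg.eigvals(A + B @ K)))
```

On this noiseless data the imitation loss alone recovers the expert gain, so the stability heuristic is rarely active; the paper's contribution is making the analogous constraint convex and enforceable for NN controllers.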
- Recurrent Neural Network Controllers Synthesis with Stability Guarantees for Partially Observed Systems
  Gu, Fangda; Yin, He; El Ghaoui, Laurent; Arcak, Murat; Seiler, Peter; Jin, Ming (2022)
  Neural network controllers have become popular in control tasks thanks to their flexibility and expressivity. Stability is a crucial property for safety-critical dynamical systems, and stabilizing partially observed systems in many cases requires controllers that retain and process long-term memories of the past. We consider the important class of recurrent neural networks (RNNs) as dynamic controllers for nonlinear, uncertain, partially observed systems, and derive convex stability conditions based on integral quadratic constraints, the S-lemma, and sequential convexification. To ensure stability during the learning and control process, we propose a projected policy gradient method that iteratively enforces the stability conditions in the reparametrized space, taking advantage of mild additional information on the system dynamics. Numerical experiments show that our method learns stabilizing controllers while using fewer samples and achieving higher final performance than policy gradient.
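The projected policy gradient idea in the abstract can be sketched in miniature: after each gradient step, project the controller parameters back onto a convex stability set. Here the set is a spectral-norm ball on the RNN's recurrent matrix, a simple sufficient condition for internal stability used only as a stand-in for the paper's IQC-based conditions; the update itself is random noise in place of an actual policy gradient.

```python
import numpy as np

def project_spectral(W, bound=0.95):
    """Project W onto the spectral-norm ball ||W||_2 <= bound by clipping
    its singular values (the Euclidean projection for this set)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ np.diag(np.minimum(s, bound)) @ Vt

rng = np.random.default_rng(1)
W = rng.normal(size=(4, 4))              # recurrent weights (hypothetical)
for _ in range(10):
    W = W + 0.1 * rng.normal(size=(4, 4))  # stand-in for a policy-gradient step
    W = project_spectral(W)                # enforce the stability set each step
```

The key design point mirrors the abstract: the constraint is enforced at every iteration, so the controller is stabilizing throughout learning, not just at convergence.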