Advancing Machine Learning for Resilient and Safe Critical Systems: From Theory to Applications

Date

2026-06-11

Publisher

Virginia Tech

Abstract

Machine learning is increasingly used in critical systems such as power grids, cyber-physical infrastructure, and autonomous decision-making pipelines, where accuracy alone is insufficient and learned models must also satisfy safety, robustness, adaptability, and resilience requirements. This dissertation advances machine learning methods for resilient and safe critical systems through a combination of theoretical analysis, algorithmic development, and application-driven validation. We first study the problem of training neural networks that certifiably satisfy prescribed input-output specifications, such as reachability and Lipschitz robustness. We represent activation nonlinearities and specifications through quadratic constraints and apply a loop transformation to obtain a convex inner approximation of the admissible parameter set, which is solved via semidefinite programming. To enable large-scale certification, we further introduce a randomized sketching relaxation that reduces memory and runtime by orders of magnitude while preserving tight bounds on benchmarks up to ImageNet. To improve sample efficiency, we analyze Sobolev training in the reproducing kernel Hilbert space framework, deriving learning rates and identifying how the benefit of gradient data depends on the smoothness of the target function and the available sample size. For cyber-physical resilience, we propose a bilevel intrusion detection framework for distributed energy resource management systems that is robust to both training-time poisoning and deployment-time evasion attacks. For operational resilience, we introduce a meta-guided gradient-free reinforcement learning framework for critical load restoration in power distribution systems, enabling rapid adaptation to unseen outage scenarios under renewable uncertainty while improving restoration speed and reliability. Finally, we study robust transform discovery under stochastic reinforcement learning evaluations and propose test-time training with delta-to-parent evaluation to improve large-language-model-guided program search under noisy sensor conditions. Together, these contributions provide a unified set of tools for designing machine learning systems that are not only predictive, but also certifiable, data-efficient, adversarially robust, adaptive, and deployable in high-stakes critical infrastructure environments.

Description

Keywords

Machine Learning, Reinforcement Learning, Meta-RL, Large Language Model, Power Systems

Citation