Robust and Data-Driven Uncertainty Quantification Methods as Real-Time Decision Support in Data-Driven Models
Abstract
The growing complexity of modern engineering and physical systems, and the volume of data they generate, call for robust frameworks for real-time decision-making. Data-driven models trained on observational data enable faster predictions, but they face key challenges (data corruption, bias, limited interpretability, and misrepresented uncertainty) that can compromise their reliability. Propagating uncertainty from sources such as model parameters and input features is therefore crucial for trustworthy predictions and informed decisions. Uncertainty quantification (UQ) methods fall broadly into two categories: surrogate-based models, which approximate simulators for speed and efficiency, and probabilistic approaches, such as Bayesian models and Gaussian processes, which build uncertainty directly into their predictions. For real-time UQ, characterizing uncertainty from recent data rather than historical records is more accurate and efficient, which makes the approach inherently data-driven. In dynamical analysis, the Koopman operator represents nonlinear system dynamics as a linear system acting on lifted functions of the state, which enables data-driven estimation through its applied form. Its spectral properties (eigenvalues, eigenfunctions, and modes) reveal key insights into system dynamics and simplify control design. However, inherent measurement uncertainty makes efficient estimation with the dynamic mode decomposition and extended dynamic mode decomposition algorithms challenging. This dissertation develops a statistical framework to propagate measurement uncertainties to the elements of the Koopman operator. It also develops robust estimation of model parameters in Gaussian process settings, where the observational data are often corrupted. The proposed approaches adapt to evolving data and are process-agnostic: they avoid reliance on predefined source distributions.
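As a minimal illustration of the lifting idea mentioned in the abstract, the sketch below applies a plain least-squares extended dynamic mode decomposition to a toy nonlinear map. It is not the dissertation's method: the map `step`, the dictionary `lift`, the parameters `a` and `b`, and the noise level `sigma` are all assumptions chosen so that the exact Koopman matrix is known. The example only shows that a nonlinear system can become linear in lifted coordinates, and that measurement noise perturbs the estimated operator.

```python
import numpy as np

A_COEF, B_COEF = 0.9, 0.5  # hypothetical parameters of the toy system


def step(X, a=A_COEF, b=B_COEF):
    """Advance the toy nonlinear map one step; X has shape (2, n), one state per column."""
    x1, x2 = X
    return np.vstack([a * x1, b * x2 + (a**2 - b) * x1**2])


def lift(X):
    """Assumed dictionary of observables [x1, x2, x1^2], evaluated column-wise."""
    return np.vstack([X[0], X[1], X[0] ** 2])


def edmd(X, Y):
    """Least-squares matrix K such that lift(Y) is approximately K @ lift(X)."""
    return lift(Y) @ np.linalg.pinv(lift(X))


# In the lifted coordinates the toy map is exactly linear, so K is known in closed form.
K_true = np.array([
    [A_COEF, 0.0, 0.0],
    [0.0, B_COEF, A_COEF**2 - B_COEF],
    [0.0, 0.0, A_COEF**2],
])

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(2, 200))  # sampled states
Y = step(X)                                # their one-step images

# Noise-free snapshots: the estimate matches K_true up to round-off.
K_clean = edmd(X, Y)
eigs_clean = np.sort(np.linalg.eigvals(K_clean).real)

# Noisy snapshots: the same estimator now returns a perturbed operator.
sigma = 0.05  # assumed measurement-noise standard deviation
X_noisy = X + sigma * rng.standard_normal(X.shape)
Y_noisy = Y + sigma * rng.standard_normal(Y.shape)
K_noisy = edmd(X_noisy, Y_noisy)

err_clean = np.linalg.norm(K_clean - K_true)
err_noisy = np.linalg.norm(K_noisy - K_true)
print("eigenvalues (noise-free):", eigs_clean)
print("operator error, clean vs noisy:", err_clean, err_noisy)
```

With noise-free snapshots the recovered eigenvalues coincide with `b`, `a**2`, and `a`. Once noise is added, the entries of the estimated matrix shift, and that shift is the kind of uncertainty a statistical framework would aim to quantify.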