Statistical Evaluation of Deep Learning for Event Detection in Time Series: Quantifying Uncertainty, Efficiency, and Adaptation with Applications to Seismic Data
Abstract
Rapid developments in deep learning have led to its widespread use in domains that rely on time series, largely because of its strong performance and flexibility. Yet evaluation practices have not kept pace. Deep learning models are often assessed using a few performance metrics computed on benchmark datasets, an approach that ignores important questions about how predictive performance varies with data availability, how uncertainty is communicated in both predictions and aggregate metrics, and how shifting data distributions impact model reliability.
Presented as three studies, this dissertation develops principled statistical approaches for deep learning model evaluation that address these challenges in the context of time-series-based scientific problems. The first study introduces an evaluation framework for seismic deep learning models in which I assess learning efficiency while mitigating data leakage and quantify benchmark uncertainty by attributing variation to both training stochasticity and data sampling through an expansive design of experiments. The second study compares meta-learning techniques across data regimes and analyzes how consistently they perform under data shift. As part of this study, I contribute SeisTask, a semi-synthetic benchmark dataset with controlled, physically meaningful sources of shift for future studies of adaptive learning approaches. The third study provides an empirical comparison of meta-learning and hierarchical Bayesian modeling and highlights their theoretical connection. I compare these methods in terms of interpretability, performance under shift, and predictive uncertainty.
In combination, these studies offer statistically grounded evaluations of deep learning models for event detection in time series and show how uncertainty, data requirements, and distributional shift influence model behavior in physical science applications.