Toward Robust and Generalizable Spatiotemporal Modeling for Tasks beyond Forecasting and Classification

Sun, Yanshen

Files

Sun_Y_D_2025.pdf (8.44 MB)

Date

2025-07-01

Authors

Sun, Yanshen

Publisher

Virginia Tech

Abstract

In spatiotemporal data mining, building models that are robust and generalizable across complex, non-ideal conditions is crucial for real-world deployment. While many existing methods perform well on benchmark datasets, they often assume clean, stationary, and uniformly sampled data, limiting their effectiveness in practical scenarios. In diverse operational domains such as traffic systems, neurophysiological monitoring, and drilling operations, spatiotemporal data is often noisy, irregular, and subject to distribution shifts — exposing the brittleness of conventional forecasting and classification pipelines.

This dissertation advances spatiotemporal modeling by addressing three critical challenges that frequently arise in real-world applications but fall outside the scope of traditional forecasting and classification: anomaly detection, domain adaptation, and causal discovery. It systematically examines these issues across three cross-disciplinary application domains and proposes targeted, scenario-specific solutions: (1) Anomaly Detection: We develop spatiotemporal anomaly detection frameworks to identify and mitigate irregularities in both traffic forecasting and EEG signal classification tasks. (2) Domain Adaptation: We design and evaluate domain adaptation strategies that enable robust cross-patient EEG classification, addressing inter-subject variability and enhancing generalization across diverse patient data. (3) Causal Discovery: We integrate causal discovery techniques into drilling fluid loss prediction workflows to uncover latent causal relationships, thereby enhancing the extrapolation capabilities of fine-tuned time series foundation models in previously unseen scenarios.

Anomaly detection techniques are applied across three distinct tasks: detecting abnormal traffic sensor measurements, predicting the impact of traffic incidents, and classifying EEG signals for depression diagnosis. In the context of traffic sensor anomaly detection, key challenges include (1) modeling spatiotemporal dependencies to capture irregular patterns, (2) distinguishing implicit anomalies from regular fluctuations, and (3) maintaining robustness in the absence of reliable reference data. To address these issues, we propose S-DKFN, an unsupervised model that fuses spatial and temporal features to uncover complex anomaly patterns. It combines dilated temporal convolutional networks (TCNs) with an encoder-decoder structure for multiscale representation learning, and it applies Kalman filtering principles to model fusion to improve robustness and accuracy.
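To make the idea of a dilated temporal convolution concrete, the following is a minimal sketch in plain Python, not the S-DKFN implementation. The function names, the zero-padding of missing past values, and the fixed kernels are illustrative assumptions; a real TCN layer would learn its kernel weights and operate on multichannel tensors.

```python
from typing import List, Sequence


def dilated_causal_conv1d(
    x: Sequence[float], kernel: Sequence[float], dilation: int
) -> List[float]:
    """Causal 1D convolution with dilation; missing past values count as 0.

    Hypothetical helper for illustration only.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation  # look back k * dilation steps
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Number of input steps that can influence one output of the stack."""
    return 1 + (kernel_size - 1) * sum(dilations)


# A unit impulse at t = 3 passed through two stacked layers (dilations 1 and 2).
signal = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
layer1 = dilated_causal_conv1d(signal, kernel=[1.0, 1.0], dilation=1)
layer2 = dilated_causal_conv1d(layer1, kernel=[1.0, 1.0], dilation=2)

print(layer2)                       # the impulse spreads over 4 output steps
print(receptive_field(2, [1, 2]))   # 4
```

Stacking layers with growing dilation widens the receptive field quickly while each layer keeps a small kernel, which is the usual motivation for dilated TCNs on long sensor histories.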

Traffic incident impact (TII) prediction also presents significant modeling challenges, particularly due to the dynamic nature of real-world traffic networks. Prior studies have often (1) overlooked systematic quantification of spatiotemporal TII and suffered from a lack of open benchmark datasets, (2) struggled to adapt attention mechanisms to capture interactions over time-varying road networks, and (3) failed to identify task-relevant substructures in space and time. To overcome these limitations, we first provide a formal quantification of TII and release two curated open-source datasets. We then propose two novel models: the RAS-Transformer, designed to locate affected sub-graphs, and the IST-Transformer, which leverages importance-score-based adversarial training to focus attention on sensors most impacted by incidents.

EEG signal classification for depression diagnosis poses its own set of difficulties. Existing methods typically (1) struggle to extract meaningful patterns from noisy and non-stationary EEG signals, (2) rely heavily on manual preprocessing and handcrafted features, and (3) fall short in capturing the spatial and temporal dependencies intrinsic to neural activity. In response, we propose a novel spatiotemporal deep learning model tailored for depression-related EEG analysis. The architecture integrates multiple trainable denoising modules within an end-to-end pipeline, reducing the need for manual intervention and enabling the automatic extraction of robust neural biomarkers. This design improves classification performance while enhancing interpretability and adaptability across subjects.

Domain adaptation techniques improve the effectiveness of spatiotemporal neural networks in EEG-based depression and epilepsy detection. For EEG-based depression detection, prior research (1) relies on extensive manual feature engineering to handle noise, (2) models the spatial and temporal dynamics of brain activity inadequately, and (3) has difficulty adapting models to unseen patients. To address these challenges, we propose LAK-DSGCN (Lightweight Adjusted Kalman-aided Dual-Stream Graph Convolutional Networks), a novel spatiotemporal framework that (1) decomposes EEG signals into separate spatial and temporal components, (2) processes them using a gated TCN for temporal feature extraction and a GCN for spatial representation, and (3) fuses the learned representations using a lightweight Adjusted Kalman filter. Additionally, we incorporate a normalization term designed for the Kalman filter to enhance the model's generalizability across different patients.
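The abstract does not spell out the Adjusted Kalman filter, so the following is only a sketch of the textbook scalar Kalman update that such fusion schemes build on. It is not the LAK-DSGCN fusion layer: the function name, the scalar setting, and the example means and variances are assumptions made for illustration.

```python
from typing import Tuple


def kalman_fuse(
    mean_a: float, var_a: float, mean_b: float, var_b: float
) -> Tuple[float, float]:
    """Fuse two noisy estimates of one quantity with a scalar Kalman update.

    Hypothetical helper; variances are assumed to be strictly positive.
    """
    gain = var_a / (var_a + var_b)           # trust b more when a is noisier
    fused_mean = mean_a + gain * (mean_b - mean_a)
    fused_var = (1.0 - gain) * var_a         # never larger than either input
    return fused_mean, fused_var


# Illustrative numbers: one estimate from a temporal stream, one from a
# spatial stream, each given as (mean, variance).
temporal_estimate = (2.0, 1.0)
spatial_estimate = (4.0, 3.0)
fused_mean, fused_var = kalman_fuse(*temporal_estimate, *spatial_estimate)

print(fused_mean, fused_var)  # 2.5 0.75
```

The fused mean lands closer to the lower-variance estimate, and the fused variance is smaller than both inputs, which is the property that makes this kind of update attractive for combining two feature streams.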

For EEG-based epilepsy detection, existing approaches face three main limitations: (1) a reliance on high-quality, fixed-format EEG signals that do not account for real-world inconsistencies; (2) the inability to effectively handle irregular sampling rates, missing data, and noisy signals; and (3) a lack of robust feature-learning techniques to extract stable neural representations across patients. To address these issues, we introduce CPEDNet (Cross-Patient Epilepsy Diagnosis Network), which (1) employs a latent Neural Ordinary Differential Equation (NODE) module to enhance EEG signals by mitigating irregular sampling and missing data, (2) transforms EEG signals into brain network flow representations, capturing spatial-temporal dynamics, and (3) integrates a score-based self-supervised learning strategy to improve feature stability and cross-patient generalization.
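One reason continuous-time latent models suit irregularly sampled signals is that the latent state can be integrated to any timestamp rather than to a fixed grid. The sketch below illustrates only that property. It is not CPEDNet: a fixed decay function stands in for the learned dynamics network of a Neural ODE, the solver is plain explicit Euler, and all names and constants are assumptions.

```python
import math
from typing import Callable, List, Sequence


def euler_integrate(
    f: Callable[[float], float], z0: float, t0: float, t1: float,
    max_step: float = 0.01,
) -> float:
    """Integrate dz/dt = f(z) from t0 to t1 with explicit Euler steps."""
    n_steps = max(1, math.ceil((t1 - t0) / max_step))
    h = (t1 - t0) / n_steps
    z = z0
    for _ in range(n_steps):
        z = z + h * f(z)
    return z


def latent_trajectory(
    f: Callable[[float], float], z0: float, timestamps: Sequence[float]
) -> List[float]:
    """Evaluate the latent state at arbitrary, increasing timestamps."""
    states = [z0]
    for t_prev, t_next in zip(timestamps, timestamps[1:]):
        states.append(euler_integrate(f, states[-1], t_prev, t_next))
    return states


def decay(z: float) -> float:
    """Stand-in dynamics dz/dt = -0.5 * z (a learned network in practice)."""
    return -0.5 * z


# Unevenly spaced observation times, as with dropped or jittered samples.
irregular_times = [0.0, 0.1, 0.7, 0.75, 2.0]
states = latent_trajectory(decay, 1.0, irregular_times)

for t, z in zip(irregular_times, states):
    print(f"t={t:.2f}  z={z:.4f}  exact={math.exp(-0.5 * t):.4f}")
```

Because the gaps between timestamps only change how far the solver integrates, the same model handles dense and sparse stretches of a recording without resampling.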

Causal discovery techniques are applied to the task of drilling fluid loss prediction, which presents several unique challenges: (1) Data scarcity -- Due to the high cost of drilling operations, available datasets are often limited in size, increasing the risk of overfitting in causal models. (2) Complex causal structure -- Identifying robust causal relationships among covariates is difficult, yet essential for enabling generalization to unseen counterfactual scenarios based on causal reasoning. (3) Covariate influence -- It is nontrivial to ensure that different types of covariates influence the predicted fluid loss distribution in a causally consistent manner. To address these challenges, a causal discovery plug-in module is proposed for integration with Time Series Foundation Models (TSFMs). Specifically, the design provides three major contributions. (1) Frozen TSFM backbone -- The pretrained TSFM's parameters are frozen during fine-tuning to preserve general spatiotemporal representations and mitigate overfitting on small drilling datasets. (2) Causal rule integration -- Causal discovery techniques are used to identify and incorporate structured relationships between specific covariate subsets and the target variable, guiding prediction under counterfactual conditions. (3) Contrastive pretraining -- The plug-in module is pretrained using contrastive learning to ensure it learns discriminative latent representations conditioned on varying covariate configurations.
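The abstract does not name the contrastive objective used to pretrain the plug-in module. As an illustration of the general idea, the sketch below computes an InfoNCE-style loss, a common choice, on toy two-dimensional embeddings. The function names, the temperature, and the vectors are assumptions, and the code is not taken from the dissertation.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(
    anchor: Sequence[float],
    positive: Sequence[float],
    negatives: List[Sequence[float]],
    temperature: float = 0.1,
) -> float:
    """InfoNCE loss: negative log-softmax score of the positive pair."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]


# Toy embeddings: the positive shares the anchor's (hypothetical) covariate
# configuration, the negatives come from different configurations.
anchor = [1.0, 0.0]
positive = [0.9, 0.1]
negatives = [[0.0, 1.0], [-1.0, 0.0]]

loss_aligned = info_nce(anchor, positive, negatives)
loss_misaligned = info_nce(anchor, negatives[0], [positive, negatives[1]])
print(loss_aligned, loss_misaligned)
```

The loss is small when the anchor is closest to its positive and grows when a negative is more similar, which pushes embeddings of different covariate configurations apart in the latent space.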

In summary, this dissertation advances the field of spatiotemporal data mining by addressing three core challenges—anomaly detection, domain adaptation, and causal discovery—that frequently hinder the robustness and generalization of existing models in real-world, cross-disciplinary scenarios. We evaluate our proposed models on multiple real-world datasets, demonstrating significant improvements over existing state-of-the-art approaches in all settings. Together, these contributions push beyond the boundaries of conventional forecasting and classification tasks, demonstrating how task-specific adaptations and causal reasoning can greatly expand the applicability of spatiotemporal models in challenging, real-world environments.

Keywords

Spatiotemporal Data Mining, Time Series Forecasting, Time Series Classification, Anomaly Detection, Domain Adaptation, Causal Discovery, Causal Inference

Persistent link

https://hdl.handle.net/10919/135746

Collections

Doctoral Dissertations
