Toward Robust and Generalizable Spatiotemporal Modeling for Tasks beyond Forecasting and Classification

Sun, Yanshen

dc.contributor.author	Sun, Yanshen	en
dc.contributor.committeechair	Lu, Chang Tien	en
dc.contributor.committeemember	Ramakrishnan, Narendran	en
dc.contributor.committeemember	Fu, Kaiqun	en
dc.contributor.committeemember	Zhang, Liqing	en
dc.contributor.committeemember	Reddy, Chandan K.	en
dc.contributor.department	Computer Science & Applications	en
dc.date.accessioned	2025-07-02T08:00:19Z	en
dc.date.available	2025-07-02T08:00:19Z	en
dc.date.issued	2025-07-01	en
dc.description.abstract	In spatiotemporal data mining, building models that are robust and generalizable across complex, non-ideal conditions is crucial for real-world deployment. While many existing methods perform well on benchmark datasets, they often assume clean, stationary, and uniformly sampled data, limiting their effectiveness in practical scenarios. In diverse operational domains such as traffic systems, neurophysiological monitoring, and drilling operations, spatiotemporal data is often noisy, irregular, and subject to distribution shifts, exposing the brittleness of conventional forecasting and classification pipelines. This dissertation advances spatiotemporal modeling by addressing three critical challenges that frequently arise in real-world applications but fall outside the scope of traditional forecasting and classification: anomaly detection, domain adaptation, and causal discovery. It systematically examines these issues across three cross-disciplinary application domains and proposes targeted, scenario-specific solutions: (1) Anomaly Detection: We develop spatiotemporal anomaly detection frameworks to identify and mitigate irregularities in both traffic forecasting and EEG signal classification tasks. (2) Domain Adaptation: We design and evaluate domain adaptation strategies that enable robust cross-patient EEG classification, addressing inter-subject variability and enhancing generalization across diverse patient data. (3) Causal Discovery: We integrate causal discovery techniques into drilling fluid loss prediction workflows to uncover latent causal relationships, thereby enhancing the extrapolation capabilities of fine-tuned time series foundation models in previously unseen scenarios. Anomaly detection techniques are applied across three distinct tasks: detecting abnormal traffic sensor measurements, predicting the impact of traffic incidents, and classifying EEG signals for depression diagnosis.
In the context of traffic sensor anomaly detection, key challenges include (1) modeling spatiotemporal dependencies to capture irregular patterns, (2) distinguishing implicit anomalies from regular fluctuations, and (3) maintaining robustness in the absence of reliable reference data. To address these issues, we propose S-DKFN, an unsupervised model that fuses spatial and temporal features to uncover complex anomaly patterns. It incorporates dilated temporal convolutional networks (TCNs), an encoder-decoder structure for multiscale representation learning, and leverages Kalman filtering principles for model fusion to improve robustness and accuracy. Traffic incident impact (TII) prediction also presents significant modeling challenges, particularly due to the dynamic nature of real-world traffic networks. Prior studies have often (1) overlooked systematic quantification of spatiotemporal TII and suffered from a lack of open benchmark datasets, (2) struggled to adapt attention mechanisms to capture interactions over time-varying road networks, and (3) failed to identify task-relevant substructures in space and time. To overcome these limitations, we first provide a formal quantification of TII and release two curated open-source datasets. We then propose two novel models: the RAS-Transformer, designed to locate affected sub-graphs, and the IST-Transformer, which leverages importance-score-based adversarial training to focus attention on sensors most impacted by incidents. EEG signal classification for depression diagnosis poses its own set of difficulties. Existing methods typically (1) struggle to extract meaningful patterns from noisy and non-stationary EEG signals, (2) rely heavily on manual preprocessing and handcrafted features, and (3) fall short in capturing the spatial and temporal dependencies intrinsic to neural activity. In response, we propose a novel spatiotemporal deep learning model tailored for depression-related EEG analysis. 
The architecture integrates multiple trainable denoising modules within an end-to-end pipeline, reducing the need for manual intervention and enabling the automatic extraction of robust neural biomarkers. This design improves classification performance while enhancing interpretability and adaptability across subjects. Domain adaptation techniques improve the effectiveness of spatiotemporal neural networks in EEG-based depression and epilepsy detection. For EEG-based depression detection, prior research struggles with (1) the reliance on extensive manual feature engineering to handle noise, (2) inadequate modeling of the spatial and temporal dynamics of brain activity, and (3) difficulty in adapting models to unseen patients. To address these challenges, we propose LAK-DSGCN (Lightweight Adjusted Kalman-aided Dual-Stream Graph Convolutional Networks), a novel spatiotemporal framework that (1) decomposes EEG signals into separate spatial and temporal components, (2) processes them using a gated TCN for temporal feature extraction and a GCN for spatial representation, and (3) fuses the learned representations using a lightweight Adjusted Kalman filter. Additionally, we incorporate a normalization term designed for the Kalman filter to enhance the model's generalizability across different patients. For EEG-based epilepsy detection, existing approaches face three main limitations: (1) a reliance on high-quality, fixed-format EEG signals that do not account for real-world inconsistencies; (2) the inability to effectively handle irregular sampling rates, missing data, and noisy signals; and (3) a lack of robust feature-learning techniques to extract stable neural representations across patients. 
To address these issues, we introduce CPEDNet (Cross-Patient Epilepsy Diagnosis Network), which (1) employs a latent Neural Ordinary Differential Equation (NODE) module to enhance EEG signals by mitigating irregular sampling and missing data, (2) transforms EEG signals into brain network flow representations, capturing spatial-temporal dynamics, and (3) integrates a score-based self-supervised learning strategy to improve feature stability and cross-patient generalization. Causal discovery techniques are applied to the task of drilling fluid loss prediction, which presents several unique challenges: (1) Data scarcity -- Due to the high cost of drilling operations, available datasets are often limited in size, increasing the risk of overfitting in causal models. (2) Complex causal structure -- Identifying robust causal relationships among covariates is difficult, yet essential for enabling generalization to unseen counterfactual scenarios based on causal reasoning. (3) Covariate influence -- It is nontrivial to ensure that different types of covariates influence the predicted fluid loss distribution in a causally consistent manner. To address these challenges, a causal discovery plug-in module is proposed for integration with Time Series Foundation Models (TSFMs). Specifically, the design provides three major contributions. (1) Frozen TSFM backbone -- The pretrained TSFM's parameters are frozen during fine-tuning to preserve general spatiotemporal representations and mitigate overfitting on small drilling datasets. (2) Causal rule integration -- Causal discovery techniques are used to identify and incorporate structured relationships between specific covariate subsets and the target variable, guiding prediction under counterfactual conditions. (3) Contrastive pretraining -- The plug-in module is pretrained using contrastive learning to ensure it learns discriminative latent representations conditioned on varying covariate configurations. 
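As a rough illustration of the contrastive pretraining step described above, the sketch below computes an InfoNCE-style loss with NumPy. The abstract does not say which contrastive objective is used, so InfoNCE is an assumption made here, and the function name `info_nce_loss`, its arguments, and the temperature value are all hypothetical. It assumes each row of `anchors` and the matching row of `positives` are latent representations that share a covariate configuration, while the other rows in the batch act as negatives.

```python
import numpy as np


def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE loss for a batch of paired embeddings (illustrative only).

    Row i of `positives` is treated as the positive for row i of `anchors`;
    every other row in the batch serves as a negative.
    """
    # L2-normalise so the dot product below is a cosine similarity.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)

    logits = a @ p.T / temperature                 # shape: (batch, batch)
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability

    # Row-wise log-softmax; the diagonal holds the positive pairs.
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))


# Hypothetical usage with random embeddings (no real drilling data involved).
rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
aligned_loss = info_nce_loss(z, z + 0.01 * rng.normal(size=z.shape))
mismatched_loss = info_nce_loss(z, z[::-1].copy())
```

In this toy setup the loss is lower when each anchor is paired with a near-copy of itself than when the pairs are mismatched, which is the behavior a contrastive objective is meant to reward.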
In summary, this dissertation advances the field of spatiotemporal data mining by addressing three core challenges—anomaly detection, domain adaptation, and causal discovery—that frequently hinder the robustness and generalization of existing models in real-world, cross-disciplinary scenarios. We evaluate our proposed models on multiple real-world datasets, demonstrating significant improvements over existing state-of-the-art approaches in all settings. Together, these contributions push beyond the boundaries of conventional forecasting and classification tasks, demonstrating how task-specific adaptations and causal reasoning can greatly expand the applicability of spatiotemporal models in challenging, real-world environments.	en
dc.description.abstractgeneral	In our everyday world, data often come from many places at once—think of traffic sensors spread across a city, medical monitors recording brain activity, or instruments tracking what happens deep underground during drilling. This kind of "spatiotemporal" data (information that varies over both space and time) is messy in practice: sensors can fail, measurements can be uneven, and conditions can change suddenly. Yet, making sense of these complex, real-world data streams is essential if we want to build systems that work reliably outside of carefully controlled lab settings. This dissertation tackles three main obstacles that commonly arise when working with messy spatiotemporal data but are usually overlooked by standard forecasting or classification methods: (1) Finding Unusual Patterns (Anomaly Detection): Real-life systems sometimes behave in unexpected ways—traffic jams caused by accidents, or sudden irregularities in brain signals during a mental health assessment. Detecting these anomalies early is crucial. We develop new strategies for flagging odd behavior in both traffic networks (for example, spotting when a normally free-flowing road suddenly starts backing up) and in brainwave recordings related to depression. Our methods learn to capture the natural rhythms and hidden connections in the data, so they can spot when something truly out of the ordinary happens, even if it isn't obvious at first glance. (2) Adapting Across Different Conditions (Domain Adaptation): A solution that works well for one person's brainwave recordings or one set of EEG devices may fail when applied to another person or another machine. To make models more flexible, we introduce lightweight approaches that automatically adjust to new conditions—whether it's a different patient's brain activity or slightly different sensor setups. 
In practice, this means fewer manual tweaks, more reliable performance when data come from sources that weren't part of our original tests, and, ultimately, tools that doctors or engineers can trust in a wider range of scenarios. (3) Discovering Cause-and-Effect Relationships (Causal Discovery): Beyond just predicting "what might happen next," it's often vital to know "why" something happens—especially in high-stakes operations like drilling. For instance, if fluid unexpectedly seeps into a well, knowing which factors truly cause that leak (rather than just correlating with it) can guide safer, more cost-effective decisions. We develop a plug‐in module that learns these hidden cause‐and‐effect links and works alongside advanced time‐series models. By keeping the core model's parameters fixed (to prevent over‐fitting on the limited drilling data we have) and carefully integrating causal rules, our approach can make more reliable predictions even when conditions change or when we want to ask "What if we adjust this valve?" in a counterfactual sense. Across all three of these themes, we test our ideas on real-world datasets—from urban traffic sensors and clinical EEG recordings to actual drilling records—and show clear improvements over existing methods. By focusing on anomaly detection, domain adaptation, and causal reasoning, this work aims to make spatiotemporal models not just accurate in neat, textbook examples, but robust and trustworthy in the complex, ever-changing environments where they are most needed.	en
dc.description.degree	Doctor of Philosophy	en
dc.format.medium	ETD	en
dc.identifier.other	vt_gsexam:43966	en
dc.identifier.uri	https://hdl.handle.net/10919/135746	en
dc.language.iso	en	en
dc.publisher	Virginia Tech	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.subject	Spatiotemporal Data Mining	en
dc.subject	Time Series Forecasting	en
dc.subject	Time Series Classification	en
dc.subject	Anomaly Detection	en
dc.subject	Domain Adaptation	en
dc.subject	Causal Discovery	en
dc.subject	Causal Inference	en
dc.title	Toward Robust and Generalizable Spatiotemporal Modeling for Tasks beyond Forecasting and Classification	en
dc.type	Dissertation	en
thesis.degree.discipline	Computer Science & Applications	en
thesis.degree.grantor	Virginia Polytechnic Institute and State University	en
thesis.degree.level	doctoral	en
thesis.degree.name	Doctor of Philosophy	en
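
The abstract states that S-DKFN and LAK-DSGCN fuse spatial and temporal representations using Kalman filtering principles, but this record does not give the equations. The sketch below shows only the textbook scalar Kalman update, assuming each stream yields a prediction together with a variance estimate. The function `kalman_fuse` and its arguments are hypothetical and are not taken from the dissertation.

```python
import numpy as np


def kalman_fuse(mean_a, var_a, mean_b, var_b):
    """Fuse two noisy estimates of the same quantity (illustrative only).

    Estimate A is treated as the prior and estimate B as the measurement,
    which gives the standard scalar Kalman update. Inputs may be floats or
    NumPy arrays of matching shape; variances must be positive.
    """
    mean_a, var_a = np.asarray(mean_a, dtype=float), np.asarray(var_a, dtype=float)
    mean_b, var_b = np.asarray(mean_b, dtype=float), np.asarray(var_b, dtype=float)

    gain = var_a / (var_a + var_b)                  # Kalman gain in [0, 1]
    fused_mean = mean_a + gain * (mean_b - mean_a)  # pulled toward the less noisy estimate
    fused_var = (1.0 - gain) * var_a                # never larger than either input variance
    return fused_mean, fused_var


# Hypothetical usage: a temporal-stream estimate (A) and a spatial-stream estimate (B).
fused_mean, fused_var = kalman_fuse(10.0, 1.0, 20.0, 9.0)
```

With these example numbers the gain is 0.1, so the fused value stays close to the lower-variance estimate A. A learned model could replace the fixed variances with trainable or predicted ones, which is a design choice this sketch does not attempt to reproduce.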

Files

Original bundle

Name: Sun_Y_D_2025.pdf
Size: 8.44 MB
Format: Adobe Portable Document Format

Collections

Doctoral Dissertations