Deep Learning for Enhancing Human and Environmental Health
| dc.contributor.author | Choi, Joung Min | en |
| dc.contributor.committeechair | Zhang, Liqing | en |
| dc.contributor.committeemember | Ramakrishnan, Narendran | en |
| dc.contributor.committeemember | Bhattacharya, Debswapna | en |
| dc.contributor.committeemember | Zhou, Dawei | en |
| dc.contributor.committeemember | Lourentzou, Ismini | en |
| dc.contributor.department | Computer Science & Applications | en |
| dc.date.accessioned | 2026-04-22T08:00:21Z | en |
| dc.date.available | 2026-04-22T08:00:21Z | en |
| dc.date.issued | 2026-04-21 | en |
| dc.description.abstract | Ensuring human and environmental health is a growing global priority and a fundamental challenge at the intersection of computer science, biology, and medicine. Advances in high-throughput sequencing technologies have enabled comprehensive characterization of biological systems across multiple omics layers, offering unprecedented opportunities to support precision medicine and environmental risk prevention. These data have been widely used for disease understanding, patient stratification, and monitoring of microbial communities in both clinical and environmental settings. In recent years, deep learning has emerged as an approach for modeling nonlinear relationships from high-dimensional and noisy omics data, demonstrating improved performance over traditional machine learning methods across various tasks. However, its practical application remains fundamentally constrained by key challenges arising from omics data scarcity and heterogeneity, including (1) limited availability of labeled samples, (2) batch effects across datasets, (3) the prevalence of missing values, and (4) the need for efficient and robust learning under limited data conditions. This work proposes a series of deep learning frameworks to address these challenges and enhance the practical applicability of omics-based analysis. To mitigate the scarcity of labeled data and batch effects, BCtypeFinder and CancerSubminer are presented as cancer subtyping methods that leverage both labeled and unlabeled datasets while correcting batch effects, resulting in improved robustness and generalizability. To address missing data in longitudinal studies, DeepMicroGen is developed as a generative adversarial network-based imputation framework that captures temporal dependencies and accurately reconstructs incomplete observations, thereby improving downstream predictive performance. Furthermore, to enable efficient and robust learning under limited data conditions, ARGfore is proposed as a forecasting framework for predicting antibiotic resistance gene abundances from time-series omics data, achieving improved predictive performance with reduced computational cost. Collectively, the proposed methods help to advance the applicability of deep learning in omics research by addressing fundamental omics data-related challenges. This work contributes to more robust disease characterization and improved predictive modeling and forecasting, thereby supporting the broader goals of precision medicine and environmental risk prevention. | en |
| dc.description.abstractgeneral | Ensuring human and environmental health is a growing global priority and a fundamental challenge that brings together computer science, biology, and medicine. Recent advances in sequencing technologies allow researchers to measure many types of biological information at once, often referred to as "omics" data, providing new opportunities to better understand diseases, group patients based on their conditions, and monitor microbial communities in both clinical and environmental settings. In recent years, artificial intelligence, especially deep learning that can automatically learn complex patterns from large datasets, has shown strong potential for analyzing these data. However, its practical use remains limited by several challenges, including limited labeled data (data with known outcomes), differences between datasets collected in different settings (known as batch effects), missing data, and the need for models that perform well even when only small datasets are available. This work develops a series of deep learning methods to address these challenges and improve the use of omics data in real-world applications. To handle limited labeled data and differences between datasets, BCtypeFinder and CancerSubminer are designed to better identify cancer subtypes by combining labeled and unlabeled data while reducing technical differences between studies. To address missing data in studies that track changes over time, DeepMicroGen is developed to impute missing values by learning patterns across time points, enabling more complete and accurate analyses. In addition, ARGfore is designed to predict levels of antibiotic resistance genes from time-based environmental data, providing accurate predictions while using fewer computational resources. Overall, this work helps to advance the applicability of deep learning in omics research by addressing fundamental omics data-related challenges. These advances support a better understanding of diseases, more personalized treatment strategies, and improved disease prediction and forecasting, thereby contributing to the broader goals of precision medicine and environmental risk prevention. | en |
| dc.description.degree | Doctor of Philosophy | en |
| dc.format.medium | ETD | en |
| dc.identifier.other | vt_gsexam:46295 | en |
| dc.identifier.uri | https://hdl.handle.net/10919/143022 | en |
| dc.language.iso | en | en |
| dc.publisher | Virginia Tech | en |
| dc.rights | In Copyright | en |
| dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | en |
| dc.subject | cancer subtyping | en |
| dc.subject | data imputation | en |
| dc.subject | time-series forecasting | en |
| dc.subject | deep learning | en |
| dc.title | Deep Learning for Enhancing Human and Environmental Health | en |
| dc.type | Dissertation | en |
| thesis.degree.discipline | Computer Science & Applications | en |
| thesis.degree.grantor | Virginia Polytechnic Institute and State University | en |
| thesis.degree.level | doctoral | en |
| thesis.degree.name | Doctor of Philosophy | en |