Training Physics-Guided Neural Networks with Multiple Constraints: An Application in Lake Ecology Modeling
dc.contributor.author | Pradhan, Aanish Kaustubh | en |
dc.contributor.committeechair | Karpatne, Anuj | en |
dc.contributor.committeemember | Carey, Cayelan C. | en |
dc.contributor.committeemember | Wang, Xuan | en |
dc.contributor.committeemember | Hanson, Paul C. | en |
dc.contributor.department | Computer Science & Applications | en |
dc.date.accessioned | 2025-05-24T08:01:09Z | en |
dc.date.available | 2025-05-24T08:01:09Z | en |
dc.date.issued | 2025-05-23 | en |
dc.description.abstract | Lakes and reservoirs are critical components of Earth's ecosystems but are increasingly threatened by climate change and human activity, underscoring the need for reliable tools for modeling and predicting lake ecology. While machine learning has shown potential in modeling such systems, sparse environmental data often limits the ability of machine learning models to produce physically consistent predictions or generalize to novel conditions. As a result, many existing approaches rely on computationally intensive physics-biogeochemical simulations to supplement training data. Physics-Guided Neural Networks (PGNNs) offer a promising alternative by embedding scientific knowledge directly into the model through physical constraints applied during training. However, training these models at scale remains challenging due to the trade-offs between satisfying physical laws and fitting the data, often leading to optimization pathologies. This thesis explores the challenge of designing, training, and evaluating PGNNs with up to six constraints without relying on auxiliary simulation data. We assemble a suite of physics-based constraints grounded in limnological principles and evaluate their impact on neural network predictions by assessing within-distribution and zero-shot performance. To navigate the challenge of training with multiple constraints, we explore the use of multitask learning methods to counteract gradient pathologies that arise when training PGNNs. Our results suggest that multitask learning approaches can improve in-distribution performance in certain architectures, but they do not enhance zero-shot performance compared to unconstrained models. Our findings highlight the inherent complexity of scaling PGNNs and emphasize the need for principled training methodologies in data-scarce modeling contexts. | en |
dc.description.abstractgeneral | Lakes and reservoirs play a vital role in supporting biodiversity, providing freshwater, and regulating the environment. As these ecosystems face increasing stress from climate change and human activity, it is critical to develop reliable tools for modeling lake conditions such as oxygen levels, water temperature, and algae growth. While machine learning has been shown to be a promising approach, these methods rely on large amounts of data for training. However, data from environmental systems is often sparsely available, which causes models to struggle with generating physically consistent predictions or generalizing to unseen situations. While past approaches have leveraged physics-biogeochemical models to simulate the lake ecosystem and generate more data, this approach can be computationally expensive. This thesis explores the use of physics-guided neural networks (PGNNs), a class of machine learning models that incorporate scientific knowledge to improve accuracy and realism and that excel in data-scarce situations. However, training these models can be challenging, as the model may struggle to balance obeying the physical knowledge against fitting the data. To navigate this, we leverage multitask learning methods to assist the models during training. Our results demonstrate that applying these methods may show promise in training PGNNs at scale, albeit only with certain model architectures. Furthermore, our results show that while PGNNs trained with multiple constraints may predict better on the data they are trained on, they fail to generalize to unseen data compared to models trained without physical knowledge. These findings highlight the complexity of combining physics and machine learning at scale to support lake and reservoir ecosystem modeling. | en |
dc.description.degree | Master of Science | en |
dc.format.medium | ETD | en |
dc.identifier.other | vt_gsexam:43605 | en |
dc.identifier.uri | https://hdl.handle.net/10919/134207 | en |
dc.language.iso | en | en |
dc.publisher | Virginia Tech | en |
dc.rights | Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International | en |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/ | en |
dc.subject | Constrained optimization | en |
dc.subject | Ecosystem modeling | en |
dc.subject | Multitask Learning | en |
dc.subject | Physics-guided neural networks | en |
dc.title | Training Physics-Guided Neural Networks with Multiple Constraints: An Application in Lake Ecology Modeling | en |
dc.type | Thesis | en |
thesis.degree.discipline | Computer Science & Applications | en |
thesis.degree.grantor | Virginia Polytechnic Institute and State University | en |
thesis.degree.level | masters | en |
thesis.degree.name | Master of Science | en |