Coupling Physical and Machine Learning Models with High Resolution Information Transfer and Rapid Update Frameworks for Environmental Applications
Few current modeling tools are designed to predict short-term, high-risk runoff from critical source areas (CSAs), the areas within watersheds that are significant sources of nonpoint source (NPS) pollution. This study couples the Soil and Water Assessment Tool-Variable Source Area (SWAT-VSA) model with the Climate Forecast System Reanalysis (CFSR) model and short-term weather forecasts from the Global Forecast System (GFS) model to develop a CSA prediction tool designed to assist producers, landowners, and planners in identifying high-risk areas that generate storm runoff and pollution. Short-term predictions of streamflow, runoff probability, and soil moisture levels were made for the South Fork of the Shenandoah River watershed in Virginia. To give land managers access to the CSA predictions, a web interface based on free and open source software was developed. The forecast system consists of three primary components: (1) the model, which preprocesses the necessary hydrologic forcings, runs the watershed model, and outputs spatially distributed VSA forecasts; (2) a data management structure, which converts high-resolution rasters into overlay web map tiles; and (3) the user interface component, a web page that allows the user to interact with the processed output. The resulting framework satisfied most design requirements with free and open source software and scored better than similar tools in usability metrics. One potential problem is that the CSA model, which uses physically based modeling techniques, requires significant computational time to execute and process. Thus, as an alternative, a deep learning (DL) model was developed and trained on the process-based model output. The DL model resulted in a 9% increase in predictive power compared to the physically based model and a ten-fold decrease in run time.
Additionally, DL interpretation methods applicable beyond this study are described, including hidden layer visualization and the extraction of equations that describe a quantifiable amount of the variance in hidden layer values. Finally, a large-scale analysis of soil phosphorus (P) levels was conducted in the Chesapeake Bay watershed, where several short-term forecast tools are currently located. Using Bayesian inference methodologies, 31 years of soil P history were estimated at the county scale, with the associated uncertainty for each estimate. These data will assist in the planning and implementation of short-term forecast tools with P management goals. The short-term modeling and communication tools developed in this work help fill a gap in scientific tools aimed at improving water quality by informing land managers' decisions.
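As an illustration of the data management component described above, which converts high-resolution rasters into overlay web map tiles, the following minimal sketch shows how a geographic extent could be mapped to tile indices. It assumes the widely used Web Mercator "slippy map" tiling scheme; the abstract does not state which tiling scheme or software the study used, and the function names `lonlat_to_tile` and `tiles_for_bbox` are hypothetical.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) index of the Web Mercator tile containing a point.

    Assumes the standard slippy-map scheme: 2**zoom tiles per axis,
    with x increasing eastward and y increasing southward.
    """
    n = 2 ** zoom
    lat_rad = math.radians(lat)
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that points on the eastern or southern edge stay in range.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_for_bbox(west, south, east, north, zoom):
    """List the (zoom, x, y) tiles that cover a lon/lat bounding box."""
    x_min, y_min = lonlat_to_tile(west, north, zoom)  # north-west corner
    x_max, y_max = lonlat_to_tile(east, south, zoom)  # south-east corner
    return [
        (zoom, x, y)
        for x in range(x_min, x_max + 1)
        for y in range(y_min, y_max + 1)
    ]
```

Under this scheme, a forecast raster covering a watershed would be cut into one image per `(zoom, x, y)` entry returned by `tiles_for_bbox`, so the web page only needs to request the tiles inside the current map view.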