Modeling species geographic distributions in aquatic ecosystems using a density-based clustering algorithm
Abstract
Distributional ecology is a branch of ecology that aims to reconstruct and predict the geographic range of free-living and symbiotic organisms in terrestrial and aquatic ecosystems. More recently, distributional ecology has been used to map disease transmission risk. Its application to disease transmission has, however, been erroneous in many cases, and an inaccurate representation of disease distribution is detrimental to effective control and prevention. Furthermore, ecological niche modeling experiments are generally developed and tested using data from terrestrial organisms, neglecting aquatic organisms in case studies. Both disease and aquatic systems are often data limited, and current modeling methods are often insufficient. There is, therefore, a need to develop data-driven models that perform accurately even when only limited amounts of data are available or when there is little to no knowledge of the natural history of the species to be modeled. Here, I propose a data-driven ecological niche modeling method that requires presence-only data (i.e., absence, pseudoabsence, or background records are not needed for model calibration). My method is expected to reconstruct the environmental conditions where data-limited aquatic organisms are more likely to be present, using a density-based clustering algorithm as a proxy of the realized niche (i.e., the abiotic and biotic environmental conditions occupied by the organism). Supported by ecological theories and methods, my central hypothesis is that, because density-based clustering machine-learning modeling prevents extrapolation and interpolation, it can robustly reconstruct the realized niche of a data-limited aquatic organism. First, I assembled a comprehensive dataset of abiotic (temperature) and biotic (phytoplankton) environmental conditions and presence reports for Vibrio cholerae, a well-understood aquatic bacterial species found in coastal waters globally (Chapter 2). Second, using V. cholerae as a model system, I developed detailed parameterizations of density-based clustering models to determine the parameter values with the best capacity to reconstruct and predict the species' distribution in global seawaters (Chapter 3). Finally, I compared the performance of density-based clustering modeling against traditional, correlative machine-learning ecological niche modeling methods (Chapter 4). Density-based clustering models, when assessed based on model fit and prediction, had comparable performance to the traditional 'data-hungry' machine-learning correlative methods used in modern applications of ecological niche modeling. Modeling the environmental and geographic ranges of V. cholerae, an aquatic organism of free-living and parasitic ecologies, is itself a novel approach in distributional ecology. Ecological niche modeling applications to pathogens, such as V. cholerae, provide an opportunity to further the knowledge of directly transmitted emerging diseases for which only limited data are available. Density-based clustering ecological niche modeling is termed here Marble, honoring a previous, experimental version of this analytical approach, and is expected to provide new opportunities to understand how an ecological niche modeling method influences estimates of the distribution of data-limited organisms of complex ecology. These are lessons applicable to novel, rare, and cryptic aquatic organisms, such as emerging diseases, endangered fishes, and elusive aquatic species.
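The idea of using density-based clustering on presence-only records as a proxy of the realized niche can be illustrated with a minimal sketch. It is not the Marble implementation: the abstract does not name the specific clustering algorithm, so the sketch assumes a DBSCAN-style rule, and the function names, the parameters `eps` and `min_pts`, and the environmental values are all hypothetical. A presence record counts as a core point when enough other records fall within a radius in environmental space, and a candidate site is treated as suitable only if it lies within that radius of a core point.

```python
import math

def core_points(points, eps, min_pts):
    """Return the presence records that have at least min_pts records
    (the record itself included) within distance eps in environmental space."""
    cores = []
    for p in points:
        neighbours = sum(1 for q in points if math.dist(p, q) <= eps)
        if neighbours >= min_pts:
            cores.append(p)
    return cores

def is_suitable(site, cores, eps):
    """A site is inside the niche proxy only if it lies within eps of a core point."""
    return any(math.dist(site, c) <= eps for c in cores)

# Hypothetical standardized (temperature, phytoplankton) values at presence records.
# The last record is an isolated report far from the others.
presences = [(0.0, 0.0), (0.1, 0.1), (0.2, 0.0), (0.1, -0.1), (3.0, 3.0)]

# Assumed parameter values, chosen only for illustration.
cores = core_points(presences, eps=0.3, min_pts=3)

inside = is_suitable((0.15, 0.05), cores, eps=0.3)   # within the dense group of records
between = is_suitable((1.5, 1.5), cores, eps=0.3)    # gap between records
isolated = is_suitable((3.0, 3.0), cores, eps=0.3)   # the isolated record itself
```

In this toy example the isolated record does not become a core point, so neither it nor the empty environmental space between it and the dense group is marked as suitable. This is the sense in which a density-based rule avoids projecting suitability into conditions that are not supported by observed presences; no absence or background records are used at any step.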