Methods for the spatial modeling and evalution of tree canopy cover

TR Number

Date

2022-05-24

Journal Title

Journal ISSN

Volume Title

Publisher

Virginia Tech

Abstract

Tree canopy cover is an essential measure of forest health and productivity, which is widely studied due to its relevance to many disciplines. For example, declining tree canopy cover can be an indicator of forest health, insect infestation, or disease. This dissertation consists of three studies, focused on the spatial modeling and evaluation of tree canopy cover, drawing on recent developments and best practices in the fields of remote sensing, data collection, and statistical analysis.newlinenewline The first study evaluates how well harmonic regression variables derived at the pixel-level using a time-series of all available Landsat images predict values of tree canopy cover. Harmonic regression works to approximate the reflectance curve of a given band across time. Therefore the coefficients that result from the harmonic regression model estimate relate to the phenology of the area of each pixel. We use a time-series of all available cloud-free observations in each Landsat pixel for NDVI, SWIR1 and SWIR2 bands to obtain harmonic regression coefficients for each variable and then use those coefficients to estimate tree canopy cover at two discrete points in time. This study compares models estimated using these harmonic regression coefficients to those estimated using Landsat median composite imagery, and combined models. We show that (1) harmonic regression coefficients that use a single harmonic coefficient provided the best quality models, (2) harmonic regression coefficients from Landsat-derived NDVI, SWIR1, and SWIR2 bands improve the quality of tree canopy cover models when added to the full suite of median composite variables, (3) the harmonic regression constant for the NDVI time-series is an important variable across models, and (4) there is little to no additional information in the full suite of predictors compared to the harmonic regression coefficients alone based on the information criterion provided by principal components analysis. The second study presented evaluates the use of crowdsourcing with Amazon's Mechanical Turk platform to obtain photointerpretated tree canopy cover data. We collected multiple interpretations at each plot from both crowd and expert interpreters, and sampled these data using a Monte Carlo framework to estimate a classification model predicting the "reliability" of each crowd interpretation using expert interpretations as a benchmark, and identified the most important variables in estimating this reliability. The results show low agreement between crowd and expert groups, as well as between individual experts. We found that variables related to fatigue had the most bearing on the "reliability" of crowd interpretations followed by whether the interpreter used false color or natural color composite imagery during interpretation. Recommendations for further study and future implementations of crowdsourced photointerpretation are also provided. In the final study, we explored sampling methods for the purpose of model validation. We evaluated a method of stratified random sampling with optimal allocation using measures of prediction uncertainty derived from random forest regression models by comparing the accuracy and precision of estimates from samples drawn using this method to estimates from samples drawn using other common sampling protocols using three large, simulated datasets as case studies. We further tested the effect of reduced sample sizes on one of these datasets and demonstrated a method to report the accuracy of continuous models for domains that are either regionally constrained or numerically defined based on other variables or the modeled quantity itself. We show that stratified random sampling with optimal allocation provides the most precise estimates of the mean of the reference Y and the RMSE of the population. We also demonstrate that all sampling methods provide reasonably accurate estimates on average. Additionally we show that, as sample sizes are increased with each sampling method, the precision generally increases, eventually reaching a level of convergence where gains in estimate precision from adding additional samples would be marginal.

Description

Keywords

Time-series, crowdsourcing, sampling, ensemble methods

Citation