Forest aboveground biomass mapping and estimation across multiple spatial scales using model-based inference

Article history: Received 14 April 2016 Received in revised form 29 June 2016 Accepted 16 July 2016 Available online xxxx Remotely sensed data have beenwidely used in recent years for mapping and estimating biomass. However, the characterization of the uncertainty of mapped or estimated biomass in previous studies was either based on adhoc approaches (e.g., usingmodel fitting statistics such rootmean square errors derived frompurposive samples) or mostly limited to the analysis of mean biomass for the whole study area. This study proposed a novel uncertainty analysismethod that can characterize biomass uncertainty acrossmultiple spatial scales andmultiple spatial resolutions. The uncertainty analysis method built on model-based inference and can propagate errors from trees to field plots, individual pixels, and small areas or large regions that consist of multiple pixels (up to all pixels within a study area). We developed and tested this method over northern Minnesota forest areas of approximately 69,508 km via a unique combination of several datasets for biomass mapping and estimation: wall-to-wall airborne lidar data, national forest inventory (NFI) plots, and destructive measurements of tree aboveground biomass (AGB). We found that the pixel-level AGB prediction error is dominated by lidar-based AGB model residual errors when the spatial resolution is near 380 m or finer and by model parameter estimate errorswhen the spatial resolution is coarser.We also found that the relative error of AGB predicted from lidar can be reduced to approximately 11% (ormean 5.1Mg/ha;max 43.6Mg/ha) at one-hectare scale (or at 100m spatial resolution) over our study area. Because our uncertainty analysis method uses model-based inference and does not require probability samples of field plots, our methodology has potential applications worldwide, especially over tropics and developing countries where NFI systems are not well-established. © 2016 Elsevier Inc. All rights reserved.


Introduction
Estimating and monitoring changes in forest biomass over large spatial extents is crucial for advancing our understanding of the global carbon cycle and implementing effective carbon emission mitigation strategies such as the UNFCCC program to Reduce Emissions from Deforestation and forest Degradation (REDD+) (e.g., Woodwell et al., 1983;Houghton, 2005;Gibbs et al., 2007;Schimel et al., 2015;Goetz et al., 2015). Estimation of the mean forest biomass density and its uncertainty over any spatial domain (e.g., forest plots, remote sensing pixels, forest stands, state, country, or even globe) is a topic under the umbrella of statistical inference that relies on field measurements collected over forest plots alone or, more often in recent years, in tandem with remotely sensed data. The statistical rigor of biomass inference methods is critical  for ensuring the consistency and accuracy of Monitoring, Reporting, and Verification (MRV)-based estimates of reduced forest carbon emissions relative to the reference levels (Goetz et al., 2015).

Model-based (MB) inference for uncertainty analysis
Two mainstream statistical inference frameworks exist for biomass estimation and uncertainty analysis: 1) design-based (DB) approaches that use probability samples of field plots only (e.g., Tomppo et al., 2010) or in combination with remotely sensed data (e.g., Andersen et al., 2009;McRoberts, 2010;Magnussen et al., 2014;Chirici et al., 2016), and 2) model-based (MB) or model-dependent approaches that use field plot data regardless of their sampling design and either complete coverage or just a probability sample of remotely sensed data for a study area (e.g., Hansen et al., 1978, McRoberts, 2006, 2010Ståhl et al., 2011Ståhl et al., , 2016Gobakken et al., 2012;Corona et al., 2014;Chen et al., 2015). DB inference is in some sense the gold standard and should generally be used if all possible, primarily because the DB estimators are unbiased or nearly so (McRoberts, 2010), whereas there is no such guarantee for MB estimators. However, MB inference is the focus of this study because it is the only option when probability (random) field samples cannot be acquired for remote and inaccessible regions, a typical case in the tropics and developing countries (Goetz et al., 2015). MB inference has several additional advantages over DB inference for characterizing the error and uncertainty of remotely sensed biomass products. First, MB inference can produce estimates of uncertainty for every population unit (pixel or rasterized lidar grid cell) whereas DB inference can do so only for sample units. Second, the unit level uncertainties can be aggregated at different spatial resolutions or within any geographic area for which the model is assumed to be correctly specified. Therefore, MB methods can be used to infer biomass at any spatial resolution or over any geographic area that is common to different biomass products, if a comparison of biomass estimates is needed. Third, many field plot sample units are typically required within an estimation domain for DB inference, whereas MB inference is possible even when no sample units are available and as long as the model prediction error information can be derived within the estimation domain (Magnussen, 2015). Fourth, MB inference was also found to be less prone to field plot geolocation errors . Due to these advantages, we have witnessed a rapidly growing body of studies using MB inference for estimating biomass and other forest attributes over the last decade (e.g., McRoberts, 2006;Ståhl et al., 2011;Gobakken et al., 2012;Nelson et al., 2012;Magnussen et al., 2014;Corona et al., 2014;Chen et al., 2015;Saarela et al., 2015Saarela et al., , 2016, although the development of MB inferential theory and its use for survey sampling and spatial analysis can be traced back to earlier years (e.g., Matérn, 1960;Royall, 1970Royall, , 1971Särndal et al., 1978;Hansen et al., 1983).

Need for considering multiple model-related errors at different spatial scales
Despite the great potential of MB inference, several critical challenges remain for its rigorous use in forest biomass estimation. The statistical models for predicting biomass from auxiliary remotely sensed and/or GIS data are the foundation of MB inference. Models for predicting biomass for individual population units from remotely sensed predictors are subject to three sources of error: 1) model residual errors, 2) sampling-related model parameter estimate errors (because models are developed from samples), and 3) model predictor errors such as those related to georeferencing (e.g., Wang et al., 2011). The current predominant practice with MB inference, except for a few studies (e.g., McRoberts, 2006McRoberts, , 2010Wang et al., 2011;Lu et al., 2012;Chen et al., 2015), considered only parameter estimate errors. This is partially due to the high level of sophistication, both methodologically and computationally, to 1) incorporate residual errors which are often heteroscedastic and sometimes spatially autocorrelated , and 2) propagate the remotely sensed predictor errors (Wang et al., 2011;Chen et al., 2015) within a statistical inference framework. Ståhl et al. (2016) concluded that the remote sensing model residual errors have negligible impacts on the estimation of means of forest attributes at large scales when population sizes become indefinitely large. However, residual errors could dominate at smaller scales. For example, Chen et al. (2015) found that model residuals contributed as much as 80% of the total prediction errors at the scale of pixels equivalent to field plot sizes when airborne lidar data were used to predict biomass in a western African forest. Clearly, the relative importance of different error sources depends on the scale of estimation. Given that biomass information is needed at different scales for various applications (e.g., van der Werf et al., 2010;Hatanaka et al., 2011;George et al., 2012;Paul et al., 2015), it is crucial to develop a general, cross-scale, and comprehensive methodology for MB inference that can consider different major sources of error altogether.

Importance of incorporating errors related to allometric models of tree biomass
Another complication associated with construction of forest biomass prediction models arises from the ground "truth" biomass of individual trees that are used to calibrate the remote sensing-based biomass models. In the context of remote sensing model calibration, biomass data for individual trees are not based on the destructive measurements of trees' weight but rather on allometric model predictions derived from other easily measured variables such as tree diameter at breast height (DBH) and height as predictors (e.g., Jenkins et al., 2003;Chave et al., 2014). Thus, individual tree allometric biomass models are a cornerstone of all forms of biomass inference (Clark and Kellner, 2012). For example, Chen (2015) found that a switch of allometric models can cause R 2 to change between 0.51 and 0.92 for modeling biomass in the mixed conifer forests of the Lake Tahoe basin in California using the quadratic mean of laser point heights as a predictor variable. Just like the prediction of pixel-level biomass using remote sensing-based biomass models, errors are inherent in tree biomass predictions because both the allometric models and tree-level predictors (e.g., Gertner and Dzialowy, 1984) have errors. Multiple recent studies have reported the impacts of the tree-level prediction errors associated with allometric model residuals (McRoberts and Westfall, 2014;Breidenbach et al., 2014), allometric model parameters (Ståhl et al., 2014), and measurements of tree-level predictors (Ståhl et al., 2014;Berger et al., 2014) on large-scale forest biomass and stem volume (e.g., McRoberts et al., 2015) estimates using national forest inventory plot data and DB inference. However, such errors have only recently been incorporated in MB inference of biomass . One reason for the paucity of such studies is that the characterization of tree-level biomass prediction errors requires 1) development of sophisticated methods to recover the errors related to the model residual and parameters, if the relevant statistics are not available (e.g., Magnussen and Negrete, 2015), or 2) destructive measurements of tree biomass used for allometric model development, which used to be inaccessible (Chen, 2015).
Publicly accessible datasets of tree-level biomass measurements based on destructive sampling over large geographical extents recently became available (e.g., Chave et al., 2014) and provide an unprecedented opportunity to directly assess tree-level, allometric model biomass prediction errors and the subsequent remote sensing-based biomass model prediction errors. To address the aforementioned limitations in the conventional practice of MB inference, Chen et al. (2015) leveraged such datasets over the tropics to develop a comprehensive uncertainty analysis framework for considering the errors throughout the entire workflow of AGB mapping over a western African tropical forest. The process included allometric model development, measurements of tree-level predictors, tree-level biomass prediction, plot-level biomass estimation, plot-level remote sensing-based biomass model development, remote sensing feature extraction, and, eventually, pixel-level biomass predictions using airborne lidar data. The framework was well-suited for biomass uncertainty analysis at the pixel level, a scale equivalent to the size of forest plots used for model construction. However, it is advantageous to extend such frameworks beyond the native modeling scale to larger spatial scales, as proposed in this study. We expect that, with such research efforts, substantial benefits will accrue for many applications such as i) direct inference for forest biomass of the whole or any part of a study area (e.g., individual forest stands) that consists of multiple pixels and ii) upscaling biomass estimates from one spatial resolution to another for the synergistic use of multisource and multiscale remote sensing data in large scale biomass mapping and estimation (e.g., Saatchi et al., 2011;Baccini et al., 2012;De Sy et al., 2012).

Research objectives
The study aims to develop a novel model-based inference method that can rigorously and comprehensively characterize biomass uncertainty at multiple spatial scales. This method was applied in the mapping and estimation of large-area forest aboveground biomass in northern Minnesota using 1) wall-to-wall airborne lidar data, 2) national forest inventory (NFI) plot data, and 3) destructive tree AGB measurements. Our study is unique because of the use of destructive AGB measurements for allometric modeling and the novel model-based inference method for uncertainty analysis. It is also one of the very few studies that propose to use large-scale airborne lidar and NFI data for wall-to-wall mapping and estimating biomass. The biomass and uncertainty maps will improve other forest carbon map products that use satellite data (e.g., NBCD-National Biomass and Carbon Dataset, see Kellndorfer et al., 2012) or use airborne lidar data but with less comprehensive and rigorous methods for characterizing uncertainty over our study area.

Allometric modeling dataset
We used raw destructive measurements of tree AGB to develop new allometric models for estimating the AGB of trees measured in FIA plots (see next section). Destructive measurements are the most accurate estimates of tree AGB because they usually involve 1) felling trees, 2) measuring the fresh weight of different components of the harvested trees such as bole, leaves, and branches, and 3) measuring the weight of samples of these components after they are dried in ovens. We used two such datasets: 1) the legacy tree dataset for the tree species in the United States (Radtke et al., 2015), and 2) a similar dataset for the tree species in Canada (Ung et al., 2008). These datasets were built from various sources including published research articles, written reports, as well as theses and dissertations. Facilities and individuals affiliated with the U.S. and Canadian Forest Services, universities, private companies, and various other institutions were identified as potential points of contact for acquiring such data. From these datasets, we selected the tree measurements for the species in the state of Minnesota to develop allometric models, which hereinafter are called the allometric modeling dataset. This modeling dataset includes a total of 9815 trees from 35 tree species, which represent 98.3% of the trees measured in the FIA study plots. In other words, only a small amount (1.7%) of the trees for species in the Minnesota statewide FIA database is not represented in the allometric modeling dataset. For each tree in the database, we have measurements of AGB, DBH, and height. We also extracted species-level wood density from Miles and Smith (2009).

National forest inventory plot data
To calibrate lidar-based models for biomass mapping and estimation, we used biomass estimates from field data for plots established by the Forest Inventory and Analysis (FIA) program of the USDA Forest Service (FS) which conducts the NFI of the United States. FIA established field plot centers in permanent locations using a sampling design that is regarded as producing an equal probability sample (McRoberts et al., 2005). Each FIA plot consists of four 7.32-m radius circular subplots that are configured as a central subplot and three peripheral subplots with centers located at distances of 36.58 m and azimuths of 0, 120, and 240°from the center of the central subplot. Field crews observe species and measure DBH and height for all trees with DBH of at least 12.7 cm (McRoberts and Westfall, 2014). Tree AGB was estimated using the allometric models we developed (see Section 3.2). We primarily used the central subplots (hereinafter called plots) because of the spatial autocorrelation among subplots within the same plot. In the same years of lidar data acquisition, FIA surveyed a total of 1078 field plots, among which 598 are forest plots with a total of 4894 trees measured.

Airborne lidar data
The airborne lidar data over the study area were acquired for three major regions from northeast to southwest called Arrowhead (May 3-June 2, 2011), Central Lakes (April 5-28, 2012), and Metro (April 29-May 15, 2011) ( Fig. 1(a)). The Arrowhead and Central Lakes regions were flown by one company while the Metro region by another. Due to the large geographical extent, the airborne lidar data were collected using different sensors (e.g., Leica ALS 50, 60, 70, and Optech ALTM Gemini), but their flight settings are similar with average flight height above ground level of about 2000 m, field of view of 40°, and pulse repetition rate of about 100 kHz. The average point densities of the three datasets are 1.5 point/m 2 , 1.4 point/m 2 , and 1.4 point/m 2 for Arrowhead, Central Lakes, and Metro, respectively. Ground returns have been identified for all lidar datasets. Other types of returns (e.g., building, water, noise, vegetation) have also been partially identified by the vendor and/or geospatial technicians in the Minnesota Geospatial Information Office. The vertical accuracy (in RMSE or root mean square error) of the DTM (Digital Terrain Model) products is 15 cm or less over open terrain.

Large-scale airborne lidar data processing
Because our goal is to estimate forest biomass, we need to identify the vegetation laser returns from the lidar point cloud. As mentioned earlier, the lidar points have already been classified by the data provider and/or GIS technicians in the Minnesota Geospatial Information Office, yet, with varying accuracy. Ground returns have been well classified because the main goal of the statewide lidar acquisition was to generate a high quality DTM. For buildings, their footprints were generally correct in the classification, but points from walls and complex roof structures are often misclassified as vegetation or other ( Fig. 2(a)). In addition, some tall objects such as towers and powerlines were often misclassified as vegetation ( Fig. 2(b)); numerous laser points (e.g., high points from flying birds, cloud, power plant plumes or points from a particular swath) were simply unclassified.
We developed our own algorithms to classify point clouds. The general workflow of our approach is to first generate a 1-m DTM from the ground returns identified in the point cloud and then subtract each point's DTM elevation from its Z coordinate to derive the height above ground (hereinafter called height). All returns with heights above a specified threshold (e.g., 1 m) are considered to be non-terrain points. Based on the laser point's height, we first identified noisy points that come from flying birds, clouds, power plant plumes, etc. Then, we improved the existing classification of building points by including the points from building walls and roof and developed methods to identify high-rise man-made features such as towers (Fig. 2). We have not developed an effective method for identifying powerline points, for which we relied on manpower to visually edit their class in ArcGIS with the aid of aerial photos. The remaining non-ground returns were considered to be from vegetation and used for biomass mapping and estimation. The introduction of the specific algorithm details is beyond the scope of this manuscript, but the algorithms were designed to work on raster cells instead of individual laser points so that the automatic classification took only a few minutes to process a single tile. The misclassification of laser points from high noise or tall buildings or man-made structure as vegetation can lead to unrealistically large canopy height estimates for AGB prediction. Considering that the vegetation height of the Minnesota state rarely exceeds 35 m, we visually checked all the laser points that had heights of larger than 35 m above ground and, if needed, corrected their classes.

Existing classification
Our classification a b Fig. 2. Comparison of laser point classification from existing products and from our methods. Green: vegetation; red: man-made objects; yellow: unclassified; magenta: ground returns.
(For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.) After the vegetation points had been identified in the point cloud, we used the Toolbox for Lidar Data Filtering and Forest Studies (Tiffs) (Chen, 2007) to generate various lidar metrics that characterize the height distribution of the point cloud for each FIA plot, including: mean height, quadratic mean height, standard deviation height, percentages of points at height bins at 5 m intervals, and 10%, 20%,…, 100% percentile heights . The plot-level lidar metrics were used for developing lidar-based AGB models. Similarly, we also constructed wall-to-wall maps of such lidar metrics at 13 m by 13 m spatial resolution equivalent to the FIA subplot size to predict pixellevel AGB and then estimate the mean AGB for the entire study area. See Section 3.3 for details how the lidar metrics were used for biomass modeling, mapping, and uncertainty analysis.

Uncertainty of tree-level and plot-level AGB prediction
Our uncertainty analysis framework considers all the major sources of error in the whole process of biomass mapping and estimation and propagates errors from trees to plot, pixel, and finally region levels (Fig. 3).
Previous studies have shown that the choice of tree AGB allometric models produces the largest variations in the results of biomass estimation and modeling (Chave et al., 2004;Zhao et al., 2012). Therefore, it is critical that the allometric model is properly developed and its prediction errors are rigorously assessed. This study follows Chen et al. (2015) with some revisions to develop the allometric models for predicting tree-level AGB and estimating the corresponding uncertainty. To recap, the allometric model is: where E(B tree | x) and var(B tree | x) represent the expectation and variance of a variable; f tree is the allometric model with parameters β to predict tree AGB (i.e., B tree ); σ ε;tree 2 is the variance of model residuals for trees with attributes x; θ is a parameter related to model residuals; Eq.
(3) indicates that we acknowledge the heteroscedasticity of model residuals and model it explicitly as a proportional function of the biomass prediction. This is motivated by the observation that the residual variation associated with allometric biomass model predictions is greater for larger trees. The model can be fitted using Generalized Least Squares (GLS) methods  which involves multiple iterations of estimating parameter θ using the maximum likelihood method and estimating parameters β using weighted least square with some weighting function such as w i =(1/(θf tree (β, x i ))) 2 . We used the model form of f tree (β, x) = β 1 (ρD 2 H) β2 , where ρ, D, and H are wood density, DBH, and tree height, respectively.
Our goal was to develop non-specific allometric models that can be applied across different tree species. This strategy is usually adopted in the tropics because of the sheer number of tree species existing there. In contrast, allometric models specific to individual species or species groups (e.g., Jenkins et al., 2003) were often used over temperate and boreal forests. The problem with specific allometric models is that they were usually developed for small geographic areas and thus could have large extrapolation errors when applied out of the spatial domain for which they were developed. Such errors can be minimized if we develop non-specific models by pooling destructive measurements that span a larger geographical area and a broader range of AGB variations than the ones associated with measurements from individual species or groups.
A very important assumption in generic allometric models is that the selected tree attributes for biomass prediction can capture the majority of the AGB variation, even across different tree species. However, some recent studies also indicated that, in addition to the common predictors (i.e., DBH, height, and wood density), other tree attributes such as crown mass ratio (e.g., Ploton et al., 2016) can cause tree AGB variation. If these attributes are dependent on tree size and/or species, the generic allometric model will work well if the sample used for allometric model development is representative of the target population to which the model is applied. In other words, the tree size and species distributions need to be as similar as possible for the sample and target population. Obviously, this requirement is difficult to meet for large-scale studies.
To minimize the impacts of species composition differences on the prediction ability of our models, we assigned weighting factors for individual trees according to the frequency of their associated species in the target population during the step of weighted least squares fitting of the GLS method. The motivation for this new weighting scheme is to ensure that the model predicts most accurately for the most abundant species in the target population instead of for the most abundant species in the allometric modeling sample dataset. Note that tree species information is needed only for allometric model development, not for tree AGB prediction. When the allometric model is used to predict biomass for any tree, the biomass prediction has three sources of errors: σ tree 2 = σ ε , tree 2 +σ f, tree 2 + σ x , tree 2 , where σ ε, tree , σ f,tree and σ x , tree are the model prediction errors related to residual variance around predictions, parameter estimates, and predictor variables, respectively. These errors can be either explicitly calculated (for σ ε ,tree ) or approximated using first-order Taylor Series expansion (for σ f,tree and σ x,tree ) .
We do not know tree attribute measurement errors to estimate σ x , tree . However, the tolerance for measurement errors specified by the FIA protocols is that 95% of measurements are to be within 0.5% of the true DBH and 90% of measurements are to be within 10% of the true height (U.S. Forest Service, 2015, Appendix 7). In this study, we refer to the standard deviation of an estimate or measurements as absolute error while the standard deviation divided by the estimate or measurement mean as relative error. Assuming that both measurement errors follow a Gaussian distribution with mean 0, the relative error of DBH and height measurements are about 0.3% and 6.1% (McRoberts et al., in press). For wood density, we followed Chave et al. (2004) and assumed the relative error of measurements is 10%.
After individual tree AGB is predicted using the allometric models and measurements of tree attributes, the plot-level AGB density B plot is estimated as the sum of individual tree AGB B tree divided by the plot area s. The standard deviation of the estimated B plot (denoted as B plot ) for any FIA plot is shown next: where σ tree,i is the AGB prediction error for tree i in the plot. This formula assumes that the errors among trees on the same plot are independent, which is a reasonable assumption for estimating large-scale tree volume or biomass (McRoberts and Westfall, 2014).

Uncertainty of native-pixel-level AGB prediction
With the plot-level AGB and lidar metrics, we developed lidar-based biomass models using the similar GLS approach that was used in fitting tree AGB allometric models and can explicitly model the heteroscedastic residuals: where z are different lidar metrics; φ is the lidar-biomass model parameter vector; k is a parameter that characterizes the residual variation of model prediction. The caret (^) used in this model formulation indicates that plot-level AGB is not an observation without error but an estimate obtained from application of the allometric model. To consider the uncertainty of AGB predictions from the allometric model, the model will be fitted using a revised objective function for the maximum likelihood estimation of the parameter k, as shown in Eq. (A11) in the appendix of Chen et al. (2015). In case the tree AGB allometric model prediction errors are not considered, the lidar-based AGB model prediction error at the pixel level will be underestimated, e.g., by 5% in our previous study of tropical forests in Ghana (see Chen et al., 2015).
We will use the multiplicative power model form for lidar-based AGB modeling and follow Chen et al. (2015) to select predictor variables that statistically significantly improve the biomass predictions. Once the lidar-based AGB model is developed using FIA plot data and their corresponding lidar metrics, we can predict biomass (denoted as B pix ) for a pixel at a spatial resolution equivalent to plot size. Similar to the allometric models for predicting tree AGB, the predicted biomass at the pixel level (denoted as B pix ) using airborne lidar also has three sources of errors: , where σ ε , pix, σ f , pix, and σ z , pix are the lidar-based model prediction errors related to residual variance, parameter estimates, and lidar predictor variables, respectively. Note that in the original formulation of calculating σ z , pix (Eq. (18)in Chen et al., 2015), computationally intensive Monte-Carlo simulations are required to calculate the variance of σ z, pix (i.e., var(σ z ,pix )) due to errors in lidar metrics z. We found that var(σ z ,pix ) can be ignored because of its very small contribution (b0.001%) to the total pixel-level prediction error, which simplifies the calculation of σ z , pix 2 to (k 2 + 1)var(f plot (φ, z)) so that σ z , pix 2 can be efficiently estimated at very large spatial scales. We do not know the magnitude of lidar metric errors because they depend on the sensor types, flight conditions, and forest conditions, but the terrain is relatively flat in Minnesota and the light sensor configuration and flight conditions of different lidar campaigns are similar. So, we expected the lidar height metrics to have smaller errors than those in the tropics. However, for comparison purposes, we used the same scenario of 10% relative error in lidar height metrics as the one we used in tropical forests .

Uncertainty of area-level mean AGB
The methodology presented in the previous two sections (3.2 and 3.3) is a modification of the approach in Chen et al. (2015), which focused on the AGB estimation and uncertainty analysis at the level of pixels equivalent to the sizes of field plots (i.e., native pixel size). Here, we extended that methodology to characterize the mean and its uncertainty of biomass density of any area (denoted as μ area;MB ) that covers N pixels: var μ area;MB ¼ 1 where ξ ε , ξ f , and ξ z are the pixel-level random variables associated with 1) residual errors, 2) prediction errors caused by parameter estimate errors, and 3) prediction errors caused by lidar metric errors, respectively. Note that these three random variables are assumed independent. The complexity of uncertainty at the area level comes from the three covariance terms, which increase the computation from O(n) to O(n 2 ) time at the first glimpse, if a brute-force method is used. However, the computation can reduced as below: According to McRoberts (2006McRoberts ( , 2010, the first covariance term related to residuals is: σ ε;pix;i σ ε;pix; j ¼ σ ε;pix;1 σ ε;pix;2 … σ ε;pix;N Â Ã 1 ρ 12 : ρ 1N ρ 21 1 : ρ 2N : : : : ρ N1 ρ N2 : 1 2 6 6 4 3 7 7 5 σ ε;pix;1 σ ε;pix;2 ::: σ ε;pix;N where ρ i,j is the correlation coefficient between model residual variables at pixel locations i and j, which can be modeled using spatial variograms. McRoberts (2010) found that the covariance among residuals is negligible for predicting mean proportion forest over an area in Minnesota. Even though N is a very large number for large areas, most elements in the matrix are zero and thus it can be processed using sparse matrices.
The second covariance term related to model parameter estimate errors can be reformulated as: where Ståhl et al., 2011), m is the number of parameters in the lidar-based biomass models and its value is usually small (e.g., 2), and p and q are indices for individual model parameter φ.
Note that g k can be computed in linear time O(n) and thus the whole covariance term involves only m 2 *O(n) time.
For the third covariance term, its values depend on the covariance of lidar metric errors across different pixels. As discussed previously, we often have little knowledge about the magnitude of lidar metric errors, not to mention their covariances. For simplicity, we will start the analysis by assuming that the lidar metric errors are statistically independent, which reduces the computation time to O(n) (as shown in the equation below): It is important to note that N can be the number of 13-m pixels for the whole study area (in such case, N is a very large number) or any smaller area (e.g., a 26 m by 26 m area that covers only four 13-m pixels, in which case N = 4).

Comparison with design-based estimators for the whole region
To compare with MB inference, we also estimated the mean biomass and uncertainty for the whole study area using the FIA plots as probability samples and DB approaches including simple random sample (SRS) based inference and model-assisted (MA) inference (McRoberts, 2010). The SRS-based estimator of mean biomass density of the study area ( μ area;SRS ) is as follows: where n is equal to 1078, the number of FIA plots within our study area. We estimated area-level AGB density using the MA inference as follows: where B plot;i is estimated from field data and allometric models (Section 3.2) whereasB plot;i is the plot-level AGB predicted using Eq. (5) (Section 3.3), l is the number of parameters in the lidar-based AGB model, and c bias is a correction term of estimated bias.

Uncertainty of tree-and plot-level AGB prediction
The non-specific tree AGB allometric model developed from the destructive AGB measurements is: where ρ is in g/cm 3 , D is in cm, H is in m, and B tree is in kg. The graph of tree AGB observations versus predictions indicated no systematic lack of fit ( Fig. 4(a)). Eq. (11a) indicates that β 1 = 0.0849 and β 2 = 0.9443. The estimated covariance matrix for the model parameter estimates is: Eq. (11b) indicates that the relative AGB error caused by σ ε , tree is about 23.1%, which is smaller than the error of 37.8% in an African tropical forest we previously studied . The relative AGB prediction errors caused by σ f,tree (related to allometric model parameter estimate errors) and σ x , tree (related to tree attribute measurement errors) are 0.49% (=σ f ;tree =B tree ) and 25.88% (=σ x;tree =B tree ), respectively. When the three error sources are combined, the relative AGB prediction error increases to 34.8%, but it is smaller than the relative error (about 50%) in the tropics . The smaller errors are mainly related to the smaller number and different types of tree species in these boreal and temperate forests. For tree AGB prediction, σ ε , tree , σ f , tree , and σ x , tree contribute to 45.16% (= σ ε;tree 2 =σ tree 2 ), 0.03% (= σ f ;tree 2 =σ tree 2 ), and 54.81% (=σ x;tree 2 =σ tree 2 ) of the total error σ tree , respectively. Therefore, the tree AGB prediction errors were dominated by model prediction residuals (σ ε , tree ) and errors in tree attributes x (σ x , tree ), which is similar to the findings in the tropical forests . At the plot-level, the relative AGB prediction error (σ B plot /B plot ) reduces to 17.3% from 34.8% at the individual tree level. This is due to the negative relationship between the relative error and tree numbers within a plot (Chave et al., 2004;Chen et al., 2015). In contrast to the results from Chen et al. (2015), we found that, although at the tree level the AGB prediction error (34.8%) is much smaller than the error (~50%) in tropical forests, at the plot level the AGB prediction error (17.3%) here is only slightly smaller than the error (19.5%) in the tropical forests. This is mainly because of the much smaller size (168 m 2 ) of our plots in comparison to the plot size (1600 m 2 ) in Chen et al. (2015).

Uncertainty of native-pixel-level AGB prediction
The lidar-based biomass model developed from the 598 forest plots for predicting pixel-level AGB is: where h qm is the quadratic mean height of all laser points within a FIA subplot plot or a 13 m by 13 m pixel of approximately equivalent area. The graph of plot-level AGB estimates (B plot ) versus lidar-based model predictions indicated the single lidar metric h qm can capture the dynamic range of plot-level AGB variations well ( Fig. 4(b)) for the majority of the plots except one with AGB N 500 Mg/ha. We carefully checked the field and lidar data for this plot and did not find reasons to consider it as an outlier. Therefore, this plot was kept in our model fitting. We examined experimental variograms based on the model residuals but did not find obvious spatial autocorrelation. Eq. (12b) means that, at the level of plots or pixels of equivalent size (i.e., 13 m by 13 m), the relative AGB prediction error due to lidar-biomass model residuals is 55.8%, which is larger than 47.9% in tropical forest . The larger residual error is mainly due to the relatively smaller plot or pixel size in this study. Additionally, we found that the pixel-level relative AGB prediction error caused by σ f , pix and σ z , pix are 10.0% and 16.4%, respectively (σ f , pix /B pix = 10.0%, σ z , pix /B pix = 16.4%). When all of the three error sources (σ ε , pix , σ f , pix , and σ z , pix ) are considered, the total relative AGB prediction error increases to 59.7%, which is larger than the total relative error of 54.1% of tropical forest , again primarily because of the smaller plot size of this study. Variance components σ ε , pix , σ f , pix , and σ z , pix contribute to 87.2% (=σ ε;pix 2 =σ pix 2 ), 5.2% (=σ f ;pix 2 =σ pix 2 ), and 7.6% (=σ z;pix 2 =σ pix 2 ) of the total error σ pix , respectively. Compared to Chen et al. (2015), the 5.2% contribution of the model parameter related error σ f , pix is smaller than the one (10.6%) in tropics, mainly because here we used a larger number of FIA forest plots (n = 598) to develop the lidar-based AGB model while Chen et al. (2015) developed their model with only 36 field plots.

Uncertainty of pixel-level AGB over coarser spatial resolutions
One of the main advantages for our MD inference method is that it cannot only map the biomass uncertainty at the native pixel level (i.e., 13 m by 13 m) of lidar-based biomass modeling but also upscale the biomass mean and uncertainty to coarser spatial resolutions using  equations in Section 3.4. We aggregated biomass density every N by N pixels (N = 2, 3,…, 30) and, correspondingly, created maps of AGB and its uncertainty at coarse spatial resolutions of N*13 m (i.e., 26 m, 39 m,…, 390 m) for the whole study area. Fig. 5(a) showed that pixellevel relative AGB prediction error quickly decreased initially and then flattened gradually, especially when the cell size is larger than 100 m. In contrast to the relative error of 59.7% at 13 m, the average errors at 100 m, 200 m, and 300 m cell sizes are approximately 11.0%, 6.5%, and 5.1%, respectively.
We also analyzed the pixel-level error budget and examined the relative contributions of each error source (lidar-based AGB model residuals, parameters, and lidar height metrics as predictors) to the pixellevel AGB prediction errors (Fig. 5(b)). The errors are dominated by model residuals at the spatial resolution of about 380 m, after which the parameter estimate errors become the major source. Note that the relative error of pixel-level AGB prediction caused by model parameter estimates dropped considerably when the pixel size was increased from 13 m to 26 m; however, aggregation to coarser spatial resolutions did not reduce its relative error much further (Fig. 5(a)). Overall, the errors in model predictors (i.e., lidar height metrics) make a small (b8%) contribution to the total error in pixel-level AGB prediction across different cell size and the impacts become smaller when the spatial resolution is coarser. Fig. 6 shows some sample map products of mean AGB and uncertainty at the spatial resolutions of near 100 m, 1000 m, and 10,000 m.
The average nearest neighbor distance among our FIA plots is about 2700 m. As mentioned earlier, the variogram analysis of lidar-based model residuals showed no spatial autocorrelation. It is important to note that this does not necessarily imply that no spatial autocorrelation exists among model residuals for distances shorter than 2700 m. However, because we have no field plots with shorter average nearest a b Fig. 6. Example maps of AGB mean and uncertainty of the same area but at different spatial resolutions. neighbor distance, we made a simplification to assume ρ i,j = 0 when using Eq. (8c). In other words, if spatial autocorrelation does exist within shorter distance, the contributions from model residuals to pixellevel AGB prediction errors would be higher .

Uncertainty of mean AGB over the whole region
For the whole study area, the mean biomass density estimated using our MB inference framework (μ area;MB ) is 46.2 Mg/ha with a standard error of 2.5 Mg/ha. Therefore, the relative prediction error is only 5.4%. The contribution of the model parameter related error is close to 100%. This is consistent with Ståhl et al. (2016) in that the MB AGB prediction error of mean biomass density over large area is dominated by the parameter estimate errors. Using DB approaches in Section 3.5, we found that the estimates from SRS and MA estimators are 47.2 ± 2.2 Mg/ha and 45.4 ± 1.1 Mg/ha, respectively. Therefore, the mean from one estimator is within two standard deviations from the mean of another estimator. In other words, the MB estimate is consistent with the estimates of DB estimates.

Conclusions
The study is unique because of our combined use of a) large-scale (69,508 km 2 ) wall-to-wall, instead of samples of, airborne lidar data for accurately mapping biomass, b) national FIA plot data for consistently modeling biomass, c) destructive measurements of individual tree biomass for comprehensively characterizing tree-level AGB errors in the field data, and d) novel MB inference methods for rigorously and flexibly characterizing the uncertainty of biomass estimates at multiple spatial resolutions. The model-based inference lies in the core of our methodology because it propagates errors from trees to field plots, individual pixels, and areas that consist of multiple pixels (up to all pixels within a study area). Based on it, we produced regional-scale wall-towall maps of biomass at multiple spatial resolutions and characterized the biomass uncertainty at the pixel levels. These maps of biomass and its uncertainty can serve as baseline maps for studying forest carbon stocks and their change in our study area.
Although we focus on biomass in the United States in this study, our methodology and research findings have global applications and implications. First, the key issues raised in this study, including the importance of developing proper allometric models, the full accounting of model residual, parameter, and predictor errors (for both allometric models and lidar-based models), and the need for considering uncertainty in field AGB estimates, are applicable to almost every biomass estimation method. Second, the uncertainty analysis method advocated in this study (i.e., model-based inference) does not necessarily require field plots from a probability sample (such as the FIA plots we used) and thus can better accommodate reality in the tropics and developing countries where NFI systems are not well-established, in comparison to the uncertainty analysis methods using design-based inference. Third, our model-based uncertainty analysis framework is general and can be easily adapted to the use of other remotely sensed data such as optical imagery and radar data. The global application of our uncertainty analysis framework can help answer the critical question of how well state-of-the-art remote sensing technologies such as airborne lidar can estimate forest aboveground biomass density at different spatial scales and can suggest future research directions that should be prioritized for improving the remotely sensed biomass estimates.