Spatial Allocation, Imputation, and Sampling Methods for Timber Product Output Data

Date
2009-09-16
Publisher
Virginia Tech
Abstract

Data from the 2001 and 2003 timber product output (TPO) studies for Georgia were explored to determine new methods for handling missing data and finding suitable sampling estimators.

Mean roundwood volume receipts per mill for 2003 were calculated using the multiple imputation methods developed by Rubin (1987). Mean receipts per mill ranged from 4.4 to 14.2 million ft³. The mean value of 9.3 million ft³ did not differ significantly from the NONMISS, SINGLE1, and SINGLE2 reference means (p = 0.68, 0.75, and 0.76, respectively).
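
For readers unfamiliar with Rubin's (1987) combining rules, the sketch below shows how estimates from several imputed data sets are usually pooled: the point estimates are averaged, and the total variance adds the average within-imputation variance to an inflated between-imputation variance. The function name and the input values are hypothetical and are not taken from the study.

```python
from statistics import mean, variance


def pool_rubin(estimates, within_variances):
    """Pool m completed-data estimates using Rubin's combining rules.

    estimates: one point estimate per imputed data set.
    within_variances: the variance of each of those estimates.
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(within_variances):
        raise ValueError("need matching lists with at least two imputations")
    q_bar = mean(estimates)          # pooled point estimate
    w = mean(within_variances)       # average within-imputation variance
    b = variance(estimates)          # between-imputation variance (divisor m - 1)
    t = w + (1 + 1 / m) * b          # total variance of the pooled estimate
    return q_bar, t


# Hypothetical per-imputation means and variances (illustration only)
pooled_mean, total_var = pool_rubin([8.0, 10.0, 9.0], [1.0, 1.0, 1.0])
```

The square root of the total variance gives the standard error used when comparing the pooled mean against a reference mean.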

Fourteen estimators were examined to evaluate sampling approaches. The estimators were of several mean types (simple random sample, ratio, stratified sample, and combined ratio) and employed two methods of stratification: the Dalenius-Hodges (DH) square root of the frequency method and a cluster analysis method. Relative efficiency (RE) improved when the number of groups increased and when a ratio estimator was employed, particularly a combined ratio. Neither the DH method nor the cluster analysis method performed better than the other.
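
As an illustration of the DH approach, the sketch below applies the cumulative square root of the frequency rule: the stratification variable is binned, the square roots of the bin counts are accumulated, and boundaries are placed where the cumulative total is closest to equal-sized steps. The function name, the default bin count, and the tie handling are assumptions made for this example, not details from the study.

```python
import math


def dh_boundaries(values, n_strata, n_bins=20):
    """Return n_strata - 1 upper stratum boundaries by the cum-sqrt(f) rule."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("values must not all be equal")
    width = (hi - lo) / n_bins

    # Frequency of values in each equal-width bin
    counts = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1

    # Cumulative sum of sqrt(frequency)
    cum, running = [], 0.0
    for c in counts:
        running += math.sqrt(c)
        cum.append(running)

    # Place each boundary at the bin edge whose cumulative value is
    # closest to k / n_strata of the final cumulative total
    step = running / n_strata
    boundaries = []
    for k in range(1, n_strata):
        target = k * step
        idx = min(range(n_bins), key=lambda i: abs(cum[i] - target))
        boundaries.append(lo + (idx + 1) * width)
    return boundaries
```

With strongly skewed data and few bins, two boundaries can coincide; a finer binning usually resolves this.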

Six bound sizes (1, 5, 10, 15, 20, and 25 percent) were considered for deriving sample sizes for the total volume of roundwood. The minimum achievable bound size was found to be 10 percent of the total receipts volume for the DH method using a two-group stratification. This was true for both the stratified and combined ratio estimators. In addition, for the stratified and combined ratio estimators, only the DH method stratifications were able to reach a 10 percent bound on the total (6 of the 12 stratified estimators). The remaining six stratified estimators were able to achieve a 20 percent bound on the total.
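
To show how a bound translates into a sample size, the sketch below uses the common textbook formula for estimating a population total under simple random sampling, which is the simplest of the estimator types considered. It assumes the bound is roughly two standard errors of the estimated total; the study's stratified and ratio calculations are more involved, and the function name and inputs here are hypothetical.

```python
import math


def srs_sample_size_for_total(pop_size, sigma2, bound):
    """Sample size so the error bound on an estimated total is at most `bound`.

    pop_size: number of units in the population (N).
    sigma2: assumed population variance of the per-unit values.
    bound: allowed error on the total, in the same units as the total.
    """
    d = bound ** 2 / (4 * pop_size ** 2)
    n = pop_size * sigma2 / ((pop_size - 1) * d + sigma2)
    return min(pop_size, math.ceil(n))


# Hypothetical inputs: 100 units, variance 25, bound of 100 on the total
n_needed = srs_sample_size_for_total(100, 25.0, 100.0)
```

A percentage bound is converted to an absolute one by multiplying it by a prior estimate of the total before calling the function.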

Finally, nonlinear repeated measures models were developed to spatially allocate mill receipts to surrounding counties when only a mill's total receipt volume is available. A Gompertz model with a power spatial covariance performed best when using road distances from the mills to either type of county center (geographic or forest mass). These models used the cumulative frequency of mill receipts as the response variable, with cumulative frequencies based on distance from the mill to the county.
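
The allocation idea can be sketched as follows, assuming a two-parameter Gompertz curve with an upper asymptote of 1 for the cumulative share of receipts by distance. The parameterization, the rescaling so that shares sum to one, and all names are assumptions for illustration; the fitted models and the power spatial covariance structure are not reproduced here.

```python
import math


def gompertz_share(distance, b, c):
    """Cumulative share of receipts within `distance` (assumed form)."""
    return math.exp(-b * math.exp(-c * distance))


def allocate_receipts(total_receipts, distances, b, c):
    """Split a mill's total receipts among counties by road distance.

    distances: road distance from the mill to each county center.
    Returns allocated volumes in the same order as `distances`.
    """
    if not distances:
        raise ValueError("at least one county distance is required")
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    cum = [gompertz_share(distances[i], b, c) for i in order]
    top = cum[-1]  # rescale so the farthest county closes the total
    volumes = [0.0] * len(distances)
    prev = 0.0
    for i, f in zip(order, cum):
        # Each county receives the increment in cumulative share
        volumes[i] = total_receipts * (f - prev) / top
        prev = f
    return volumes


# Hypothetical parameters and distances (illustration only)
example = allocate_receipts(1000.0, [10.0, 50.0, 30.0], b=3.0, c=0.05)
```

Because the curve is increasing in distance for positive b and c, every county receives a non-negative volume and the volumes sum to the mill total.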

Keywords
nonlinear repeated measures, spatial allocation, relative efficiency, multiple imputation, timber product output data