Optimal weight settings in locally weighted regression: A guidance through cross-validation approach

TR Number

Date

2023

Journal Title

Journal ISSN

Volume Title

Publisher

Virginia Tech

Abstract

Locally weighted regression is a powerful tool that allows the estimation of different sets of coefficients for each location in the underlying data, challenging the assumption of stationary regression coefficients across a study region. The accuracy of LWR largely depends on how a researcher establishes the relationship across locations, which is often constructed using a weight matrix or function. This paper explores the different kernel functions used to assign weights to observations, including Gaussian, bi-square, and tri-cubic, and how the choice of weight variables and window size affects the accuracy of the estimates. We guide this choice through the cross-validation approach and show that the bi-square function outperforms the choice of other kernel functions. Our findings demonstrate that an optimal window size for LWR models depends on the cross-validation (CV) approach employed. In our empirical application, the full-sample CV guides the choice of a higher window-size case, and CV by proxy guides the choice of a lower window size. Since the CV by Proxy approach focuses on the predictive ability of the model in the vicinity of one specific point (usually a policy point/site), we note that guiding a model choice through this approach makes more intuitive sense when the aim of the researcher is to predict the outcome in one specific site (policy or target point). To identify the optimal weight variables, while we suggest exploring various combinations of weight variables, we argue that an efficient alternative is to merge all continuous variables in the dataset into a single weight variable.

Description

Keywords

locally weighted regression, weight variables, cross-validation

Citation

Collections