Semiparametric Regression Methods with Covariate Measurement Error

Date
2012-12-06
Publisher
Virginia Tech
Abstract

In public health, biomedical, epidemiological, and other applications, collected data are often measured with error. When mismeasured data are used in a regression analysis, failing to account for the measurement error can lead to incorrect inference about the relationships between the covariates and the response. We investigate measurement error in the covariates of two types of regression models. For each, we propose a fully Bayesian approach that treats the variable measured with error as a latent variable to be integrated over, and a semi-Bayesian approach that uses a first-order Laplace approximation to marginalize the variable measured with error out of the likelihood.
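As a small illustration of why ignoring covariate measurement error distorts inference, the sketch below simulates a simple linear regression in which the true covariate X is observed only as W = X + U. This is not one of the models studied in the thesis; the linear model, the classical additive error structure, and every numeric value (`beta`, `sigma_x`, `sigma_u`, the sample size) are assumptions chosen only for illustration. Under these assumptions the naive slope is expected to shrink toward zero by the factor var(X) / (var(X) + var(U)).

```python
import random


def ols_slope(x, y):
    """Least-squares slope from regressing y on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# All values below are hypothetical and chosen only for illustration.
rng = random.Random(0)
n = 20000
beta = 2.0      # true slope of Y on X
sigma_x = 1.0   # standard deviation of the true covariate X
sigma_u = 1.0   # standard deviation of the measurement error U

x = [rng.gauss(0.0, sigma_x) for _ in range(n)]       # true covariate
w = [xi + rng.gauss(0.0, sigma_u) for xi in x]        # observed W = X + U
y = [beta * xi + rng.gauss(0.0, 0.5) for xi in x]     # response depends on X

slope_true = ols_slope(x, y)    # regression on the unobserved truth
slope_naive = ols_slope(w, y)   # regression on the error-prone measurement

# Attenuation factor under classical additive error.
reliability = sigma_x ** 2 / (sigma_x ** 2 + sigma_u ** 2)

print(f"slope using X: {slope_true:.3f}")
print(f"slope using W: {slope_naive:.3f}")
print(f"beta * reliability: {beta * reliability:.3f}")
```

The naive slope should land near `beta * reliability` rather than near `beta`, which is the kind of bias that measurement error adjustments are meant to correct.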

The first model is the matched case-control study for analyzing clustered binary outcomes. We develop low-rank thin plate splines for the case where a variable measured with error has an unknown, nonlinear relationship with the response. In addition to the semi- and fully Bayesian approaches, we propose a third approach that uses expectation-maximization to detect both parametric and nonparametric relationships between the covariates and the binary outcome. We assess the performance of each method via simulation in terms of mean squared error and mean bias. We illustrate each method on a perturbed example of a 1–4 matched case-control study.

The second regression model is the generalized linear model (GLM) with unknown link function. Usually, the link function is chosen by the user based on the distribution of the response variable, and is often the canonical link. However, when covariates are measured with error, the incorrect inference caused by the error can be compounded by an incorrect choice of link function. We assess the performance of the semi- and fully Bayesian methods via simulation in terms of mean squared error. We illustrate each method on the Framingham Heart Study dataset.

The simulation results for both regression models indicate that the fully Bayesian approach is at least as good as the semi-Bayesian approach at adjusting for measurement error, particularly when the distribution of the variable measured with error and the distribution of the measurement error are misspecified.
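To make the semi-Bayesian side of this comparison more concrete, the sketch below applies a first-order Laplace approximation to integrate a latent covariate out of a single observation's likelihood, and checks it against brute-force numerical integration. It is not the thesis's implementation. The toy model (a logistic outcome, a normal latent covariate, and normal additive measurement error), the function names, and all parameter values are assumptions made purely for illustration.

```python
import math


def log_joint(x, y, w, b0, b1, s2_u, s2_x):
    """log p(y | x) + log p(w | x) + log p(x) for the assumed toy model."""
    eta = b0 + b1 * x
    log_py = y * eta - math.log1p(math.exp(eta))                      # Bernoulli, logit link
    log_pw = -0.5 * math.log(2 * math.pi * s2_u) - (w - x) ** 2 / (2 * s2_u)
    log_px = -0.5 * math.log(2 * math.pi * s2_x) - x ** 2 / (2 * s2_x)
    return log_py + log_pw + log_px


def neg_hessian(x, b0, b1, s2_u, s2_x):
    """Negative second derivative of log_joint in x (always positive here)."""
    p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
    return b1 ** 2 * p * (1.0 - p) + 1.0 / s2_u + 1.0 / s2_x


def laplace_marginal(y, w, b0, b1, s2_u, s2_x, iters=50):
    """First-order Laplace approximation to the integral of exp(log_joint) over x."""
    x = w * s2_x / (s2_x + s2_u)  # start at the mode of the Gaussian part
    for _ in range(iters):        # Newton iterations to locate the mode
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        grad = b1 * (y - p) + (w - x) / s2_u - x / s2_x
        step = grad / neg_hessian(x, b0, b1, s2_u, s2_x)
        x += step
        if abs(step) < 1e-12:
            break
    curvature = neg_hessian(x, b0, b1, s2_u, s2_x)
    return math.exp(log_joint(x, y, w, b0, b1, s2_u, s2_x)) * math.sqrt(2 * math.pi / curvature)


def quadrature_marginal(y, w, b0, b1, s2_u, s2_x, lo=-10.0, hi=10.0, n=20001):
    """Trapezoid-rule reference value for the same integral."""
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        f = math.exp(log_joint(lo + i * h, y, w, b0, b1, s2_u, s2_x))
        total += f if 0 < i < n - 1 else 0.5 * f
    return total * h


# Hypothetical observation and parameter values, for illustration only.
args = dict(y=1, w=0.8, b0=-0.5, b1=1.0, s2_u=0.25, s2_x=1.0)
approx = laplace_marginal(**args)
exact = quadrature_marginal(**args)
print(f"Laplace: {approx:.6f}  quadrature: {exact:.6f}")
```

In this toy setting the integrand is close to Gaussian, so the two values should agree closely. A fully Bayesian treatment would instead sample the latent covariate alongside the model parameters rather than replacing the integral with a single Gaussian approximation.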

Keywords
Bayesian methods, error-in-covariates, generalized linear models, matched case-control studies, mixed models, semiparametric reg