Semiparametric Regression Methods with Covariate Measurement Error

Johnson, Nels Gordon

dc.contributor.author	Johnson, Nels Gordon	en
dc.contributor.committeechair	Kim, Inyoung	en
dc.contributor.committeemember	Leman, Scotland C.	en
dc.contributor.committeemember	Terrell, George R.	en
dc.contributor.committeemember	Du, Pang	en
dc.contributor.department	Statistics	en
dc.date.accessioned	2014-07-16T22:52:10Z	en
dc.date.available	2014-07-16T22:52:10Z	en
dc.date.issued	2012-12-06	en
dc.description.abstract	In public health, biomedical, epidemiological, and other applications, data are often measured with error. When mismeasured data are used in a regression analysis, failing to account for the measurement error can lead to incorrect inference about the relationships between the covariates and the response. We investigate measurement error in the covariates of two types of regression models. For each, we propose a fully Bayesian approach that treats the variable measured with error as a latent variable to be integrated over, and a semi-Bayesian approach that uses a first-order Laplace approximation to marginalize the variable measured with error out of the likelihood. The first model is the matched case-control study for analyzing clustered binary outcomes. We develop low-rank thin plate splines for the case where a variable measured with error has an unknown, nonlinear relationship with the response. In addition to the semi- and fully Bayesian approaches, we propose a third approach that uses expectation-maximization to detect both parametric and nonparametric relationships between the covariates and the binary outcome. We assess the performance of each method via simulation in terms of mean squared error and mean bias. We illustrate each method on a perturbed example of a 1-4 matched case-control study. The second regression model is the generalized linear model (GLM) with an unknown link function. Usually, the link function is chosen by the user based on the distribution of the response variable, and is often the canonical link. However, when covariates are measured with error, incorrect inference resulting from the error can be compounded by an incorrect choice of link function. We assess the performance of the semi- and fully Bayesian methods via simulation in terms of mean squared error. We illustrate each method on the Framingham Heart Study dataset.
The simulation results for both regression models indicate that the fully Bayesian approach performs at least as well as the semi-Bayesian approach in adjusting for measurement error, particularly when the distribution of the variable measured with error and the distribution of the measurement error are misspecified.	en
dc.description.degree	Ph. D.	en
dc.format.medium	ETD	en
dc.identifier.other	vt_gsexam:16	en
dc.identifier.uri	http://hdl.handle.net/10919/49551	en
dc.publisher	Virginia Tech	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.subject	Bayesian methods	en
dc.subject	error-in-covariates	en
dc.subject	generalized linear models	en
dc.subject	matched case-control studies	en
dc.subject	mixed models	en
dc.subject	semiparametric reg	en
dc.title	Semiparametric Regression Methods with Covariate Measurement Error	en
dc.type	Dissertation	en
thesis.degree.discipline	Statistics	en
thesis.degree.grantor	Virginia Polytechnic Institute and State University	en
thesis.degree.level	doctoral	en
thesis.degree.name	Ph. D.	en

Files

Original bundle

Name: Johnson_NG_D_2012.pdf
Size: 422.76 KB
Format: Adobe Portable Document Format

Collections

Doctoral Dissertations