Semiparametric Regression Methods with Covariate Measurement Error

dc.contributor.author: Johnson, Nels Gordon [en]
dc.contributor.committeechair: Kim, Inyoung [en]
dc.contributor.committeemember: Leman, Scotland C. [en]
dc.contributor.committeemember: Terrell, George R. [en]
dc.contributor.committeemember: Du, Pang [en]
dc.contributor.department: Statistics [en]
dc.date.accessioned: 2014-07-16T22:52:10Z [en]
dc.date.available: 2014-07-16T22:52:10Z [en]
dc.date.issued: 2012-12-06 [en]
dc.description.abstract: In public health, biomedical, epidemiological, and other applications, data collected are often measured with error. When mismeasured data are used in a regression analysis, not accounting for the measurement error can lead to incorrect inference about the relationships between the covariates and the response. We investigate measurement error in the covariates of two types of regression models. For each, we propose a fully Bayesian approach that treats the variable measured with error as a latent variable to be integrated over, and a semi-Bayesian approach that uses a first-order Laplace approximation to marginalize the variable measured with error out of the likelihood. The first model is the matched case-control study for analyzing clustered binary outcomes. We develop low-rank thin plate splines for the case where a variable measured with error has an unknown, nonlinear relationship with the response. In addition to the semi- and fully Bayesian approaches, we propose a third approach that uses expectation-maximization to detect both parametric and nonparametric relationships between the covariates and the binary outcome. We assess the performance of each method via simulation in terms of mean squared error and mean bias. We illustrate each method on a perturbed example of a 1-4 matched case-control study. The second regression model is the generalized linear model (GLM) with unknown link function. Usually, the link function is chosen by the user based on the distribution of the response variable, often to be the canonical link. However, when covariates are measured with error, incorrect inference as a result of the error can be compounded by an incorrect choice of link function. We assess the performance of the semi- and fully Bayesian methods via simulation in terms of mean squared error. We illustrate each method on the Framingham Heart Study dataset.
The simulation results for both regression models support that the fully Bayesian approach is at least as good as the semi-Bayesian approach for adjusting for measurement error, particularly when the distribution of the variable measured with error and the distribution of the measurement error are misspecified. [en]
dc.description.degree: Ph. D. [en]
dc.format.medium: ETD [en]
dc.identifier.other: vt_gsexam:16 [en]
dc.identifier.uri: http://hdl.handle.net/10919/49551 [en]
dc.publisher: Virginia Tech [en]
dc.rights: In Copyright [en]
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ [en]
dc.subject: Bayesian methods [en]
dc.subject: error-in-covariates [en]
dc.subject: generalized linear models [en]
dc.subject: matched case-control studies [en]
dc.subject: mixed models [en]
dc.subject: semiparametric reg [en]
dc.title: Semiparametric Regression Methods with Covariate Measurement Error [en]
dc.type: Dissertation [en]
thesis.degree.discipline: Statistics [en]
thesis.degree.grantor: Virginia Polytechnic Institute and State University [en]
thesis.degree.level: doctoral [en]
thesis.degree.name: Ph. D. [en]

Files

Original bundle
Name: Johnson_NG_D_2012.pdf
Size: 422.76 KB
Format: Adobe Portable Document Format