On the Use of Grouped Covariate Regression in Oversaturated Models

dc.contributor.author: Loftus, Stephen Christopher
dc.contributor.committeechair: House, Leanna L.
dc.contributor.committeemember: Kim, Inyoung
dc.contributor.committeemember: Leman, Scotland C.
dc.contributor.committeemember: Belden, Lisa K.
dc.contributor.department: Statistics
dc.date.accessioned: 2015-12-26T09:01:59Z
dc.date.available: 2015-12-26T09:01:59Z
dc.date.issued: 2015-12-11
dc.description.abstract: As data collection techniques improve, the number of covariates often exceeds the number of observations. When this happens, regression models become oversaturated and, thus, inestimable. Many classical and Bayesian techniques have been designed to address this difficulty, each with its own means of handling the oversaturation. However, these techniques can be tricky to implement well, difficult to interpret, and unstable. What is proposed is a technique that takes advantage of the natural clustering of variables often found in the biological and ecological datasets known as omics datasets. Generally speaking, omics datasets attempt to classify host species structure or function by characterizing a group of biological molecules, such as genes (Genomics), proteins (Proteomics), and metabolites (Metabolomics). By clustering the covariates and regressing on a single value for each cluster, the model becomes both estimable and stable. In addition, the technique can account for the variability within each cluster, allow for the inclusion of expert judgment, and provide a probability of inclusion for each cluster.
dc.description.degree: Ph. D.
dc.format.medium: ETD
dc.identifier.other: vt_gsexam:6727
dc.identifier.uri: http://hdl.handle.net/10919/64363
dc.publisher: Virginia Tech
dc.rights: In Copyright
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Oversaturated model
dc.subject: Big data
dc.subject: Variable selection
dc.subject: Data Analytics
dc.subject: Bayesian methods
dc.title: On the Use of Grouped Covariate Regression in Oversaturated Models
dc.type: Dissertation
thesis.degree.discipline: Statistics
thesis.degree.grantor: Virginia Polytechnic Institute and State University
thesis.degree.level: doctoral
thesis.degree.name: Ph. D.

Files

Original bundle
Name: Loftus_SC_D_2015.pdf
Size: 2.42 MB
Format: Adobe Portable Document Format