Bayesian Variable Selection and Inference for Nonparametric Kernel Machine and Functional Models
dc.contributor.author | Jin, Phil Geun | en |
dc.contributor.committeechair | Kim, Inyoung | en |
dc.contributor.committeemember | Franck, Christopher Thomas | en |
dc.contributor.committeemember | Datta, Jyotishka | en |
dc.contributor.committeemember | Deng, Xinwei | en |
dc.contributor.department | Statistics | en |
dc.date.accessioned | 2025-05-21T08:03:18Z | en |
dc.date.available | 2025-05-21T08:03:18Z | en |
dc.date.issued | 2025-05-20 | en |
dc.description.abstract | In this dissertation, we develop three methods to address challenges in highly correlated, high-dimensional, and functional data. In the first study, we develop a Bayesian variable selection method under a generalized fused multi-kernel machine regression. This method can be applied to continuous, binary, and ordered categorical response variables. We demonstrate its advantage using bio-photonics Raman spectroscopy data to identify which molecular fingerprinting wavenumbers are associated with drug dosages for brain tumors. In the second study, we propose a Bayesian inference procedure based on the Bayes factor. Our approach employs a generalized fused multi-kernel machine regression to adjust for multiple tests and identify significant pathways. We illustrate the advantage of this method using genetic pathway data to test multiple, significantly correlated pathways associated with type II diabetes while estimating nonlinear relationships. Finally, we introduce a procedure for testing for nonlinearity using a functional single index model. This procedure employs a randomly projected empirical process to reduce dimensionality while preserving essential statistical properties. The method is applied to autism brain imaging data to test whether fMRI signals are related to the Autism Diagnostic Observation Schedule. Together, the three proposed methods advance the field of variable selection and inference by offering solutions to problems associated with correlated high-dimensional and functional data, with practical applications across various domains. | en |
dc.description.abstractgeneral | This study introduces new statistical methods to better understand and analyze complex data that involve many variables and intricate relationships. Our work focuses on improving how we identify important variables and carry out statistical tests across various types of data. The first work introduces a new approach to handling complex datasets, such as those obtained from advanced imaging in medical research. This method helps us pinpoint significant variables in data where information is gathered from closely related sources, such as tracking drug effects on brain cancer cells. The second work presents a Bayesian statistical method to identify important functions in high-dimensional data. This approach is beneficial for studying genetic pathways, which are often interconnected. By refining how we detect these pathways, we provide a clearer understanding of their role in diseases like type II diabetes. Lastly, we propose a new test for checking nonlinearity in functional data, which includes data collected over time or space, like brain activity measured by fMRI scans. Our method simplifies complex data while retaining key information, revealing patterns that traditional methods might overlook. This advancement offers a new way to understand brain activity in conditions like autism. Overall, our methods enhance how we analyze and interpret complex, high-dimensional data, improving our ability to identify important variables and relationships in various fields, including medical research and genetics. | en |
dc.description.degree | Doctor of Philosophy | en |
dc.format.medium | ETD | en |
dc.identifier.other | vt_gsexam:44109 | en |
dc.identifier.uri | https://hdl.handle.net/10919/133541 | en |
dc.language.iso | en | en |
dc.publisher | Virginia Tech | en |
dc.rights | In Copyright | en |
dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | en |
dc.subject | Bayes Factor | en |
dc.subject | Functional Data | en |
dc.subject | Fused Lasso | en |
dc.subject | Generalized MultiKernel Regression | en |
dc.subject | Variable Selection | en |
dc.title | Bayesian Variable Selection and Inference for Nonparametric Kernel Machine and Functional Models | en |
dc.type | Dissertation | en |
thesis.degree.discipline | Statistics | en |
thesis.degree.grantor | Virginia Polytechnic Institute and State University | en |
thesis.degree.level | doctoral | en |
thesis.degree.name | Doctor of Philosophy | en |