High-dimensional Multimodal Bayesian Learning

Salem, Mohamed Mahmoud

dc.contributor.author	Salem, Mohamed Mahmoud	en
dc.contributor.committeechair	Kim, Inyoung	en
dc.contributor.committeemember	Franck, Christopher Thomas	en
dc.contributor.committeemember	Van Mullekom, Jennifer Huffman	en
dc.contributor.committeemember	Gramacy, Robert B.	en
dc.contributor.department	Statistics	en
dc.date.accessioned	2024-12-13T09:00:19Z	en
dc.date.available	2024-12-13T09:00:19Z	en
dc.date.issued	2024-12-12	en
dc.description.abstract	High-dimensional datasets are fast becoming a cornerstone across diverse domains, fueled by advancements in data-capturing technology such as DNA sequencing, medical imaging techniques, and social media. This dissertation examines the opportunities and challenges posed by these types of datasets. We develop three Bayesian methods: (1) Multilevel Network Recovery for Genomics, (2) Network Recovery for Functional Data, and (3) Bayesian Inference in Transformer-based Models. Chapter 2 examines a two-tiered data structure; to simultaneously perform variable selection and identify dependency structures among both higher- and lower-level variables, we propose a multi-level nonparametric kernel machine approach that uses variational inference to jointly identify multi-level variables and build the network. Chapter 3 develops a method for the simultaneous selection of functional domain subsets, selection of functional graphical nodes, and continuous response modeling given both scalar and functional covariates under semiparametric, nonadditive models, which allow us to capture unknown, possibly nonlinear, interaction terms among high-dimensional functional variables. In Chapter 4, we extend our investigation of leveraging structure in high-dimensional datasets to the relatively new transformer architecture; we introduce a new penalty structure to the Bayesian classification transformer that leverages the multi-tiered structure of the transformer-based model. This allows for increased, likelihood-based regularization, which is needed given the high-dimensional nature of our motivating dataset. This new regularization approach allows us to integrate Bayesian inference via variational approximations into our transformer-based model and improves the calibration of probability estimates.	en
dc.description.abstractgeneral	In today's data-driven landscape, high-dimensional datasets have emerged as a cornerstone across diverse domains, fueled by advancements in technology like sensor networks, genomics, and social media platforms. This dissertation delves into the inherent opportunities and challenges posed by these datasets, emphasizing their potential for uncovering hidden patterns and correlations amidst their complexity. As high-dimensional datasets proliferate, researchers face significant challenges in effectively analyzing and interpreting them. This research focuses on leveraging Bayesian methods as a robust approach to address these challenges. Bayesian approaches offer unique advantages, particularly in handling small sample sizes and complex models. By providing robust uncertainty quantification and regularization techniques, Bayesian methods ensure reliable inference and model generalization, even in the face of sparse or noisy data. Furthermore, this work examines the strategic integration of structured information as a regularization technique. By exploiting patterns and dependencies within the data, structured regularization enhances the interpretability and resilience of statistical models across various domains. Whether the structure arises from spatial correlations, temporal dependencies, or coordinated actions among covariates, incorporating this information enriches the modeling process and improves the reliability of the results. By exploring these themes, this research contributes to advancing the understanding and application of high-dimensional data analysis. Through a thorough examination of Bayesian methods and structured regularization techniques, this dissertation aims to support researchers in effectively navigating and extracting meaningful insights from the complex landscape of high-dimensional datasets.	en
dc.description.degree	Doctor of Philosophy	en
dc.format.medium	ETD	en
dc.identifier.other	vt_gsexam:42169	en
dc.identifier.uri	https://hdl.handle.net/10919/123788	en
dc.language.iso	en	en
dc.publisher	Virginia Tech	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.subject	Gaussian Process	en
dc.subject	High Dimensional Data	en
dc.subject	Variable Selection	en
dc.subject	Variational Inference	en
dc.subject	Uncertainty Quantification	en
dc.title	High-dimensional Multimodal Bayesian Learning	en
dc.type	Dissertation	en
thesis.degree.discipline	Statistics	en
thesis.degree.grantor	Virginia Polytechnic Institute and State University	en
thesis.degree.level	doctoral	en
thesis.degree.name	Doctor of Philosophy	en

Files

Original bundle

Name: Salem_MM_D_2024.pdf
Size: 7.2 MB
Format: Adobe Portable Document Format

Name: Salem_MM_D_2024_support_1.pdf
Size: 16.76 KB
Format: Adobe Portable Document Format
Description: Supporting documents

Collections

Doctoral Dissertations