Bayesian Variable Shrinkage and Selection in Compositional Data Regression: Application to Oral Microbiome

Date

2024-12-01

Publisher

Springer Nature

Abstract

Microbiome studies generate multivariate compositional responses, such as taxa counts, which are strictly non-negative, bounded, reside within a simplex, and are subject to a unit-sum constraint. In the presence of covariates (which can be moderate- to high-dimensional), they are popularly modeled via the Dirichlet-Multinomial (D-M) regression framework. In this paper, we consider a Bayesian approach for estimation and inference under a D-M compositional framework, and present a comparative evaluation of some state-of-the-art continuous shrinkage priors for efficient variable selection to identify the most significant associations between available covariates and taxonomic abundance. Specifically, we compare the performance of the horseshoe and horseshoe+ priors against the benchmark Bayesian lasso, using Hamiltonian Monte Carlo techniques for posterior sampling and for generating posterior credible intervals. Our simulation studies using synthetic data demonstrate excellent recovery and estimation accuracy in the sparse parameter regime by the continuous shrinkage priors. We further illustrate our method via an application to a motivating oral microbiome dataset generated from the NYC-Hanes study. An RStan implementation of our method is available on GitHub: https://github.com/dattahub/compshrink.
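
As a rough illustration of the two ingredients named in the abstract, the following Python sketch evaluates a Dirichlet-Multinomial log-likelihood and draws coefficients from a horseshoe prior. It is not the authors' RStan implementation. The log-linear link alpha_k = exp(x · beta_k), the fixed global scale `tau`, and all function names (`dm_concentrations`, `dm_log_pmf`, `horseshoe_prior_draw`) are assumptions introduced here for illustration only.

```python
import math
import random


def dm_concentrations(x, beta):
    """Map covariates to D-M concentration parameters.

    Assumption (not stated in the abstract): a log-linear link,
    alpha_k = exp(x . beta_k), with one coefficient vector per taxon.
    """
    return [math.exp(sum(xj * bj for xj, bj in zip(x, beta_k))) for beta_k in beta]


def dm_log_pmf(y, alpha):
    """Log-probability of a count vector y under Dirichlet-Multinomial(alpha)."""
    n = sum(y)
    a = sum(alpha)
    # Multinomial coefficient: n! / prod(y_k!)
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(yk + 1) for yk in y)
    # Ratio of gamma functions from integrating out the Dirichlet weights
    log_norm = math.lgamma(a) - math.lgamma(n + a)
    log_terms = sum(
        math.lgamma(yk + ak) - math.lgamma(ak) for yk, ak in zip(y, alpha)
    )
    return log_coef + log_norm + log_terms


def horseshoe_prior_draw(p, tau=1.0, rng=random):
    """Draw p coefficients from a horseshoe prior with a fixed global scale.

    beta_j ~ Normal(0, (lambda_j * tau)^2), lambda_j ~ half-Cauchy(0, 1).
    Holding tau fixed is a simplification made for this sketch; in a full
    model tau would typically receive its own prior.
    """
    draws = []
    for _ in range(p):
        # Inverse-CDF draw from a standard half-Cauchy distribution
        lam = math.tan(math.pi * rng.random() / 2.0)
        draws.append(rng.gauss(0.0, lam * tau))
    return draws


# Usage with made-up numbers: 3 taxa, 2 covariates
x = [1.0, 0.5]
beta = [[0.2, 0.0], [-0.1, 1.3], [0.0, 0.0]]
y = [5, 2, 3]
print(dm_log_pmf(y, dm_concentrations(x, beta)))
print(horseshoe_prior_draw(5, tau=0.1, rng=random.Random(1)))
```

In a Bayesian fit, the log-likelihood and the log-density of the prior would be summed to form the log-posterior that a sampler such as Hamiltonian Monte Carlo explores. The heavy-tailed local scales `lambda_j` let a few coefficients stay large while most are shrunk toward zero.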

Keywords

Bayesian, Compositional data, Generalized Dirichlet, Dirichlet, Large p, Shrinkage prior, Sparse probability vectors, Stick-breaking, Horseshoe
