DeepMicroGen: a generative adversarial network-based method for longitudinal microbiome data imputation

dc.contributor.author: Choi, Joung Min (en)
dc.contributor.author: Ji, Ming (en)
dc.contributor.author: Watson, Layne T. (en)
dc.contributor.author: Zhang, Liqing (en)
dc.date.accessioned: 2023-06-30T13:40:17Z (en)
dc.date.available: 2023-06-30T13:40:17Z (en)
dc.date.issued: 2023-05 (en)
dc.description.abstract: Motivation: The human microbiome, which is linked to various diseases by growing evidence, has a profound impact on human health. Since changes in the composition of the microbiome across time are associated with disease and clinical outcomes, microbiome analysis should be performed in a longitudinal study. However, due to limited sample sizes and differing numbers of timepoints for different subjects, a significant amount of data cannot be utilized, directly affecting the quality of analysis results. Deep generative models have been proposed to address this lack of data issue. Specifically, a generative adversarial network (GAN) has been successfully utilized for data augmentation to improve prediction tasks. Recent studies have also shown improved performance of GAN-based models for missing value imputation in a multivariate time series dataset compared with traditional imputation methods. Results: This work proposes DeepMicroGen, a bidirectional recurrent neural network-based GAN model, trained on the temporal relationship between the observations, to impute the missing microbiome samples in longitudinal studies. DeepMicroGen outperforms standard baseline imputation methods, showing the lowest mean absolute error for both simulated and real datasets. Finally, the proposed model improved the predicted clinical outcome for allergies, by providing imputation for an incomplete longitudinal dataset used to train the classifier. Availability and implementation: DeepMicroGen is publicly available at . (en)
dc.description.notes: This work was supported in part by the U.S. National Science Foundation Awards 2004751 and a Pilot Program to Enhance NIH Funding within the COE at Virginia Tech (en)
dc.description.sponsorship: U.S. National Science Foundation [2004751]; COE at Virginia Tech (en)
dc.description.version: Published version (en)
dc.format.mimetype: application/pdf (en)
dc.identifier.doi: https://doi.org/10.1093/bioinformatics/btad286 (en)
dc.identifier.eissn: 1367-4811 (en)
dc.identifier.issn: 1367-4803 (en)
dc.identifier.issue: 5 (en)
dc.identifier.other: btad286 (en)
dc.identifier.pmid: 37099704 (en)
dc.identifier.uri: http://hdl.handle.net/10919/115608 (en)
dc.identifier.volume: 39 (en)
dc.language.iso: en (en)
dc.publisher: Oxford University Press (en)
dc.rights: Creative Commons Attribution 4.0 International (en)
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ (en)
dc.subject: time-series (en)
dc.title: DeepMicroGen: a generative adversarial network-based method for longitudinal microbiome data imputation (en)
dc.title.serial: Bioinformatics (en)
dc.type: Article - Refereed (en)
dc.type.dcmitype: Text (en)

Files

Original bundle
Name: btad286.pdf
Size: 1.81 MB
Format: Adobe Portable Document Format
Description: Published version