DeepMicro: deep representation learning for disease prediction based on microbiome data

dc.contributor.author: Oh, Min [en]
dc.contributor.author: Zhang, Liqing [en]
dc.contributor.department: Computer Science [en]
dc.date.accessioned: 2020-08-06T17:33:06Z [en]
dc.date.available: 2020-08-06T17:33:06Z [en]
dc.date.issued: 2020 [en]
dc.description.abstract: The human microbiota plays a key role in human health, and growing evidence supports the potential use of the microbiome as a predictor of various diseases. However, the high dimensionality of microbiome data, often on the order of hundreds of thousands, combined with low sample sizes, poses a great challenge for machine learning-based prediction algorithms. This imbalance makes the data highly sparse, which hinders learning a better prediction model. Also, there has been little work on deep learning applications to microbiome data with a rigorous evaluation scheme. To address these challenges, we propose DeepMicro, a deep representation learning framework allowing for an effective representation of microbiome profiles. DeepMicro successfully transforms high-dimensional microbiome data into a robust low-dimensional representation using various autoencoders and applies machine learning classification algorithms to the learned representation. In disease prediction, DeepMicro outperforms the current best approaches based on the strain-level marker profile in five different datasets. In addition, by significantly reducing the dimensionality of the marker profile, DeepMicro accelerates the model training and hyperparameter optimization procedure with an 8X–30X speedup over the basic approach. DeepMicro is freely available at https://github.com/minoh0201/DeepMicro. [en]
dc.description.sponsorship: This work is partially supported by funding from the Data and Decisions Destination Area at Virginia Tech. Also, this publication is supported by Virginia Tech’s Open Access Subvention Fund. [en]
dc.identifier.doi: https://doi.org/10.1038/s41598-020-63159-5 [en]
dc.identifier.uri: http://hdl.handle.net/10919/99570 [en]
dc.identifier.volume: 10 [en]
dc.language.iso: en_US [en]
dc.publisher: Nature Research [en]
dc.rights: Creative Commons Attribution 4.0 International [en]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ [en]
dc.title: DeepMicro: deep representation learning for disease prediction based on microbiome data [en]
dc.title.serial: Scientific Reports [en]
dc.type: Article - Refereed [en]

Files

Original bundle
Name: s41598-020-63159-5.pdf
Size: 1.82 MB
Format: Adobe Portable Document Format