Data-driven detection of subtype-specific differentially expressed genes

dc.contributor.author: Chen, Lulu (en)
dc.contributor.author: Lu, Yingzhou (en)
dc.contributor.author: Wu, Chiung-Ting (en)
dc.contributor.author: Clarke, Robert (en)
dc.contributor.author: Yu, Guoqiang (en)
dc.contributor.author: Van Eyk, Jennifer E. (en)
dc.contributor.author: Herrington, David M. (en)
dc.contributor.author: Wang, Yue (en)
dc.contributor.department: Electrical and Computer Engineering (en)
dc.date.accessioned: 2021-06-04T11:46:12Z (en)
dc.date.available: 2021-06-04T11:46:12Z (en)
dc.date.issued: 2021-01-11 (en)
dc.description.abstract: Among multiple subtypes of tissue or cell, subtype-specific differentially-expressed genes (SDEGs) are defined as being most-upregulated in only one subtype but not in any other. Detecting SDEGs plays a critical role in the molecular characterization and deconvolution of multicellular complex tissues. Classic differential analysis assumes a null hypothesis whose test statistic is not subtype-specific, thus can produce a high false positive rate and/or lower detection power. Here we first introduce a One-Versus-Everyone Fold Change (OVE-FC) test for detecting SDEGs. We then propose a scaled test statistic (OVE-sFC) for assessing the statistical significance of SDEGs that applies a mixture null distribution model and a tailored permutation test. The OVE-FC/sFC test was validated on both type 1 error rate and detection power using extensive simulation data sets generated from real gene expression profiles of purified subtype samples. The OVE-FC/sFC test was then applied to two benchmark gene expression data sets of purified subtype samples and detected many known or previously unknown SDEGs. Subsequent supervised deconvolution results on synthesized bulk expression data, obtained using the SDEGs detected from the independent purified expression data by the OVE-FC/sFC test, showed superior performance in deconvolution accuracy when compared with popular peer methods. (en)
dc.description.notes: This work was funded in part by the National Institutes of Health under Grants HL111362-05A1, HL133932, W81XWH-18-1-0723 (BC171885P1), and U01NS115658-01. (en)
dc.description.sponsorship: National Institutes of Health; United States Department of Health & Human Services; National Institutes of Health (NIH) - USA [HL111362-05A1, HL133932, W81XWH-18-1-0723, BC171885P1, U01NS115658-01] (en)
dc.description.version: Published version (en)
dc.format.mimetype: application/pdf (en)
dc.identifier.doi: https://doi.org/10.1038/s41598-020-79704-1 (en)
dc.identifier.issn: 2045-2322 (en)
dc.identifier.issue: 1 (en)
dc.identifier.other: 332 (en)
dc.identifier.pmid: 33432005 (en)
dc.identifier.uri: http://hdl.handle.net/10919/103604 (en)
dc.identifier.volume: 11 (en)
dc.language.iso: en (en)
dc.rights: Creative Commons Attribution 4.0 International (en)
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ (en)
dc.title: Data-driven detection of subtype-specific differentially expressed genes (en)
dc.title.serial: Scientific Reports (en)
dc.type: Article - Refereed (en)
dc.type.dcmitype: Text (en)
dc.type.dcmitype: StillImage (en)
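
The abstract defines an SDEG as a gene that is most upregulated in exactly one subtype. As a rough illustration only, the sketch below scores each gene by the gap between its highest and second-highest subtype mean. It assumes log2-scale expression values and per-subtype arithmetic means; these choices, the function name `ove_fold_change`, and the toy data are hypothetical and are not taken from the published method. The scaled statistic (OVE-sFC), the mixture null model, and the permutation test mentioned in the abstract are not covered here.

```python
from statistics import mean


def ove_fold_change(log_expr, labels):
    """Hypothetical one-versus-everyone log fold change per gene.

    log_expr: list of genes, each a list of log2 expression values (one per sample).
    labels:   subtype label for each sample, in the same order as the values.
    Returns a list of (top_subtype, log2_fold_change) tuples, one per gene.
    """
    subtypes = sorted(set(labels))
    if len(subtypes) < 2:
        raise ValueError("need at least two subtypes")

    results = []
    for gene_values in log_expr:
        # Mean log2 expression of this gene within each subtype.
        subtype_means = {
            s: mean(v for v, lab in zip(gene_values, labels) if lab == s)
            for s in subtypes
        }
        # Rank subtypes from highest to lowest mean expression.
        ranked = sorted(subtypes, key=subtype_means.get, reverse=True)
        top, runner_up = ranked[0], ranked[1]
        # Gap to the runner-up: the top subtype exceeds every other
        # subtype by at least this much on the log2 scale.
        results.append((top, subtype_means[top] - subtype_means[runner_up]))
    return results


# Toy data (made up): three genes, six samples, three subtypes.
labels = ["A", "A", "B", "B", "C", "C"]
log_expr = [
    [8.0, 8.0, 2.0, 2.0, 1.0, 1.0],  # clearly highest in subtype A
    [3.0, 3.0, 3.0, 3.0, 3.0, 3.0],  # flat across subtypes
    [1.0, 1.0, 5.0, 5.0, 4.0, 4.0],  # highest in B, but close to C
]
example = ove_fold_change(log_expr, labels)
```

In this toy input, the first gene gets a large gap, the flat gene gets a gap of zero, and the third gene gets a small gap because a second subtype is nearly as high. Comparing against the runner-up rather than the pooled remainder keeps a gene that is high in two subtypes from being scored as specific to one.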

Files

Original bundle
Name: s41598-020-79704-1.pdf
Size: 4.17 MB
Format: Adobe Portable Document Format
Description: Published version