The balance of reproducibility, sensitivity, and specificity of lists of differentially expressed genes in microarray studies

Shi, Leming; Jones, Wendell D.; Jensen, Roderick V.; Harris, Stephen C.; Perkins, Roger G.; Goodsaid, Federico M.; Guo, Lei; Croner, Lisa J.; Boysen, Cecilie; Fang, Hong; Qian, Feng; Amur, Shashi; Bao, Wenjun; Barbacioru, Catalin C.; Bertholet, Vincent; Cao, Xiaoxi M.; Chu, Tzu-Ming; Collins, Patrick J.; Fan, Xiao-hui; Frueh, Felix W.; Fuscoe, James C.; Guo, Xu; Han, Jing; Herman, Damir; Hong, Huixiao; Kawasaki, Ernest S.; Li, Quan-Zhen; Luo, Yuling; Ma, Yunqing; Mei, Nan; Peterson, Ron L.; Puri, Raj K.; Shippy, Richard; Su, Zhenqiang; Sun, Yongming A.; Sun, Hongmei; Thorn, Brett; Turpaz, Yaron; Wang, Charles; Wang, Sue J.; Warrington, Janet A.; Willey, James C.; Wu, Jie; Xie, Qian; Zhang, Liang; Zhang, Lu; Zhong, Sheng; Wolfinger, Russell D.; Tong, Weida

dc.contributor.author	Shi, Leming	en
dc.contributor.author	Jones, Wendell D.	en
dc.contributor.author	Jensen, Roderick V.	en
dc.contributor.author	Harris, Stephen C.	en
dc.contributor.author	Perkins, Roger G.	en
dc.contributor.author	Goodsaid, Federico M.	en
dc.contributor.author	Guo, Lei	en
dc.contributor.author	Croner, Lisa J.	en
dc.contributor.author	Boysen, Cecilie	en
dc.contributor.author	Fang, Hong	en
dc.contributor.author	Qian, Feng	en
dc.contributor.author	Amur, Shashi	en
dc.contributor.author	Bao, Wenjun	en
dc.contributor.author	Barbacioru, Catalin C.	en
dc.contributor.author	Bertholet, Vincent	en
dc.contributor.author	Cao, Xiaoxi M.	en
dc.contributor.author	Chu, Tzu-Ming	en
dc.contributor.author	Collins, Patrick J.	en
dc.contributor.author	Fan, Xiao-hui	en
dc.contributor.author	Frueh, Felix W.	en
dc.contributor.author	Fuscoe, James C.	en
dc.contributor.author	Guo, Xu	en
dc.contributor.author	Han, Jing	en
dc.contributor.author	Herman, Damir	en
dc.contributor.author	Hong, Huixiao	en
dc.contributor.author	Kawasaki, Ernest S.	en
dc.contributor.author	Li, Quan-Zhen	en
dc.contributor.author	Luo, Yuling	en
dc.contributor.author	Ma, Yunqing	en
dc.contributor.author	Mei, Nan	en
dc.contributor.author	Peterson, Ron L.	en
dc.contributor.author	Puri, Raj K.	en
dc.contributor.author	Shippy, Richard	en
dc.contributor.author	Su, Zhenqiang	en
dc.contributor.author	Sun, Yongming A.	en
dc.contributor.author	Sun, Hongmei	en
dc.contributor.author	Thorn, Brett	en
dc.contributor.author	Turpaz, Yaron	en
dc.contributor.author	Wang, Charles	en
dc.contributor.author	Wang, Sue J.	en
dc.contributor.author	Warrington, Janet A.	en
dc.contributor.author	Willey, James C.	en
dc.contributor.author	Wu, Jie	en
dc.contributor.author	Xie, Qian	en
dc.contributor.author	Zhang, Liang	en
dc.contributor.author	Zhang, Lu	en
dc.contributor.author	Zhong, Sheng	en
dc.contributor.author	Wolfinger, Russell D.	en
dc.contributor.author	Tong, Weida	en
dc.date.accessioned	2012-08-24T11:54:47Z	en
dc.date.available	2012-08-24T11:54:47Z	en
dc.date.issued	2008-08-12	en
dc.date.updated	2012-08-24T11:54:47Z	en
dc.description.abstract	Background: Reproducibility is a fundamental requirement in scientific experiments. Some recent publications have claimed that microarrays are unreliable because lists of differentially expressed genes (DEGs) are not reproducible in similar experiments. Meanwhile, new statistical methods for identifying DEGs continue to appear in the scientific literature. The resultant variety of existing and emerging methods exacerbates confusion and continuing debate in the microarray community on the appropriate choice of methods for identifying reliable DEG lists. Results: Using the data sets generated by the MicroArray Quality Control (MAQC) project, we investigated the impact on the reproducibility of DEG lists of a few widely used gene selection procedures. We present comprehensive results from inter-site comparisons using the same microarray platform, cross-platform comparisons using multiple microarray platforms, and comparisons between microarray results and those from TaqMan, the widely regarded "standard" gene expression platform. Our results demonstrate that (1) previously reported discordance between DEG lists could simply result from ranking and selecting DEGs solely by statistical significance (P) derived from widely used simple t-tests; (2) when fold change (FC) is used as the ranking criterion with a non-stringent P-value cutoff filtering, the DEG lists become much more reproducible, especially when fewer genes are selected as differentially expressed, as is the case in most microarray studies; and (3) the instability of short DEG lists solely based on P-value ranking is an expected mathematical consequence of the high variability of the t-values; the more stringent the P-value threshold, the less reproducible the DEG list is. These observations are also consistent with results from extensive simulation calculations.
Conclusion: We recommend the use of FC-ranking plus a non-stringent P cutoff as a straightforward and baseline practice in order to generate more reproducible DEG lists. Specifically, the P-value cutoff should not be stringent (too small) and FC should be as large as possible. Our results provide practical guidance to choose the appropriate FC and P-value cutoffs when selecting a given number of DEGs. The FC criterion enhances reproducibility, whereas the P criterion balances sensitivity and specificity.	en
dc.description.version	Published version	en
dc.format.mimetype	application/pdf	en
dc.identifier.citation	BMC Bioinformatics. 2008 Aug 12;9(Suppl 9):S10	en
dc.identifier.doi	https://doi.org/10.1186/1471-2105-9-S9-S10	en
dc.identifier.uri	http://hdl.handle.net/10919/18884	en
dc.language.iso	en	en
dc.rights	Creative Commons Attribution 4.0 International	en
dc.rights.holder	Leming Shi et al.; licensee BioMed Central Ltd.	en
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/	en
dc.title	The balance of reproducibility, sensitivity, and specificity of lists of differentially expressed genes in microarray studies	en
dc.title.serial	BMC Bioinformatics	en
dc.type	Article - Refereed	en
dc.type.dcmitype	Text	en
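
The selection rule recommended in the abstract (rank by fold change after a non-stringent P-value filter) can be illustrated with a minimal Python sketch. It is not the authors' code: the `GeneStat` record, the function names, the default cutoff of 0.05, and the five example genes are all hypothetical and chosen only for illustration. The sketch assumes that a log2 fold change and a P-value have already been computed for each gene.

```python
from typing import List, NamedTuple


class GeneStat(NamedTuple):
    """Per-gene summary statistics (hypothetical record layout)."""
    gene: str
    log2_fc: float  # log2 fold change between the two sample groups
    p_value: float  # P-value, e.g. from a simple t-test


def select_degs(stats: List[GeneStat], p_cutoff: float = 0.05,
                n_top: int = 100) -> List[str]:
    """FC-ranking with a non-stringent P filter.

    The default p_cutoff is an assumed, illustrative value, not a
    threshold taken from the paper.
    """
    # Step 1: keep only genes that pass the (non-stringent) P cutoff.
    passed = [s for s in stats if s.p_value < p_cutoff]
    # Step 2: rank the survivors by absolute fold change, largest first.
    ranked = sorted(passed, key=lambda s: abs(s.log2_fc), reverse=True)
    # Step 3: report the requested number of genes.
    return [s.gene for s in ranked[:n_top]]


def rank_by_p(stats: List[GeneStat], n_top: int = 100) -> List[str]:
    """P-only ranking, shown for contrast with select_degs."""
    ranked = sorted(stats, key=lambda s: s.p_value)
    return [s.gene for s in ranked[:n_top]]


# Made-up values, for illustration only.
example = [
    GeneStat("geneA", 3.0, 0.020),
    GeneStat("geneB", -2.5, 0.001),
    GeneStat("geneC", 0.2, 0.0001),
    GeneStat("geneD", 4.0, 0.300),
    GeneStat("geneE", 1.0, 0.040),
]

print(select_degs(example, p_cutoff=0.05, n_top=2))  # ['geneA', 'geneB']
print(rank_by_p(example, n_top=2))                   # ['geneC', 'geneB']
```

In this toy input, P-only ranking puts `geneC` first even though its fold change is small, while the FC-ranked list keeps the two large-effect genes that pass the filter. `geneD` has the largest fold change but is excluded because its P-value exceeds the cutoff.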

Files

Original bundle

Name: 1471-2105-9-S9-S10.pdf
Size: 1.32 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.5 KB
Format: Item-specific license agreed upon to submission

Collections

Journal Articles, BioMed Central and SpringerOpen
Scholarly Works, Biological Sciences