Biologically-Interpretable Disease Classification Based on Gene Expression Data

Date

2005-05-13

Publisher

Virginia Tech

Abstract

Classification of tissues and diseases based on gene expression data is a powerful application of DNA microarrays. Many popular classifiers, such as support vector machines, nearest-neighbour methods, and boosting, have been applied successfully to this problem. However, it is difficult to determine from these classifiers which genes are responsible for the distinctions between the diseases. We propose a novel framework for classifying gene expression data based on the notion of condition-specific clusters of co-expressed genes called xMotifs. Our xMotif-based classifier is biologically interpretable: we show how to detect relationships between xMotifs and gene functional annotations. Our classifier achieves high accuracy under leave-one-out cross-validation on both two-class and multi-class data. Our technique has the potential to be the method of choice for researchers interested in disease and tissue classification.
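
As a rough illustration of the leave-one-out cross-validation protocol mentioned in the abstract, the sketch below holds out each sample in turn, trains on the remainder, and reports the fraction of held-out samples labelled correctly. It is not the xMotif-based classifier: a simple nearest-neighbour rule stands in for it, and the function names, the toy expression matrix, and the class labels "A" and "B" are all hypothetical, invented only for this example.

```python
import math


def nearest_neighbour_predict(train_x, train_y, sample):
    """Return the label of the training sample closest to `sample` (Euclidean distance)."""
    best_label, best_dist = None, math.inf
    for x, y in zip(train_x, train_y):
        d = math.dist(x, sample)
        if d < best_dist:
            best_label, best_dist = y, d
    return best_label


def leave_one_out_accuracy(samples, labels, predict):
    """Hold out each sample in turn, train on the rest, and return the fraction predicted correctly."""
    correct = 0
    for i in range(len(samples)):
        # Training set is every sample except the i-th one.
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)


# Hypothetical toy data: rows are samples, columns are expression levels of three genes.
toy_expression = [
    [2.1, 0.3, 1.0],
    [1.9, 0.4, 1.2],
    [2.0, 0.2, 0.9],
    [0.2, 1.8, 1.1],
    [0.3, 2.1, 1.0],
    [0.1, 1.9, 0.8],
]
toy_labels = ["A", "A", "A", "B", "B", "B"]

accuracy = leave_one_out_accuracy(toy_expression, toy_labels, nearest_neighbour_predict)
print(f"LOOCV accuracy: {accuracy:.2f}")
```

Because `leave_one_out_accuracy` takes the prediction rule as an argument, any classifier with the same three-argument shape could be evaluated under the same protocol.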

Keywords

Classification, Biclustering, Gene Expression, Microarrays
