Learning Statistical and Geometric Models from Microarray Gene Expression Data

dc.contributor.author: Zhu, Yitan [en]
dc.contributor.committeechair: Wang, Yue J. [en]
dc.contributor.committeemember: Zaghloul, Amir I. [en]
dc.contributor.committeemember: Lu, Chang-Tien [en]
dc.contributor.committeemember: Wyatt, Christopher L. [en]
dc.contributor.committeemember: Xuan, Jianhua Jason [en]
dc.contributor.department: Electrical and Computer Engineering [en]
dc.date.accessioned: 2014-03-14T20:16:05Z [en]
dc.date.adate: 2009-10-01 [en]
dc.date.available: 2014-03-14T20:16:05Z [en]
dc.date.issued: 2009-09-02 [en]
dc.date.rdate: 2009-10-01 [en]
dc.date.sdate: 2009-09-09 [en]
dc.description.abstract: In this dissertation, we propose and develop innovative data modeling and analysis methods for extracting meaningful and specific information about disease mechanisms from microarray gene expression data. To provide a high-level overview of gene expression data for easy and insightful understanding of data structure, we propose a novel statistical data clustering and visualization algorithm that is comprehensively effective for multiple clustering tasks and that overcomes some major limitations of existing clustering methods. The proposed clustering and visualization algorithm performs progressive, divisive hierarchical clustering and visualization, supported by hierarchical statistical modeling, supervised/unsupervised informative gene/feature selection, supervised/unsupervised data visualization, and user/prior knowledge guidance through human-data interactions, to discover cluster structure within complex, high-dimensional gene expression data. To select suitable clustering algorithm(s) for gene expression data analysis, we design an objective and reliable clustering evaluation scheme that assesses the performance of clustering algorithms by comparing their sample clustering outcomes to phenotype categories. Using the proposed evaluation scheme, we compared the performance of our newly developed clustering algorithm with that of several benchmark clustering methods, and demonstrated the superior and stable performance of the proposed clustering algorithm. To identify the underlying active biological processes that jointly form the observed biological event, we propose a latent linear mixture model that quantitatively describes how the observed gene expressions are generated by a process of mixing the latent active biological processes. We prove a series of theorems to show the identifiability of the noise-free model.
Based on relevant geometric concepts, convex analysis and optimization, gene clustering, and model stability analysis, we develop a robust blind source separation method that fits the model to the gene expression data and subsequently identifies the underlying biological processes and their activity levels under different biological conditions. Based on the experimental results obtained on cancer, muscle regeneration, and muscular dystrophy gene expression data, we believe that the research work presented in this dissertation not only contributes to the engineering research areas of machine learning and pattern recognition, but also provides novel and effective solutions that can potentially address many biomedical research problems and improve the understanding of disease mechanisms. [en]
dc.description.degree: Ph. D. [en]
dc.identifier.other: etd-09092009-214731 [en]
dc.identifier.sourceurl: http://scholar.lib.vt.edu/theses/available/etd-09092009-214731/ [en]
dc.identifier.uri: http://hdl.handle.net/10919/28924 [en]
dc.publisher: Virginia Tech [en]
dc.relation.haspart: Zhu_Y_D_2009.pdf [en]
dc.rights: In Copyright [en]
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ [en]
dc.subject: Blind Source Separation [en]
dc.subject: Convex Analysis and Optimization [en]
dc.subject: Gene Expressions [en]
dc.subject: Clustering Evaluation [en]
dc.subject: Data Clustering and Visualization [en]
dc.title: Learning Statistical and Geometric Models from Microarray Gene Expression Data [en]
dc.type: Dissertation [en]
thesis.degree.discipline: Electrical and Computer Engineering [en]
thesis.degree.grantor: Virginia Polytechnic Institute and State University [en]
thesis.degree.level: doctoral [en]
thesis.degree.name: Ph. D. [en]

Files

Original bundle
Name: Zhu_Y_D_2009.pdf
Size: 2.79 MB
Format: Adobe Portable Document Format