New Algorithms for Mining Network Datasets: Applications to Phenotype and Pathway Modeling

Jin, Ying

dc.contributor.author	Jin, Ying	en
dc.contributor.committeechair	Ramakrishnan, Naren	en
dc.contributor.committeemember	Fox, Edward A.	en
dc.contributor.committeemember	Heath, Lenwood S.	en
dc.contributor.committeemember	Murali, T. M.	en
dc.contributor.committeemember	Helm, Richard F.	en
dc.contributor.department	Computer Science	en
dc.date.accessioned	2014-03-14T21:23:44Z	en
dc.date.adate	2010-01-22	en
dc.date.available	2014-03-14T21:23:44Z	en
dc.date.issued	2009-12-08	en
dc.date.rdate	2010-01-22	en
dc.date.sdate	2009-12-30	en
dc.description.abstract	Biological network data is plentiful, with practically every experimental methodology giving 'network views' into cellular function and behavior. Bioinformatic screens that yield network data include, for example, genome-wide deletion screens, protein-protein interaction assays, RNA interference experiments, and methods to probe metabolic pathways. Efficient and comprehensive computational approaches are required to model these screens and gain insight into the nature of biological networks. This thesis presents three new algorithms to model and mine network datasets. First, we present an algorithm that models genome-wide perturbation screens by deriving relations between phenotypes and subsequently using these relations in a local manner to derive gene-phenotype relationships. We show how this algorithm outperforms all previously described algorithms for gene-phenotype modeling. We also present theoretical insight into the convergence and accuracy properties of this approach. Second, we define a new data mining problem (constrained minimal separator mining) and propose algorithms as well as applications to modeling gene perturbation screens by viewing the perturbed genes as a graph separator. Both of these data mining applications are evaluated on network datasets from S. cerevisiae and C. elegans. Finally, we present an approach to model the relationship between metabolic pathways and operon structure in prokaryotic genomes. In this approach, we present a new pattern class (biclusters over domains with supplied partial orders) and present algorithms for systematically detecting such biclusters. Together, our data mining algorithms provide a comprehensive arsenal of techniques for modeling gene perturbation screens and metabolic pathways.	en
dc.description.degree	Ph. D.	en
dc.identifier.other	etd-12302009-142944	en
dc.identifier.sourceurl	http://scholar.lib.vt.edu/theses/available/etd-12302009-142944/	en
dc.identifier.uri	http://hdl.handle.net/10919/40493	en
dc.publisher	Virginia Tech	en
dc.relation.haspart	Jin_Ying_D_2009.pdf	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.subject	partial orders	en
dc.subject	biclusters	en
dc.subject	graph separators	en
dc.subject	relative importance methods	en
dc.subject	Biological networks	en
dc.title	New Algorithms for Mining Network Datasets: Applications to Phenotype and Pathway Modeling	en
dc.type	Dissertation	en
thesis.degree.discipline	Computer Science	en
thesis.degree.grantor	Virginia Polytechnic Institute and State University	en
thesis.degree.level	doctoral	en
thesis.degree.name	Ph. D.	en

Files

Original bundle

Name: Jin_Ying_D_2009.pdf
Size: 1.52 MB
Format: Adobe Portable Document Format

Collections

Doctoral Dissertations