Modeling and Analysis of Regulatory Elements in Arabidopsis thaliana from Annotated Genomes and Gene Expression Data

dc.contributor.author | Pati, Amrita | en
dc.contributor.committeechair | Heath, Lenwood S. | en
dc.contributor.committeemember | Grene, Ruth | en
dc.contributor.committeemember | Murali, T. M. | en
dc.contributor.department | Computer Science | en
dc.date.accessioned | 2014-03-14T21:42:06Z | en
dc.date.adate | 2005-08-15 | en
dc.date.available | 2014-03-14T21:42:06Z | en
dc.date.issued | 2005-07-13 | en
dc.date.rdate | 2005-08-15 | en
dc.date.sdate | 2005-08-02 | en
dc.description.abstract | Modeling of cis-elements in the upstream regions of genes is a challenging computational problem. A set of regulatory motifs present in the promoters of a set of genes can be modeled by a biclique. Combinations of cis-elements play a vital role in ensuring that the correct co-action of transcription factors binding to the gene promoter results in appropriate gene expression in response to various stimuli. Geometrical and spatial constraints in transcription factor binding also impose restrictions on the order and separation of cis-elements. Not all regulatory elements that coexist are biologically significant. If the genes in which a set of regulatory elements co-occurs are tightly correlated with respect to gene expression data over a set of treatments, the regulatory element combination can be biologically directed. The system developed in this work, XcisClique, consists of a comprehensive infrastructure for annotated genome and gene expression data for Arabidopsis thaliana. XcisClique models cis-regulatory elements as regular expressions and detects maximal bicliques of genes and motifs, called itemsets. An itemset consists of a set of genes (called a geneset) and a set of motifs (called a motifset) such that every motif in the motifset occurs in the promoter of every gene in the geneset. XcisClique differs from existing tools of the same kind in that it offers a common platform for the integration of sequence and gene expression data. Itemsets identified by XcisClique are not only evaluated for statistical over-representation in sequence data, but are also examined with respect to the expression patterns of the corresponding geneset. Thus, the results produced are biologically directed. XcisClique is also the only tool of its kind for Arabidopsis thaliana, and can be used for other organisms given appropriate sequence, expression, and regulatory element data. A web interface to a subset of the functionality, the source code, and supplemental material are available online at http://bioinformatics.cs.vt.edu/xcisclique. | en
dc.description.degree | Master of Science | en
dc.identifier.other | etd-08022005-120858 | en
dc.identifier.sourceurl | http://scholar.lib.vt.edu/theses/available/etd-08022005-120858/ | en
dc.identifier.uri | http://hdl.handle.net/10919/44132 | en
dc.publisher | Virginia Tech | en
dc.relation.haspart | Thesis.pdf | en
dc.relation.haspart | XCISCLIQUE.tar.gz | en
dc.rights | In Copyright | en
dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | en
dc.subject | Motifs | en
dc.subject | Regulatory elements | en
dc.subject | Gene Expression | en
dc.subject | Motif Combinations | en
dc.subject | Itemsets | en
dc.subject | Genes | en
dc.title | Modeling and Analysis of Regulatory Elements in Arabidopsis thaliana from Annotated Genomes and Gene Expression Data | en
dc.type | Thesis | en
thesis.degree.discipline | Computer Science | en
thesis.degree.grantor | Virginia Polytechnic Institute and State University | en
thesis.degree.level | masters | en
thesis.degree.name | Master of Science | en
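
The itemset definition in the abstract (every motif in the motifset occurs in the promoter of every gene in the geneset) can be illustrated with a small sketch. This is not XcisClique's implementation: the promoter sequences, motif names, regular expressions, and function names below are all invented for illustration, and the enumeration is a brute-force search that is only practical for a handful of motifs. It covers only the sequence side; the expression-based evaluation described in the abstract is not shown.

```python
import re
from itertools import combinations

# Hypothetical toy data, invented for illustration (not from the thesis).
PROMOTERS = {
    "geneA": "TTACGTGGCAAAGCCGACAT",
    "geneB": "GGACGTGTCTTTTT",
    "geneC": "AAAAGTTTACCGACAA",
}
# Motifs modeled as regular expressions; names and patterns are assumptions.
MOTIFS = {
    "M1": r"ACGTG[GT]C",
    "M2": r"[AG]CCGAC",
}


def motif_hits(promoters, motifs):
    """Map each gene to the set of motifs whose pattern occurs in its promoter."""
    compiled = {name: re.compile(pattern) for name, pattern in motifs.items()}
    return {
        gene: frozenset(name for name, rx in compiled.items() if rx.search(seq))
        for gene, seq in promoters.items()
    }


def maximal_itemsets(hits):
    """Enumerate maximal (geneset, motifset) bicliques by brute force.

    For every non-empty seed set of motifs, collect all genes that contain
    the seed, then expand the seed to every motif shared by those genes.
    The resulting pair cannot be extended by another gene or motif.
    """
    all_motifs = sorted(set().union(*hits.values()))
    found = set()
    for size in range(1, len(all_motifs) + 1):
        for combo in combinations(all_motifs, size):
            seed = frozenset(combo)
            geneset = frozenset(g for g, ms in hits.items() if seed <= ms)
            if not geneset:
                continue
            motifset = frozenset.intersection(*(hits[g] for g in geneset))
            found.add((geneset, motifset))
    return found


if __name__ == "__main__":
    hits = motif_hits(PROMOTERS, MOTIFS)
    for geneset, motifset in sorted(maximal_itemsets(hits), key=lambda p: sorted(p[0])):
        print(sorted(geneset), sorted(motifset))
```

With this toy data, geneA carries both motifs, geneB only M1, and geneC only M2, so the sketch reports three itemsets: one per single motif with the two genes that share it, and one pairing geneA with both motifs.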

Files

Original bundle
Name: Thesis.pdf
Size: 3.31 MB
Format: Adobe Portable Document Format

Name: XCISCLIQUE.tar.gz
Size: 156.7 MB
Format: Unknown data format
