Show simple item record

dc.contributor.authorBadr, Eman Mostafaen_US
dc.date.accessioned2016-11-03T06:00:15Z
dc.date.available2016-11-03T06:00:15Z
dc.date.issued2015-05-12en_US
dc.identifier.othervt_gsexam:5908en_US
dc.identifier.urihttp://hdl.handle.net/10919/73366
dc.description.abstractSplicing regulatory elements (SREs) are short, degenerate sequences on pre-mRNA molecules that enhance or inhibit the splicing process via the binding of splicing factors, proteins that regulate the functioning of the spliceosome. Existing methods for identifying SREs in a genome are either experimental or computational. This work tackles the limitations in the current approaches for identifying SREs. It addresses two major computational problems, identifying variable length SREs utilizing a graph-based model with de Bruijn graphs and discovering co-occurring sets of SREs (combinatorial SREs) utilizing graph mining techniques. In addition, I studied and analyzed the effect of alternative splicing on tissue specificity in human. First, I have used a formalism based on de Bruijn graphs that combines genomic structure, word count enrichment analysis, and experimental evidence to identify SREs found in exons. In my approach, SREs are not restricted to a fixed length (i.e., k-mers, for a fixed k). Consequently, the predicted SREs are of different lengths. I identified 2001 putative exonic enhancers and 3080 putative exonic silencers for human genes, with lengths varying from 6 to 15 nucleotides. Many of the predicted SREs overlap with experimentally verified binding sites. My model provides a novel method to predict variable length putative regulatory elements computationally for further experimental investigation. Second, I developed CoSREM (Combinatorial SRE Miner), a graph mining algorithm for discovering combinatorial SREs. The goal is to identify sets of exonic splicing regulatory elements whether they are enhancers or silencers. Experimental evidence is incorporated through my graph-based model to increase the accuracy of the results. The identified SREs do not have a predefined length, and the algorithm is not limited to identifying only SRE pairs as are current approaches. I identified 37 SRE sets that include both enhancer and silencer elements in human genes. These results intersect with previous results, including some that are experimental. I also show that the SRE set GGGAGG and GAGGAC identified by CoSREM may play a role in exon skipping events in several tumor samples. Further, I report a genome-wide analysis to study alternative splicing on multiple human tissues, including brain, heart, liver, and muscle. I developed a pipeline to identify tissue-specific exons and hence tissue-specific SREs. Utilizing the publicly available RNA-Seq data set from the Human BodyMap project, I identified 28,100 tissue-specific exons across the four tissues. I identified 1929 exonic splicing enhancers with 99% overlap with previously published experimental and computational databases. A complicated enhancer regulatory network was revealed, where multiple enhancers were found across multiple tissues while some were found only in specific tissues. Putative combinatorial exonic enhancers and silencers were discovered as well, which may be responsible for exon inclusion or exclusion across tissues. Some of the enhancers are found to be co-occurring with multiple silencers and vice versa, which demonstrates a complicated relationship between tissue-specific enhancers and silencers.en_US
dc.format.mediumETDen_US
dc.publisherVirginia Techen_US
dc.rightsThis Item is protected by copyright and/or related rights. Some uses of this Item may be deemed fair and permitted by law even without permission from the rights holder(s), or the rights holder(s) may have licensed the work for use under certain conditions. For other uses you need to obtain permission from the rights holder(s).en_US
dc.subjectAlternative splicingen_US
dc.subjectde Bruijn graphsen_US
dc.subjectalgorithmsen_US
dc.subjectgraph miningen_US
dc.subjectsplicing regulatory elementsen_US
dc.titleIdentifying Splicing Regulatory Elements with de Bruijn Graphsen_US
dc.typeDissertationen_US
dc.contributor.departmentComputer Scienceen_US
dc.description.degreePh. D.en_US
thesis.degree.namePh. D.en_US
thesis.degree.leveldoctoralen_US
thesis.degree.grantorVirginia Polytechnic Institute and State Universityen_US
thesis.degree.disciplineComputer Science and Applicationsen_US
dc.contributor.committeechairHeath, Lenwood S.en_US
dc.contributor.committeememberGrene, Ruthen_US
dc.contributor.committeememberElHefnawi, Mahmoud M.en_US
dc.contributor.committeememberShaffer, Clifford A.en_US
dc.contributor.committeememberZhang, Liqingen_US


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record