
dc.contributor.author: Shi, Xu [en_US]
dc.date.accessioned: 2017-10-25T08:00:14Z
dc.date.available: 2017-10-25T08:00:14Z
dc.date.issued: 2017-10-24 [en_US]
dc.identifier.other: vt_gsexam:12985 [en_US]
dc.identifier.uri: http://hdl.handle.net/10919/79772
dc.description.abstract: The rapid development of biotechnology has enabled researchers to collect high-throughput data for studying biological processes at the genomic, transcriptomic, and proteomic levels. Because the data are noisy and diseases such as cancer are highly complex, it is challenging to extract biologically meaningful information that can help reveal the underlying molecular mechanisms. These challenges call for efficient and effective computational methods that analyze the data at different levels and so help characterize different aspects of biological systems. In this dissertation research, we have developed novel Bayesian approaches to infer alternative splicing mechanisms in biological systems using RNA sequencing data. Specifically, we focus on two research topics: isoform identification and phenotype-specific transcript assembly. For isoform identification, we develop a computational approach, SparseIso, to jointly model the existence and abundance of isoforms in a Bayesian framework. A spike-and-slab prior is incorporated into the model to enforce the sparsity of expressed isoforms. A Gibbs sampler is developed to sample the existence and abundance of isoforms iteratively. For transcript assembly, we develop a Bayesian approach, IntAPT, to assemble phenotype-specific transcripts from multiple RNA sequencing profiles. A two-layer Bayesian framework is used to model the existence of phenotype-specific transcripts and the transcript abundance in individual samples. Based on the hierarchical Bayesian model, a Gibbs sampling algorithm is developed to estimate the joint posterior distribution for phenotype-specific transcript assembly. The performance of the proposed methods is evaluated on simulated data, compared with existing methods, and benchmarked on real cell line data. We then apply our methods to breast cancer data to identify biologically meaningful splicing mechanisms associated with breast cancer. For future work, we will extend our methods to de novo transcript assembly to identify novel isoforms in biological systems, and we will incorporate isoform-specific networks into our methods to better understand splicing mechanisms in biological systems. [en_US]
dc.format.medium: ETD [en_US]
dc.publisher: Virginia Tech [en_US]
dc.rights: This item is protected by copyright and/or related rights. Some uses of this item may be deemed fair and permitted by law even without permission from the rights holder(s), or the rights holder(s) may have licensed the work for use under certain conditions. For other uses you need to obtain permission from the rights holder(s). [en_US]
dc.subject: Transcriptome Assembly [en_US]
dc.subject: RNA-seq Data Analysis [en_US]
dc.subject: Bayesian Inference [en_US]
dc.subject: Gibbs Sampling [en_US]
dc.subject: Markov Chain Monte Carlo (MCMC) [en_US]
dc.title: Bayesian Modeling for Isoform Identification and Phenotype-specific Transcript Assembly [en_US]
dc.type: Dissertation [en_US]
dc.contributor.department: Electrical Engineering [en_US]
dc.description.degree: Ph. D. [en_US]
thesis.degree.name: Ph. D. [en_US]
thesis.degree.level: doctoral [en_US]
thesis.degree.grantor: Virginia Polytechnic Institute and State University [en_US]
thesis.degree.discipline: Electrical Engineering [en_US]
dc.contributor.committeechair: Xuan, Jianhua Jason [en_US]
dc.contributor.committeemember: Lu, Chang Tien [en_US]
dc.contributor.committeemember: Baumann, William T. [en_US]
dc.contributor.committeemember: Wang, Yue J. [en_US]
dc.contributor.committeemember: Abbott, Amos L. [en_US]
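
Illustrative note on the abstract: the abstract mentions a spike-and-slab prior combined with a Gibbs sampler that alternates between "existence" and "abundance". The sketch below shows that general idea on a toy linear-Gaussian regression. It is not SparseIso or IntAPT: the model, the function names (make_toy_data, gibbs_spike_slab), and all parameter values (pi, tau2, sigma2) are assumptions chosen for illustration, and the noise variance is treated as known. It uses only the Python standard library.

```python
import math
import random


def make_toy_data(n=60, true_b=(3.0, 0.0, 0.0, -2.0, 0.0, 0.0), seed=1):
    """Hypothetical data: y = X b + N(0, 1) noise, with a sparse b."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in true_b] for _ in range(n)]
    y = [sum(x * b for x, b in zip(row, true_b)) + rng.gauss(0.0, 1.0)
         for row in X]
    return X, y


def gibbs_spike_slab(X, y, n_iter=600, burn_in=100,
                     pi=0.2, tau2=10.0, sigma2=1.0, seed=0):
    """Toy Gibbs sampler with a spike-and-slab prior on each coefficient.

    Assumed prior: b_j = 0 with probability 1 - pi (spike),
    b_j ~ N(0, tau2) with probability pi (slab).
    Returns (posterior inclusion probabilities, posterior means).
    """
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)  # y - X b; equals y while all b_j are zero
    xtx = [sum(row[j] ** 2 for row in X) for j in range(p)]
    incl = [0] * p
    b_sum = [0.0] * p
    for it in range(n_iter):
        for j in range(p):
            # Remove column j's current contribution from the residual.
            if b[j] != 0.0:
                for i in range(n):
                    resid[i] += X[i][j] * b[j]
            xtr = sum(X[i][j] * resid[i] for i in range(n))
            # Conditional slab posterior for b_j is N(m, v).
            v = 1.0 / (xtx[j] / sigma2 + 1.0 / tau2)
            m = v * xtr / sigma2
            # Log posterior odds of "present" vs "absent", b_j integrated out.
            log_odds = (math.log(pi / (1.0 - pi))
                        + 0.5 * math.log(v / tau2) + 0.5 * m * m / v)
            if log_odds >= 0:
                prob = 1.0 / (1.0 + math.exp(-log_odds))
            else:
                e = math.exp(log_odds)
                prob = e / (1.0 + e)
            if rng.random() < prob:
                # Existence step said "present": draw the abundance.
                b[j] = rng.gauss(m, math.sqrt(v))
                for i in range(n):
                    resid[i] -= X[i][j] * b[j]
            else:
                b[j] = 0.0
            if it >= burn_in:
                incl[j] += 1 if b[j] != 0.0 else 0
                b_sum[j] += b[j]
    kept = n_iter - burn_in
    return [c / kept for c in incl], [s / kept for s in b_sum]


X, y = make_toy_data()
incl_prob, mean_b = gibbs_spike_slab(X, y)
for j, (pr, mb) in enumerate(zip(incl_prob, mean_b)):
    print(f"coef {j}: inclusion prob {pr:.2f}, posterior mean {mb:.2f}")
```

In this toy setting the inclusion probability plays the role of "existence" and the sampled coefficient the role of "abundance". Real isoform abundances are non-negative and read counts are not Gaussian, so an actual isoform model would need a different likelihood and slab distribution.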

