Computational Modeling for Differential Analysis of RNA-seq and Methylation data

dc.contributor.author: Wang, Xiao [en]
dc.contributor.committeechair: Xuan, Jianhua Jason [en]
dc.contributor.committeemember: Wang, Yue J. [en]
dc.contributor.committeemember: Abbott, A. Lynn [en]
dc.contributor.committeemember: Lou, Wenjing [en]
dc.contributor.committeemember: Ha, Dong S. [en]
dc.contributor.department: Electrical and Computer Engineering [en]
dc.date.accessioned: 2016-08-17T08:00:48Z [en]
dc.date.available: 2016-08-17T08:00:48Z [en]
dc.date.issued: 2016-08-16 [en]
dc.description.abstract: Computational systems biology is an interdisciplinary field that aims to develop computational approaches for a system-level understanding of biological systems. Advances in high-throughput biotechnology offer broad scope and high resolution in multiple disciplines. However, extracting biologically meaningful information from the overwhelming amount of data generated from biological systems remains a major challenge, and effective computational approaches are urgently needed to reveal the functional components. In this dissertation, we therefore develop computational approaches for differential analysis of RNA-seq and methylation data to detect aberrant events associated with cancers. We develop a novel Bayesian approach, BayesIso, to identify differentially expressed isoforms from RNA-seq data. BayesIso features a joint model of the variability of RNA-seq data and the differential state of isoforms. It not only accounts for the variability of RNA-seq data but also incorporates the differential states of isoforms as hidden variables for differential analysis. The differential states of isoforms are estimated jointly with the other model parameters through a sampling process, which improves the detection of isoforms that are less differentially expressed. We also develop a novel probabilistic approach, DM-BLD, in a Bayesian framework to identify differentially methylated genes. DM-BLD features a hierarchical model, built upon Markov random field models, that captures both the local dependency of measured loci and the dependency of methylation change. A Gibbs sampling procedure is designed to estimate the posterior distribution of the methylation change of CpG sites. The differential methylation score of a gene is then calculated from the estimated methylation changes of its CpG sites, and the significance of genes is assessed by permutation-based statistical tests.
We have demonstrated the advantage of the proposed Bayesian approaches over conventional methods for differential analysis of RNA-seq data and methylation data. Jointly estimating the posterior distributions of the variables and model parameters with a sampling procedure improves the detection of isoforms and methylated genes whose differences are small. The applications to breast cancer data shed light on the molecular mechanisms underlying breast cancer recurrence, with the aim of identifying new molecular targets for breast cancer treatment. [en]
dc.description.degree: Ph. D. [en]
dc.format.medium: ETD [en]
dc.identifier.other: vt_gsexam:8757 [en]
dc.identifier.uri: http://hdl.handle.net/10919/72271 [en]
dc.publisher: Virginia Tech [en]
dc.rights: In Copyright [en]
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ [en]
dc.subject: Differential Analysis [en]
dc.subject: Bayesian Modeling [en]
dc.subject: Markov Random Field [en]
dc.subject: RNA-seq Data Analysis [en]
dc.subject: Markov Chain Monte Carlo (MCMC) [en]
dc.title: Computational Modeling for Differential Analysis of RNA-seq and Methylation data [en]
dc.type: Dissertation [en]
thesis.degree.discipline: Electrical Engineering [en]
thesis.degree.grantor: Virginia Polytechnic Institute and State University [en]
thesis.degree.level: doctoral [en]
thesis.degree.name: Ph. D. [en]

Files

Original bundle
Name: Wang_X_D_2016.pdf
Size: 4.07 MB
Format: Adobe Portable Document Format