Modeling and Characterization of Dynamic Changes in Biological Systems from Multi-platform Genomic Data
MetadataShow full item record
Biological systems constantly evolve and adapt in response to changed environment and external stimuli at the molecular and genomic levels. Building statistical models that characterize such dynamic changes in biological systems is one of the key objectives in bioinformatics and computational biology. Recent advances in high-throughput genomic and molecular profiling technologies such as gene expression and and copy number microarrays provide ample opportunities to study cellular activities at the individual gene and network levels. The aim of this dissertation is to formulate mathematically dynamic changes in biological networks and DNA copy numbers, to develop machine learning algorithms to learn these statistical models from high-throughput biological data, and to demonstrate their applications in systems biological studies. The first part (Chapters 2-4) of the dissertation focuses on the dynamic changes taking placing at the biological network level. Biological networks are context-specific and dynamic in nature. Under different conditions, different regulatory components and mechanisms are activated and the topology of the underlying gene regulatory network changes. We report a differential dependency network (DDN) analysis to detect statistically significant topological changes in the transcriptional networks between two biological conditions. Further, we formalize and extend the DDN approach to an effective learning strategy to extract structural changes in graphical models using l1-regularization based convex optimization. We discuss the key properties of this formulation and introduce an efficient implementation by the block coordinate descent algorithm. Another type of dynamic changes in biological networks is the observation that a group of genes involved in certain biological functions or processes coordinate to response to outside stimuli, producing distinct time course patterns. We apply the echo stat network, a new architecture of recurrent neural networks, to model temporal gene expression patterns and analyze the theoretical properties of echo state networks with random matrix theory. The second part (Chapter 5) of the dissertation focuses on the changes at the DNA copy number level, especially in cancer cells. Somatic DNA copy number alterations (CNAs) are key genetic events in the development and progression of human cancers, and frequently contribute to tumorigenesis. We propose a statistically-principled in silico approach, Bayesian Analysis of COpy number Mixtures (BACOM), to accurately detect genomic deletion type, estimate normal tissue contamination, and accordingly recover the true copy number profile in cancer cells.
- Doctoral Dissertations