Automatic Reconstruction of the Building Blocks of Molecular Interaction Networks
Rivera, Corban G.
High-throughput whole-genome biological assays are highly intricate and difficult to interpret. The molecular interaction networks derived from these experiments suggest that cellular functions are carried out by modules of interacting molecules. Reverse-engineering the modular structure of cellular interaction networks promises to ease their analysis significantly. We hypothesize that:
• cellular wiring diagrams can be decomposed into overlapping modules, where each module is a set of coherently interacting molecules, and
• a cell responds to a stress or a stimulus by appropriately modulating the activities of a subset of these modules.
Motivated by these hypotheses, we develop models and algorithms that can reverse-engineer molecular modules from large-scale functional genomic data. We address two major problems:
1. Given a wiring diagram and genome-wide gene expression data measured after the application of a stress or in a disease state, compute the active network of molecular interactions perturbed by the stress or the disease.
2. Given the active networks for multiple stresses, stimuli, or diseases, compute a set of network legos, which are molecular modules with the property that each active network can be expressed as an appropriate combination of a subset of modules.
To address the first problem, we propose an approach that computes the most-perturbed subgraph of a curated pathway of molecular interactions in a disease state. Our method is based on a novel score for pathway perturbation that incorporates both differential gene expression and the interaction structure of the pathway. We apply our method to a compendium of cancer types. We show that the significance of the most-perturbed sub-pathway is frequently greater than that of the entire pathway. We identify an association that suggests that IL-2 infusion may have a therapeutic effect in bladder cancer similar to the one it has in melanoma.
We propose two models to address the second problem. First, we formulate a Boolean model for constructing network legos from a set of active networks. We reduce the problem of computing network legos to that of constructing closed biclusters in a binary matrix. Applying this method to a compendium of 13 stresses on human cells, we automatically detect that about four to six hours after treatment with chemicals that cause endoplasmic reticulum stress, fibroblasts shut down the cell cycle far more aggressively than fibroblasts or HeLa cells do in response to other treatments.
Our second model represents each active network as an additive combination of network legos. We formulate the problem as one of computing network legos that can be used to recover active networks in an optimal manner. We use existing methods for non-negative matrix approximation to solve this problem. We apply our method to a human cancer dataset including 190 samples from 18 cancers. We identify a network lego that associates integrins and matrix metalloproteinases in ovarian adenoma and other cancers, and a network lego including the retinoblastoma pathway that is associated with multiple leukemias.
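The Boolean model reduces network legos to closed biclusters of a binary matrix. The Python sketch below is one way to illustrate that notion; it is not the dissertation's algorithm. It assumes each active network is given as a set of edges, so the binary matrix has one row per network and one column per edge. A closed bicluster is then a set of networks paired with the edges they all share, such that no further network contains all of those edges. The function name closed_biclusters, the stress labels, and the gene names are hypothetical, and the brute-force enumeration is exponential in the number of networks, so it is only suitable for small inputs.

```python
from itertools import combinations


def closed_biclusters(active_networks):
    """Enumerate closed biclusters by brute force.

    active_networks: dict mapping a network name to a set of edges.
    Returns a dict mapping each closed set of network names (frozenset)
    to the frozenset of edges shared by exactly those networks.
    """
    names = sorted(active_networks)
    found = {}
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            # Edges present in every network of this subset.
            shared = frozenset.intersection(
                *(frozenset(active_networks[n]) for n in subset)
            )
            if not shared:
                continue
            # Closure step: every network that contains all shared edges.
            support = frozenset(n for n in names if shared <= active_networks[n])
            found[support] = shared
    return found


# Toy input (hypothetical stresses and genes, for illustration only).
networks = {
    "stress_A": {("g1", "g2"), ("g2", "g3"), ("g4", "g5")},
    "stress_B": {("g1", "g2"), ("g2", "g3")},
    "stress_C": {("g4", "g5"), ("g5", "g6")},
}

legos = closed_biclusters(networks)
for support, edges in sorted(legos.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(support), sorted(edges))
```

In this toy input, the edges g1-g2 and g2-g3 form a candidate lego shared by stress_A and stress_B, while g4-g5 forms one shared by stress_A and stress_C.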
- Doctoral Dissertations