Scholarly Works, Mathematics
Research articles, presentations, and other scholarship
Browsing Scholarly Works, Mathematics by Department "Computer Science"
- JigCell Run Manager (JC-RM): a tool for managing large sets of biochemical model parametrizations
  Palmisano, Alida; Hoops, Stefan; Watson, Layne T.; Jones, Thomas C.; Tyson, John J.; Shaffer, Clifford A. (Biomed Central, 2015-12-24)
  Background: Most biomolecular reaction modeling tools allow users to build models with a single list of parameter values. However, a common scenario involves different parameterizations of the model to account for the results of related experiments, for example, to define the phenotypes for a variety of mutations (gene knockout, overexpression, etc.) of a specific biochemical network. This scenario is not well supported by existing model editors, forcing the user to manually generate, store, and maintain many variations of the same model.
  Results: We developed an extension to our modeling editor called the JigCell Run Manager (JC-RM). JC-RM allows the modeler to define a hierarchy of parameter values, simulations, and plot settings, and to save them together with the initial model. JC-RM supports generation of simulation plots, as well as export to COPASI and SBML (L3V1) for further analysis.
  Conclusions: Developing a model with its initial list of parameter values is just the first step in modeling a biological system. Models are often parameterized in many different ways to account for mutations of the organism and/or for sets of related experiments performed on the organism. JC-RM offers two critical features: it supports the everyday management of a large model, complete with its parameterizations, and it facilitates sharing this information before and after publication. JC-RM allows the modeler to define a hierarchy of parameter values, simulation, and plot settings, and to maintain a relationship between this hierarchy and the initial model. JC-RM is implemented in Java and uses the COPASI API. JC-RM runs on all major operating systems, with minimal system requirements. Installers, source code, user manual, and examples can be found at the COPASI website (http://www.copasi.org/Projects).
- Modeling stochasticity and variability in gene regulatory networks
  Murrugarra, David; Veliz-Cuba, Alan; Aguilar, Boris; Arat, Seda; Laubenbacher, Reinhard C. (2012-06-06)
  Modeling stochasticity in gene regulatory networks is an important and complex problem in molecular systems biology. To elucidate intrinsic noise, several modeling strategies such as the Gillespie algorithm have been used successfully. This article contributes an approach as an alternative to these classical settings. Within the discrete paradigm, genes, proteins, and other molecular components of gene regulatory networks are modeled as discrete variables and are assigned logical rules describing their regulation through interactions with other components. Stochasticity is modeled at the biological function level under the assumption that even if the expression levels of the input nodes of an update rule guarantee activation or degradation, there is a probability that the process will not occur due to stochastic effects. This approach allows a finer analysis of discrete models and provides a natural setup for cell population simulations to study cell-to-cell variability. We applied our methods to two of the most studied regulatory networks, the outcome of lambda phage infection of bacteria and the p53-mdm2 complex.
- Multistate Model Builder (MSMB): a flexible editor for compact biochemical models
  Palmisano, Alida; Hoops, Stefan; Watson, Layne T.; Jones, Thomas C., Jr.; Tyson, John J.; Shaffer, Clifford A. (Biomed Central, 2014-04-04)
  Background: Building models of molecular regulatory networks is challenging not just because of the intrinsic difficulty of describing complex biological processes. Writing a model is a creative effort that calls for more flexibility and interactive support than offered by many of today’s biochemical model editors. Our model editor MSMB -- Multistate Model Builder -- supports multistate models created using different modeling styles.
  Results: MSMB provides two separate advances on existing network model editors. (1) A simple but powerful syntax is used to describe multistate species. This reduces the number of reactions needed to represent certain molecular systems, thereby reducing the complexity of model creation. (2) Extensive feedback is given during all stages of the model creation process on the existing state of the model. Users may activate error notifications of varying stringency on the fly, and use these messages as a guide toward a consistent, syntactically correct model. MSMB default values and behavior during model manipulation (e.g., when renaming or deleting an element) can be adapted to suit the modeler, thus supporting creativity rather than interfering with it. MSMB’s internal model representation allows saving a model with errors and inconsistencies (e.g., an undefined function argument; a syntactically malformed reaction). A consistent model can be exported to SBML or COPASI formats. We show the effectiveness of MSMB’s multistate syntax through models of the cell cycle and mRNA transcription.
  Conclusions: Using multistate reactions reduces the number of reactions needed to encode many biochemical network models. This reduces the cognitive load for a given model, thereby making it easier for modelers to build more complex models. The many interactive editing support features provided by MSMB make it easier for modelers to create syntactically valid models, thus speeding model creation. Complete information and the installation package can be found at http://www.copasi.org/SoftwareProjects. MSMB is based on Java and the COPASI API.
- A Network of SCOP Hidden Markov Models and Its Analysis
  Zhang, Liqing; Watson, Layne T.; Heath, Lenwood S. (2011-05-23)
  Background: The Structural Classification of Proteins (SCOP) database uses a large number of hidden Markov models (HMMs) to represent families and superfamilies composed of proteins that presumably share the same evolutionary origin. However, how the HMMs are related to one another has not been examined before.
  Results: In this work, taking into account the processes used to build the HMMs, we propose a working hypothesis to examine the relationships between HMMs and the families and superfamilies that they represent. Specifically, we perform an all-against-all HMM comparison using the HHsearch program (similar to BLAST) and construct a network where the nodes are HMMs and the edges connect similar HMMs. We hypothesize that the HMMs in a connected component belong to the same family or superfamily more often than expected under a random network connection model. Results show a pattern consistent with this working hypothesis. Moreover, the HMM network possesses features distinctly different from the previously documented biological networks, exemplified by the exceptionally high clustering coefficient and the large number of connected components.
  Conclusions: The current finding may provide guidance in devising computational methods to reduce the degree of overlaps between the HMMs representing the same superfamilies, which may in turn enable more efficient large-scale sequence searches against the database of HMMs.
- Optimization and model reduction in the high dimensional parameter space of a budding yeast cell cycle model
  Oguz, Cihan; Laomettachit, Teeraphan; Chen, Katherine C.; Watson, Layne T.; Baumann, William T.; Tyson, John J. (Biomed Central, 2013-06-28)
  Background: Parameter estimation from experimental data is critical for mathematical modeling of protein regulatory networks. For realistic networks with dozens of species and reactions, parameter estimation is an especially challenging task. In this study, we present an approach for parameter estimation that is effective in fitting a model of the budding yeast cell cycle (comprising 26 nonlinear ordinary differential equations containing 126 rate constants) to the experimentally observed phenotypes (viable or inviable) of 119 genetic strains carrying mutations of cell cycle genes.
  Results: Starting from an initial guess of the parameter values, which correctly captures the phenotypes of only 72 genetic strains, our parameter estimation algorithm quickly improves the success rate of the model to 105-111 of the 119 strains. This success rate is comparable to the best values achieved by a skilled modeler manually choosing parameters over many weeks. The algorithm combines two search and optimization strategies. First, we use Latin hypercube sampling to explore a region surrounding the initial guess. From these samples, we choose ∼20 different sets of parameter values that correctly capture wild type viability. These sets form the starting generation of differential evolution that selects new parameter values that perform better in terms of their success rate in capturing phenotypes. In addition to producing highly successful combinations of parameter values, we analyze the results to determine the parameters that are most critical for matching experimental outcomes and the most competitive strains whose correct outcome with a given parameter vector forces numerous other strains to have incorrect outcomes. These “most critical parameters” and “most competitive strains” provide biological insights into the model. Conversely, the “least critical parameters” and “least competitive strains” suggest ways to reduce the computational complexity of the optimization.
  Conclusions: Our approach proves to be a useful tool to help systems biologists fit complex dynamical models to large experimental datasets. In the process of fitting the model to the data, the tool identifies suggestive correlations among aspects of the model and the data.
- Predicting network modules of cell cycle regulators using relative protein abundance statistics
  Oguz, Cihan; Watson, Layne T.; Baumann, William T.; Tyson, John J. (2017-02-28)
  Background: Parameter estimation in systems biology is typically done by enforcing experimental observations through an objective function as the parameter space of a model is explored by numerical simulations. Past studies have shown that one usually finds a set of “feasible” parameter vectors that fit the available experimental data equally well, and that these alternative vectors can make different predictions under novel experimental conditions. In this study, we characterize the feasible region of a complex model of the budding yeast cell cycle under a large set of discrete experimental constraints in order to test whether the statistical features of relative protein abundance predictions are influenced by the topology of the cell cycle regulatory network.
  Results: Using differential evolution, we generate an ensemble of feasible parameter vectors that reproduce the phenotypes (viable or inviable) of wild-type yeast cells and 110 mutant strains. We use this ensemble to predict the phenotypes of 129 mutant strains for which experimental data is not available. We identify 86 novel mutants that are predicted to be viable and then rank the cell cycle proteins in terms of their contributions to cumulative variability of relative protein abundance predictions. Proteins involved in “regulation of cell size” and “regulation of G1/S transition” contribute most to predictive variability, whereas proteins involved in “positive regulation of transcription involved in exit from mitosis,” “mitotic spindle assembly checkpoint” and “negative regulation of cyclin-dependent protein kinase by cyclin degradation” contribute the least. These results suggest that the statistics of these predictions may be generating patterns specific to individual network modules (START, S/G2/M, and EXIT). To test this hypothesis, we develop random forest models for predicting the network modules of cell cycle regulators using relative abundance statistics as model inputs. Predictive performance is assessed by the areas under receiver operating characteristic curves (AUC). Our models generate an AUC range of 0.83-0.87 as opposed to randomized models with AUC values around 0.50.
  Conclusions: By using differential evolution and random forest modeling, we show that the model prediction statistics generate distinct network module-specific patterns within the cell cycle network.
- Predicting the combined effect of multiple genetic variants
  Liu, Mingming; Watson, Layne T.; Zhang, Liqing (2015-07-30)
  Background: Many genetic variants have been identified in the human genome. The functional effects of a single variant have been intensively studied. However, the joint effects of multiple variants in the same genes have been largely ignored due to their complexity or lack of data. This paper uses HMMvar, a hidden Markov model based approach, to investigate the combined effect of multiple variants from the 1000 Genomes Project. Two tumor suppressor genes, TP53 and phosphatase and tensin homolog (PTEN), are also studied for the joint effect of compensatory indel variants.
  Results: Results show that there are cases where the joint effect of having multiple variants in the same genes is significantly different from that of a single variant. The deleterious effect of a single indel variant can be alleviated by its compensatory indels in TP53 and PTEN. Compound mutations in two genes, β-MHC and MyBP-C, leading to more severe cardiovascular disease compared to single mutations, are also validated.
  Conclusions: This paper extends the functionality of HMMvar, a tool for assigning a quantitative score to a variant, to measure not only the deleterious effect of a single variant but also the joint effect of multiple variants. HMMvar is the first tool that can predict the functional effects of both single and general multiple variations on proteins. The precomputed scores for multiple variants from the 1000 Genomes Project and the HMMvar package are available at https://bioinformatics.cs.vt.edu/zhanglab/HMMvar/
- Quantitative prediction of the effect of genetic variation using hidden Markov models
  Liu, Mingming; Watson, Layne T.; Zhang, Liqing (2014-01-09)
  Background: With the development of sequencing technologies, more and more sequence variants are available for investigation. Different classes of variants in the human genome have been identified, including single nucleotide substitutions, insertion and deletion, and large structural variations such as duplications and deletions. Insertion and deletion (indel) variants comprise a major proportion of human genetic variation. However, little is known about their effects on humans. The absence of understanding is largely due to the lack of both biological data and computational resources.
  Results: This paper presents a new indel functional prediction method HMMvar based on HMM profiles, which capture the conservation information in sequences. The results demonstrate that a scoring strategy based on HMM profiles can achieve good performance in identifying deleterious or neutral variants for different data sets, and can predict the protein functional effects of both single and multiple mutations.
  Conclusions: This paper proposed a quantitative prediction method, HMMvar, to predict the effect of genetic variation using hidden Markov models. The HMM based pipeline program implementing the method HMMvar is freely available at https://bioinformatics.cs.vt.edu/zhanglab/hmm.
- A Stochastic Model Correctly Predicts Changes in Budding Yeast Cell Cycle Dynamics upon Periodic Expression of CLN2
  Oguz, Cihan; Palmisano, Alida; Laomettachit, Teeraphan; Watson, Layne T.; Baumann, William T.; Tyson, John J. (PLOS, 2014-05-09)
  In this study, we focus on a recent stochastic budding yeast cell cycle model. First, we estimate the model parameters using extensive data sets: phenotypes of 110 genetic strains, single cell statistics of wild type and cln3 strains. Optimization of stochastic model parameters is achieved by an automated algorithm we recently used for a deterministic cell cycle model. Next, in order to test the predictive ability of the stochastic model, we focus on a recent experimental study in which forced periodic expression of CLN2 cyclin (driven by MET3 promoter in cln3 background) has been used to synchronize budding yeast cell colonies. We demonstrate that the model correctly predicts the experimentally observed synchronization levels and cell cycle statistics of mother and daughter cells under various experimental conditions (numerical data that is not enforced in parameter optimization), in addition to correctly predicting the qualitative changes in size control due to forced CLN2 expression. Our model also generates a novel prediction: under frequent CLN2 expression pulses, G1 phase duration is bimodal among small-born cells. These cells originate from daughters with extended budded periods due to size control during the budded period. This novel prediction and the experimental trends captured by the model illustrate the interplay between cell cycle dynamics, synchronization of cell colonies, and size control in budding yeast.