Computational Science Laboratory
Permanent URI for this community
The mission of the Computational Science Laboratory (CSL) is to
develop innovative computational solutions for complex real-world problems, and to
foster a productive research and education environment emphasizing collaboration and innovation.
Browse
Browsing Computational Science Laboratory by Author "Baumann, William T."
Now showing 1 - 3 of 3
Results Per Page
Sort Options
- Computational Techniques for the Analysis of Large Scale Biological SystemsAhn, Tae-Hyuk (Virginia Tech, 2016-09-27)An accelerated pace of discovery in biological sciences is made possible by a new generation of computational biology and bioinformatics tools. In this dissertation we develop novel computational, analytical, and high performance simulation techniques for biological problems, with applications to the yeast cell division cycle, and to the RNA-Sequencing of the yellow fever mosquito. Cell cycle system evolves stochastic effects when there are a small number of molecules react each other. Consequently, the stochastic effects of the cell cycle are important, and the evolution of cells is best described statistically. Stochastic simulation algorithm (SSA), the standard stochastic method for chemical kinetics, is often slow because it accounts for every individual reaction event. This work develops a stochastic version of a deterministic cell cycle model, in order to capture the stochastic aspects of the evolution of the budding yeast wild-type and mutant strain cells. In order to efficiently run large ensembles to compute statistics of cell evolution, the dissertation investigates parallel simulation strategies, and presents a new probabilistic framework to analyze the performance of dynamic load balancing algorithms. This work also proposes new accelerated stochastic simulation algorithms based on a fully implicit approach and on stochastic Taylor expansions. Next Generation RNA-Sequencing, a high-throughput technology to sequence cDNA in order to get information about a sample's RNA content, is becoming an efficient genomic approach to uncover new genes and to study gene expression and alternative splicing. This dissertation develops efficient algorithms and strategies to find new genes in Aedes aegypti, which is the most important vector of dengue fever and yellow fever. We report the discovery of a large number of new gene transcripts, and the identification and characterization of genes that showed male-biased expression profiles. This basic information may open important avenues to control mosquito borne infectious diseases.
- A Framework to Analyze the Performance of Load Balancing Schemes for Ensembles of Stochastic SimulationsAhn, Tae-Hyuk; Sandu, Adrian; Watson, Layne T.; Shaffer, Clifford A.; Cao, Yang; Baumann, William T. (Department of Computer Science, Virginia Polytechnic Institute & State University, 2012)Ensembles of simulations are employed to estimate the statistics of possible future states of a system, and are widely used in important applications such as climate change and biological modeling. Ensembles of runs can naturally be executed in parallel. However, when the CPU times of individual simulations vary considerably, a simple strategy of assigning an equal number of tasks per processor can lead to serious work imbalances and low parallel efficiency. This paper presents a new probabilistic framework to analyze the performance of dynamic load balancing algorithms for ensembles of simulations where many tasks are mapped onto each processor, and where the individual compute times vary considerably among tasks. Four load balancing strategies are discussed: most-dividing, all-redistribution, random-polling, and neighbor-redistribution. Simulation results with a stochastic budding yeast cell cycle model is consistent with the theoretical analysis. It is especially significant that there is a provable global decrease in load imbalance for the local rebalancing algorithms due to scalability concerns for the global rebalancing algorithms. The overall simulation time is reduced by up to 25%, and the total processor idle time by 85%.
- Parallel Load Balancing Strategies for Ensembles of Stochastic Biochemical SimulationsAhn, Tae-Hyuk; Sandu, Adrian; Watson, Layne T.; Shaffer, Clifford A.; Cao, Yang; Baumann, William T. (Department of Computer Science, Virginia Polytechnic Institute & State University, 2010-12-01)The evolution of biochemical systems where some chemical species are present with only a small number of molecules, is strongly influenced by discrete and stochastic effects that cannot be accurately captured by continuous and deterministic models. The budding yeast cell cycle provides an excellent example of the need to account for stochastic effects in biochemical reactions. To obtain statistics of the cell cycle progression, a stochastic simulation algorithm must be run thousands of times with different initial conditions and parameter values. In order to manage the computational expense involved, the large ensemble of runs needs to be executed in parallel. The CPU time for each individual task is unknown before execution, so a simple strategy of assigning an equal number of tasks per processor can lead to considerable work imbalances and loss of parallel efficiency. Moreover, deterministic analysis approaches are ill suited for assessing the effectiveness of load balancing algorithms in this context. Biological models often require stochastic simulation. Since generating an ensemble of simulation results is computationally intensive, it is important to make efficient use of computer resources. This paper presents a new probabilistic framework to analyze the performance of dynamic load balancing algorithms when applied to large ensembles of stochastic biochemical simulations. Two particular load balancing strategies (point-to-point and all-redistribution) are discussed in detail. Simulation results with a stochastic budding yeast cell cycle model confirm the theoretical analysis. While this work is motivated by cell cycle modeling, the proposed analysis framework is general and can be directly applied to any ensemble simulation of biological systems where many tasks are mapped onto each processor, and where the individual compute times vary considerably among tasks.