Browsing by Author "Cao, Yang"
Now showing 1 - 20 of 61
- The Abridgment and Relaxation Time for a Linear Multi-Scale Model Based on Multiple Site Phosphorylation
  Wang, Shuo; Cao, Yang (PLOS, 2015-08-11)
  Random effects in cellular systems are an important topic in systems biology and are often simulated with Gillespie's stochastic simulation algorithm (SSA). Abridgment refers to model reduction that approximates a group of reactions by a smaller group with fewer species and reactions. This paper presents a theoretical analysis, based on a comparison of first exit times, of the abridgment of a linear chain reaction model motivated by systems with multiple phosphorylation sites. The analysis shows that if the relaxation time of the fast subsystem is much smaller than the mean firing time of the slow reactions, the abridgment can be applied with little error. This analysis is further verified with numerical experiments on models of a bistable switch and of oscillations in which the linear chain system plays a critical role.
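A minimal illustration of the kind of system discussed above: Gillespie's SSA applied to a linear chain X1 -> X2 -> ... -> Xn. This sketch is not the paper's abridgment procedure; the chain length, rate constants, and initial counts are hypothetical stand-ins.

```python
import numpy as np

def ssa_linear_chain(x0, rates, t_end, rng=None):
    """Gillespie SSA for a linear chain X1 -> X2 -> ... -> Xn.

    x0    : initial molecule counts, length n
    rates : rate constants k_i for reaction X_i -> X_{i+1}, length n-1
    """
    rng = rng or np.random.default_rng()
    x = np.array(x0, dtype=float)
    t = 0.0
    times, states = [t], [x.copy()]
    while t < t_end:
        a = rates * x[:-1]                       # propensities a_i = k_i * X_i
        a0 = a.sum()
        if a0 == 0.0:                            # no reaction can fire
            break
        t += rng.exponential(1.0 / a0)           # time to next reaction
        i = rng.choice(len(a), p=a / a0)         # which reaction fires
        x[i] -= 1
        x[i + 1] += 1
        times.append(t)
        states.append(x.copy())
    return np.array(times), np.array(states)

# Example: 5-species chain with fast downstream conversions (hypothetical rates)
t, xs = ssa_linear_chain([100, 0, 0, 0, 0], np.array([1.0, 50.0, 50.0, 50.0]), 5.0)
print(xs[-1])
```

When the downstream rate constants are much larger than the first one, the intermediate species relax quickly; that is the regime in which the abridgment analysis above applies.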
- Adjoint based solution and uncertainty quantification techniques for variational inverse problems
  Hebbur Venkata Subba Rao, Vishwas (Virginia Tech, 2015-09-25)
  Variational inverse problems integrate computational simulations of physical phenomena with physical measurements in an informational feedback control system. Control parameters of the computational model are optimized such that the simulation results fit the physical measurements. The solution procedure is computationally expensive since it involves running the simulation computer model (the forward model) and the associated adjoint model multiple times. In practice, our knowledge of the underlying physics is incomplete and hence the associated computer model is laden with model errors. Similarly, it is not possible to measure the physical quantities exactly and hence the measurements are associated with data errors. The errors in data and model adversely affect the inference solutions. This work develops methods to address the challenges posed by the computational costs and by the impact of data and model errors in solving variational inverse problems. Variational inverse problems of interest here are formulated as optimization problems constrained by partial differential equations (PDEs). The solution process requires multiple evaluations of the constraints, and therefore multiple solutions of the associated PDEs. To alleviate the computational costs we develop a parallel-in-time discretization algorithm based on a nonlinear optimization approach. As in the parareal approach, the time interval is partitioned into subintervals, and local time integrations are carried out in parallel. Solution continuity equations across interval boundaries are added as constraints. All the computational steps - forward solutions, gradients, and Hessian-vector products - involve only ideally parallel computations and therefore are highly scalable. This work develops a systematic mathematical framework to compute the impact of data and model errors on the solution to the variational inverse problems. The computational algorithm makes use of first and second order adjoints and provides an a posteriori error estimate for a quantity of interest defined on the inverse solution (i.e., an aspect of the inverse solution). We illustrate the estimation algorithm on a shallow water model and on the Weather Research and Forecasting model. The presence of outliers in measurement data is common, and this negatively impacts the solution to variational inverse problems. The traditional approach, where the inverse problem is formulated as a minimization problem in the $L_2$ norm, is especially sensitive to large data errors. To alleviate the impact of data outliers we propose to use robust norms such as the $L_1$ and Huber norms in data assimilation. This work develops a systematic mathematical framework to perform three and four dimensional variational data assimilation using the $L_1$ and Huber norms. The power of this approach is demonstrated by solving data assimilation problems where measurements contain outliers.
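As a rough sketch of the robust-norm idea above, a generic strong-constraint 4D-Var cost function can replace the usual $L_2$ observation mismatch with a penalty $\rho$ applied componentwise to the scaled residuals. The notation here (background state $x_0^{\mathrm{b}}$, covariances $\mathbf{B}$, $\mathbf{R}_k$, observation operator $\mathcal{H}_k$, model $\mathcal{M}$, Huber threshold $\delta$, with $\|v\|^2_{\mathbf{B}^{-1}} = v^{T}\mathbf{B}^{-1}v$) is generic and not taken verbatim from the dissertation.

```latex
\[
\mathcal{J}(x_0) \;=\; \tfrac{1}{2}\,\bigl\|x_0 - x_0^{\mathrm{b}}\bigr\|_{\mathbf{B}^{-1}}^{2}
\;+\; \sum_{k=0}^{N} \sum_{i} \rho\!\left( \bigl[\mathbf{R}_k^{-1/2}\bigl(\mathcal{H}_k(x_k) - y_k\bigr)\bigr]_i \right),
\qquad x_k = \mathcal{M}_{0\to k}(x_0),
\]
\[
\rho(r) \;=\;
\begin{cases}
\tfrac{1}{2}\,r^2, & |r| \le \delta,\\[2pt]
\delta\,|r| - \tfrac{1}{2}\,\delta^2, & |r| > \delta.
\end{cases}
\]
```

Taking $\rho(r) = \tfrac{1}{2}r^2$ everywhere recovers the standard $L_2$ formulation, while $\rho(r) = |r|$ gives the $L_1$ variant; the Huber penalty interpolates between the two and damps the influence of outlying observations.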
- Adjoint-based space-time adaptive solution algorithms for sensitivity analysis and inverse problems
  Alexe, Mihai (Virginia Tech, 2011-03-18)
  Adaptivity in both space and time has become the norm for solving problems modeled by partial differential equations. The size of the discretized problem makes uniformly refined grids computationally prohibitive. Adaptive refinement of meshes and time steps makes it possible to capture the phenomena of interest while keeping the cost of a simulation tractable on current hardware. Many fields in science and engineering require the solution of inverse problems, where parameters for a given model are estimated based on available measurement information. In contrast to forward (regular) simulations, inverse problems have not benefited extensively from adaptive solver technology. Previous research in inverse problems has focused mainly on the continuous approach to calculating sensitivities and has typically employed fixed time and space meshes in the solution process. Inverse problem solvers that make exclusive use of uniform or static meshes avoid complications such as the differentiation of mesh motion equations, or inconsistencies in the sensitivity equations between subdomains with different refinement levels. However, this comes at the cost of low computational efficiency. More efficient computations are possible through judicious use of adaptive mesh refinement, adaptive time steps, and the discrete adjoint method. This dissertation develops a complete framework for fully discrete adjoint sensitivity analysis and inverse problem solution in the context of time-dependent, adaptive-mesh, adaptive-step models. The discrete framework addresses all the necessary ingredients of a state-of-the-art adaptive inverse solution algorithm: adaptive mesh and time step refinement, solution grid transfer operators, a priori and a posteriori error analysis and estimation, and discrete adjoints for sensitivity analysis of flux-limited numerical algorithms.
- Advanced Sampling Methods for Solving Large-Scale Inverse Problems
  Attia, Ahmed Mohamed Mohamed (Virginia Tech, 2016-09-19)
  Ensemble and variational techniques have gained wide popularity as the two main approaches for solving data assimilation and inverse problems. The majority of the methods in these two approaches are derived (at least implicitly) under the assumption that the underlying probability distributions are Gaussian. It is well accepted, however, that the Gaussianity assumption is too restrictive when applied to large nonlinear models, nonlinear observation operators, and large levels of uncertainty. This work develops a family of fully non-Gaussian data assimilation algorithms that work by directly sampling the posterior distribution. The sampling strategy is based on a Hybrid/Hamiltonian Monte Carlo (HMC) approach that can handle non-normal probability distributions. The first algorithm proposed in this work is the "HMC sampling filter", an ensemble-based data assimilation algorithm for solving the sequential filtering problem. Unlike traditional ensemble-based filters, such as the ensemble Kalman filter and the maximum likelihood ensemble filter, the proposed sampling filter naturally accommodates non-Gaussian errors and nonlinear model dynamics, as well as nonlinear observations. To test the capabilities of the HMC sampling filter, numerical experiments are carried out using the Lorenz-96 model and observation operators with different levels of nonlinearity and differentiability. The filter is also tested with a shallow water model on the sphere with a linear observation operator. Numerical results show that the sampling filter performs well even in highly nonlinear situations where the traditional filters diverge. Next, the HMC sampling approach is extended to the four-dimensional case, where several observations are assimilated simultaneously, resulting in the second member of the proposed family of algorithms. The new algorithm, named the "HMC sampling smoother", is an ensemble-based smoother for four-dimensional data assimilation that works by sampling from the posterior probability density of the solution at the initial time. The sampling smoother naturally accommodates non-Gaussian errors and nonlinear model dynamics and observation operators, and provides a full description of the posterior distribution. Numerical experiments for this algorithm are carried out using a shallow water model on the sphere with observation operators of different levels of nonlinearity. The numerical results demonstrate the advantages of the proposed method compared to traditional variational and ensemble-based smoothing methods. The HMC sampling smoother, in its original formulation, is computationally expensive due to the innate requirement of running the forward and adjoint models repeatedly. The family of algorithms therefore continues with computationally efficient versions of the HMC sampling smoother based on reduced-order approximations of the underlying model dynamics. The reduced-order HMC sampling smoothers, developed as extensions to the original HMC smoother, are tested numerically using the shallow-water equations model in Cartesian coordinates. The results reveal that the reduced-order versions of the smoother are capable of accurately capturing the posterior probability density, while being significantly faster than the original full-order formulation.
  In the presence of nonlinear model dynamics, nonlinear observation operators, or non-Gaussian errors, the prior distribution in the sequential data assimilation framework is not analytically tractable. In the original formulation of the HMC sampling filter, the prior distribution is approximated by a Gaussian distribution whose parameters are inferred from the ensemble of forecasts. Here the Gaussian prior assumption of the original HMC filter is relaxed. Specifically, a clustering step is introduced after the forecast phase of the filter, and the prior density function is estimated by fitting a Gaussian mixture model (GMM) to the prior ensemble. The base filter developed following this strategy is named the cluster HMC sampling filter (ClHMC). A multi-chain version of the ClHMC filter, namely MC-ClHMC, is also proposed to guarantee that samples are taken from the vicinities of all probability modes of the formulated posterior. These methodologies are tested using a quasi-geostrophic (QG) model with double-gyre wind forcing and biharmonic friction. Numerical results demonstrate the usefulness of using GMMs to relax the Gaussian prior assumption in the HMC filtering paradigm. To provide a unified platform for data assimilation research, a flexible and highly extensible testing suite, named DATeS, is developed and described in this work. The core of DATeS is implemented in Python to enable object-oriented capabilities. The main components, such as the models, the data assimilation algorithms, the linear algebra solvers, and the time discretization routines, are independent of each other, so as to offer maximum flexibility in configuring data assimilation studies.
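The core HMC step behind the sampling filter and smoother can be sketched generically: leapfrog integration of Hamiltonian dynamics followed by a Metropolis test. The toy Gaussian posterior below stands in for the data assimilation posterior; the step size and trajectory length are arbitrary choices, not the tuned values used in the dissertation.

```python
import numpy as np

def hmc_sample(log_post, grad_log_post, x0, n_samples, eps=0.05, n_leapfrog=20, rng=None):
    """Minimal Hamiltonian Monte Carlo sampler (leapfrog + Metropolis test)."""
    rng = rng or np.random.default_rng()
    x = np.array(x0, dtype=float)
    samples = []
    for _ in range(n_samples):
        p = rng.standard_normal(x.shape)                 # momentum refresh
        x_new, p_new = x.copy(), p.copy()
        p_new += 0.5 * eps * grad_log_post(x_new)        # initial half step for momentum
        for _ in range(n_leapfrog):
            x_new += eps * p_new                         # full step for position
            p_new += eps * grad_log_post(x_new)          # full step for momentum
        p_new -= 0.5 * eps * grad_log_post(x_new)        # undo the extra half step
        # Metropolis acceptance on total energy (negative log posterior + kinetic)
        h_old = -log_post(x) + 0.5 * p @ p
        h_new = -log_post(x_new) + 0.5 * p_new @ p_new
        if rng.uniform() < np.exp(h_old - h_new):
            x = x_new
        samples.append(x.copy())
    return np.array(samples)

# Toy posterior: standard 2D Gaussian (stand-in for the data assimilation posterior)
log_post = lambda x: -0.5 * x @ x
grad_log_post = lambda x: -x
draws = hmc_sample(log_post, grad_log_post, np.zeros(2), 500)
print(draws.mean(axis=0), draws.std(axis=0))
```

In the actual filter and smoother, `log_post` and its gradient would combine the (possibly GMM-based) prior term with the observation likelihood, and the gradient evaluations are what drive the forward and adjoint model runs mentioned above.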
- Algorithms for Modeling Mass Movements and their Adoption in Social Networks
  Jin, Fang (Virginia Tech, 2016-08-23)
  Online social networks have become a staging ground for many modern movements, with the Arab Spring being the most prominent example. In an effort to understand and predict those movements, social media can be regarded as a valuable social sensor for disclosing underlying behaviors and patterns. To fully understand mass movement information propagation patterns in social networks, several problems need to be considered and addressed. Specifically, modeling mass movements that incorporate multiple spaces, a dynamic network structure, and misinformation propagation can be exceptionally useful in understanding information propagation in social media. This dissertation explores four research problems underlying efforts to identify and track the adoption of mass movements in social media. First, how do mass movements become mobilized on Twitter, especially in a specific geographic area? Second, can we detect protest activity in social networks by observing group anomalies in graphs? Third, how can we distinguish real movements from rumors or misinformation campaigns? And fourth, how can we infer the indicators of a specific type of protest, say climate-related protest? A fundamental objective of this research has been to conduct a comprehensive study of how mass movement adoption functions in social networks. For example, it may cross multiple spaces, evolve with dynamic network structures, or consist of swift outbreaks or long-term, slowly evolving transmissions. In many cases, it may also be mixed with misinformation campaigns, either deliberate or in the form of rumors. Each of those issues requires the development of new mathematical models and algorithmic approaches such as those explored here. This work aims to facilitate advances in information propagation, group anomaly detection, and misinformation distinction and, ultimately, to help improve our understanding of mass movements and their adoption in social networks.
- Algorithms for Reconstructing and Reasoning about Chemical Reaction Networks
  Cho, Yong Ju (Virginia Tech, 2013-01-24)
  Recent advances in systems biology have uncovered detailed mechanisms of biological processes such as the cell cycle, circadian rhythms, and signaling pathways. These mechanisms are modeled by chemical reaction networks (CRNs), which are typically simulated by converting to ordinary differential equations (ODEs), with the goal of closely reproducing the observed quantitative and qualitative behaviors of the modeled process. This thesis proposes two algorithmic problems related to the construction and comprehension of CRN models. The first problem focuses on reconstructing CRNs from given time series. Given multivariate time course data obtained by perturbing a given CRN, how can we systematically deduce the interconnections between the species of the network? We demonstrate how this problem can be modeled as, first, one of uncovering conditional independence relationships using buffering experiments and, second, one of determining the properties of the individual chemical reactions. Experimental results demonstrate the effectiveness of our approach on both synthetic and real CRNs. The second problem this work focuses on is to aid in network comprehension, i.e., to understand the motifs underlying complex dynamical behaviors of CRNs. Specifically, we focus on bistability---an important dynamical property of a CRN---and propose algorithms to identify the core structures responsible for conferring bistability. The approach we take is to systematically infer the instability-causing structures (ICSs) of a CRN and use machine learning techniques to relate properties of the CRN to the presence of such ICSs. This work has the potential to aid in not just network comprehension but also model simplification, by helping reduce the complexity of known bistable systems.
- Analysis and Application of Haseltine and Rawlings's Hybrid Stochastic Simulation Algorithm
  Wang, Shuo (Virginia Tech, 2016-10-06)
  Stochastic effects in cellular systems are usually modeled and simulated with Gillespie's stochastic simulation algorithm (SSA), which follows the same theoretical derivation as the chemical master equation (CME), but the low efficiency of SSA limits its application to large chemical networks. To improve the efficiency of stochastic simulations, Haseltine and Rawlings proposed a hybrid ODE/SSA algorithm, which combines ordinary differential equations (ODEs) for traditional deterministic models and SSA for stochastic models. In this dissertation, accuracy analysis, efficient implementation strategies, and application of Haseltine and Rawlings's hybrid method (HR) to a budding yeast cell cycle model are discussed. Accuracy of the hybrid method HR is studied based on a linear chain reaction system, motivated by the modeling practice used for the budding yeast cell cycle control mechanism. Mathematical analysis and numerical results both show that the hybrid method HR is accurate if either the numbers of molecules of reactants in fast reactions are above certain thresholds, or the rate constants of fast reactions are much larger than the rate constants of slow reactions. Our analysis also shows that the hybrid method HR allows for a much greater region in system parameter space than the slow-scale SSA (ssSSA) and the stochastic quasi-steady-state assumption (SQSSA) method. Implementation of the hybrid method HR requires a stiff ODE solver for numerical integration and an efficient event-handling strategy for slow reaction firings. In this dissertation, an event-handling strategy is developed based on inverse interpolation. The performance of five widely used stiff ODE solvers is measured in three numerical experiments. Furthermore, inspired by the strategy of the hybrid method HR, a hybrid ODE/SSA stochastic model for the budding yeast cell cycle is developed, based on a deterministic model in the literature. Simulation results of this hybrid model match very well with biological experimental data, and this model is the first to do so with these recently available experimental data. This study demonstrates that the hybrid method HR has great potential for stochastic modeling and simulation of large biochemical networks.
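The inverse-interpolation event handling mentioned above can be sketched as follows: a slow reaction fires when the accumulated slow-propensity integral reaches an exponentially distributed target, and the firing time is located by interpolating between ODE solver outputs. This is a linear-interpolation sketch on a precomputed grid; the dissertation couples the root finder to stiff ODE solvers, and the propensity profile below is a hypothetical stand-in.

```python
import numpy as np

def slow_firing_time(t_grid, g_grid, target):
    """Locate the slow-reaction firing time by inverse linear interpolation.

    t_grid : ODE solver time points
    g_grid : accumulated slow-reaction propensity integral g(t) at those points
    target : exponentially distributed threshold, -log(U) with U ~ Uniform(0,1)
    Returns the interpolated firing time, or None if g never reaches the target.
    """
    g_grid = np.asarray(g_grid)
    idx = np.searchsorted(g_grid, target)   # first grid point with g >= target
    if idx == 0:
        return t_grid[0]
    if idx >= len(g_grid):
        return None
    t0, t1 = t_grid[idx - 1], t_grid[idx]
    g0, g1 = g_grid[idx - 1], g_grid[idx]
    # Inverse interpolation: treat t as a function of g between the bracketing points
    return t0 + (target - g0) * (t1 - t0) / (g1 - g0)

# Hypothetical accumulated propensity along an ODE trajectory
t_grid = np.linspace(0.0, 2.0, 21)
g_grid = 0.8 * t_grid**2                    # monotone stand-in for the integral of a(s)
rng = np.random.default_rng(0)
print(slow_firing_time(t_grid, g_grid, -np.log(rng.uniform())))
```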
- Analysis and remedy of negativity problem in hybrid stochastic simulation algorithm and its application
  Chen, Minghan; Cao, Yang (2019-06-20)
  Background: The hybrid stochastic simulation algorithm, proposed by Haseltine and Rawlings (HR), is a combination of differential equations for traditional deterministic models and Gillespie's algorithm (SSA) for stochastic models. The HR hybrid method can significantly improve the efficiency of stochastic simulations for multiscale biochemical networks. Previous studies on the accuracy analysis for a linear chain reaction system showed that the HR hybrid method is accurate if the scale difference between fast and slow reactions is above a certain threshold, regardless of population scales. However, the populations of some reactant species might be driven negative if they are involved in both deterministic and stochastic systems. Results: This work investigates the negativity problem of the HR hybrid method, and analyzes and tests it with several models including a linear chain system, a nonlinear reaction system, and a realistic biological cell cycle system. As a benchmark, the second slow reaction firing time is used to measure the effect of negative populations on the accuracy of the HR hybrid method. Our analysis demonstrates that usually the error caused by negative populations is negligible compared with the approximation errors of the HR hybrid method itself, and sometimes negativity phenomena may even improve the accuracy. But for systems where negative species are involved in nonlinear reactions or some species are highly sensitive to negative species, the system stability will be influenced, and this may lead to system failure when using the HR hybrid method. In those circumstances, three remedies are studied for the negativity problem. Conclusions: The results of different models and examples suggest that the Zero-Reaction rule is a good remedy for nonlinear and sensitive systems considering its efficiency and simplicity.
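The paper's Zero-Reaction rule is not spelled out in the abstract; one plausible reading, sketched below under that assumption, is to zero the propensity of any stochastic reaction whose reactants have non-positive counts, so that such reactions cannot fire until the deterministic part restores the population.

```python
import numpy as np

def apply_zero_reaction_rule(propensities, populations, reactant_matrix):
    """Zero out propensities of reactions whose reactants have non-positive counts.

    propensities    : length-m array of raw propensities a_j(x)
    populations     : length-n array of species counts (may go negative in HR hybrid runs)
    reactant_matrix : m x n 0/1 matrix; entry (j, i) = 1 if species i is a reactant of reaction j
    This is an assumed interpretation of the Zero-Reaction rule, not the paper's code.
    """
    bad_species = populations <= 0
    blocked = (reactant_matrix @ bad_species.astype(int)) > 0   # reactions touching a bad species
    return np.where(blocked, 0.0, propensities)

# Hypothetical 3-reaction, 2-species example
a = np.array([2.0, 0.5, 1.2])
x = np.array([5, -1])                       # second species driven negative
R = np.array([[1, 0], [0, 1], [1, 1]])
print(apply_zero_reaction_rule(a, x, R))    # -> [2.0, 0.0, 0.0]
```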
- Applying the Newmark Method to the Discontinuous Deformation Analysis
  Peng, Bo (Virginia Tech, 2014-12-08)
  Discontinuous deformation analysis (DDA) is a newly developed simulation method for discontinuous systems. It was designed to simulate systems of arbitrarily shaped blocks with high efficiency while providing accurate solutions for energy dissipation. But DDA usually exhibits damping effects that are inconsistent with theoretical solutions. The underlying cause of these artificial damping effects has been an open question, and it is hypothesized that they could result from the time integration scheme. In this thesis two time integration methods are investigated: the forward Euler method and the Newmark method. The work begins by combining the Newmark method with DDA. An integrated Newmark method is also developed, in which velocity and acceleration do not need to be updated. In simulations, two of the most widely used models are adopted to test the forward Euler method and the Newmark method. The first is a sliding model, for which both the forward Euler method and the Newmark method give accurate solutions compared with analytical results. The second is an impacting model, for which the Newmark method has much better accuracy than the forward Euler method and exhibits minimal damping effects.
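For reference, the Newmark update itself (here with the average-acceleration parameters beta = 1/4, gamma = 1/2, which introduce no algorithmic damping) can be sketched for a single-degree-of-freedom linear system. The DDA coupling and block contact handling from the thesis are not represented, and the oscillator parameters are hypothetical.

```python
import numpy as np

def newmark_sdof(m, c, k, f, x0, v0, dt, n_steps, beta=0.25, gamma=0.5):
    """Newmark time integration for m*x'' + c*x' + k*x = f(t).

    beta = 1/4, gamma = 1/2 (average acceleration) is unconditionally stable
    and introduces no algorithmic damping.
    """
    x, v = x0, v0
    a = (f(0.0) - c * v - k * x) / m
    xs = [x]
    for n in range(1, n_steps + 1):
        t = n * dt
        # Effective stiffness and load for the implicit Newmark update
        k_eff = k + gamma * c / (beta * dt) + m / (beta * dt**2)
        f_eff = (f(t)
                 + m * (x / (beta * dt**2) + v / (beta * dt) + (0.5 / beta - 1.0) * a)
                 + c * (gamma * x / (beta * dt) + (gamma / beta - 1.0) * v
                        + dt * (gamma / (2.0 * beta) - 1.0) * a))
        x_new = f_eff / k_eff
        a_new = (x_new - x) / (beta * dt**2) - v / (beta * dt) - (0.5 / beta - 1.0) * a
        v_new = v + dt * ((1.0 - gamma) * a + gamma * a_new)
        x, v, a = x_new, v_new, a_new
        xs.append(x)
    return np.array(xs)

# Undamped oscillator: the computed amplitude should stay (nearly) constant
traj = newmark_sdof(m=1.0, c=0.0, k=4.0 * np.pi**2, f=lambda t: 0.0,
                    x0=1.0, v0=0.0, dt=0.01, n_steps=1000)
print(traj.max(), traj.min())
```

Running the same undamped oscillator with forward Euler at a comparable step size would show the amplitude drifting, which is the kind of discrepancy the thesis attributes to the time integration scheme.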
- The Art of Modeling and Simulation of Multiscale Biochemical Systems
  Pu, Yang (Virginia Tech, 2015-05-14)
  In this thesis we study modeling and simulation approaches for multiscale biochemical systems. The thesis addresses both modeling methods and simulation strategies. In the first part, we propose modeling methods to study the behavior of the insulin secretion pathway. We first expand the single-cell model proposed by Bertram et al. to model multiple cells. Synchronization among multiple cells is observed. Then an unhealthy cell model is proposed to study insulin secretion failure caused by weakening of mitochondrial function. By studying the interaction between the healthy and unhealthy cells, we find that insulin secretion can be reinstated by increasing the glucokinase level. This new discovery sheds light on antidiabetic medication. In order to study the stochastic dynamics of the insulin secretion pathway, we first apply the hybrid method to model the discrete events in the pathway. Based on the hybrid model, a probability-based measurement is proposed and applied to test the new antidiabetic remedy. In the second part, we focus on different simulation schemes for multiscale biochemical systems. We first propose a partitioning strategy for the hybrid method which leads to an efficient way of building stochastic cell cycle models. Then different implementation methods for the hybrid method are studied. A root finding method based on inverse interpolation is introduced to implement the hybrid method with three different ODE solvers. A detailed discussion of the performance of these three ODE solvers is presented. Last, we propose a new strategy to automatically detect stiffness and identify the species that cause stiffness for the tau-leaping method, as well as two stiffness reduction methods. The efficiency is demonstrated by applying this new strategy to a stiff decaying dimerization model and a heat shock protein regulation model.
- Bridging the Gap between Deterministic and Stochastic Modeling with Automatic Scaling and Conversion
  Wang, Pengyuan (Virginia Tech, 2008-05-07)
  During the past decade, many successful deterministic models of macromolecular regulatory networks have been built. Deterministic simulations of these models can show only the average dynamics of the systems. However, stochastic simulations of macromolecular regulatory models can account for behaviors that are introduced by the noisy nature of the systems but not revealed by deterministic simulations. Thus, converting an existing model of value from the most common deterministic formulation to one suitable for stochastic simulation enables further investigation of the regulatory network. Although many different stochastic models can be developed and evolved from deterministic models, a direct conversion is the first step in practice. This conversion process is tedious and error-prone, especially for complex models. Thus, we seek to automate as much of the conversion process as possible. However, deterministic models often omit key information necessary for a stochastic formulation. Specifically, values in the model have to be scaled before a complete conversion, and the scaling factors are typically not given in the deterministic model. Several functionalities helping with model scaling and conversion are introduced and implemented in the JigCell modeling environment. Our tool makes it easier for the modeler to include complete details as well as to convert the model. Stochastic simulations are known for being computationally intensive, and thus require high-performance computing facilities to be practical. With parallel computation on Virginia Tech's System X supercomputer, we are able to obtain the first stochastic simulation results for realistic cell cycle models. Stochastic simulation results for several mutants, which are thought to be biologically significant, are presented. Successful deployment of the enhanced modeling environment demonstrates the power of our techniques.
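The scaling step at the heart of such a conversion follows the standard mass-action relations between concentrations/deterministic rate constants and molecule counts/stochastic rate constants. The sketch below shows those textbook formulas (for unimolecular reactions and bimolecular reactions between distinct species); it is not a description of JigCell's internal routines, and the cell volume is an assumed value.

```python
# Standard mass-action scaling when converting a deterministic model to a
# stochastic one: concentrations become molecule counts and rate constants
# are rescaled by the system volume.
N_A = 6.02214076e23          # Avogadro's number (1/mol)

def scale_to_counts(conc_molar, volume_liters):
    """Concentration (mol/L) -> molecule count."""
    return conc_molar * volume_liters * N_A

def scale_rate_constant(k_det, order, volume_liters):
    """Deterministic rate constant -> stochastic rate constant c.

    order 1: c = k            (units 1/s)
    order 2: c = k / (N_A V)  (bimolecular, distinct species)
    """
    return k_det / (N_A * volume_liters) ** (order - 1)

V = 1e-15                                     # ~1 fL, an assumed yeast-scale volume
print(scale_to_counts(1e-7, V))               # 1e-7 M -> roughly 60 molecules
print(scale_rate_constant(1e6, 2, V))         # bimolecular k = 1e6 per (M s)
```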
- Cell cycle control and environmental response by second messengers in Caulobacter crescentus
  Xu, Chunrui; Weston, Bronson R.; Tyson, John J.; Cao, Yang (2020-09-30)
  Background: Second messengers, c-di-GMP and (p)ppGpp, are vital regulatory molecules in bacteria, influencing cellular processes such as biofilm formation, transcription, virulence, quorum sensing, and proliferation. While c-di-GMP and (p)ppGpp are both synthesized from GTP molecules, they play antagonistic roles in regulating the cell cycle. In C. crescentus, c-di-GMP works as a major regulator of pole morphogenesis and cell development. It inhibits cell motility and promotes S-phase entry by inhibiting the activity of the master regulator, CtrA. Intracellular (p)ppGpp accumulates under starvation, which helps bacteria survive stressful conditions by regulating nucleotide levels and halting proliferation. (p)ppGpp responds to nitrogen levels through RelA-SpoT homolog enzymes, detecting glutamine concentration using a nitrogen phosphotransferase system (PTS Ntr). This work relates the guanine nucleotide-based second messenger regulatory network to the bacterial PTS Ntr system and investigates how bacteria respond to nutrient availability. Results: We propose a mathematical model for the dynamics of c-di-GMP and (p)ppGpp in C. crescentus and analyze how the guanine nucleotide-based second messenger system responds to certain environmental changes communicated through the PTS Ntr system. Our mathematical model consists of seven ODEs describing the dynamics of nucleotides and PTS Ntr enzymes. Our simulations are consistent with experimental observations and suggest, among other predictions, that SpoT can effectively decrease c-di-GMP levels in response to nitrogen starvation just as well as it increases (p)ppGpp levels. Thus, the activity of SpoT (or its homologues in other bacterial species) can likely influence the cell cycle by influencing both c-di-GMP and (p)ppGpp. Conclusions: In this work, we integrate current knowledge and experimental observations from the literature to formulate a novel mathematical model. We analyze the model and demonstrate how the PTS Ntr system influences (p)ppGpp, c-di-GMP, GMP, and GTP concentrations. While this model does not consider all aspects of PTS Ntr signaling, such as cross-talk with the carbon PTS system, here we present our first effort to develop a model of nutrient signaling in C. crescentus.
- Cell Cycle Modeling for Budding Yeast with Stochastic Simulation Algorithms
  Ahn, Tae-Hyuk; Watson, Layne T.; Cao, Yang; Shaffer, Clifford A.; Baumann, William T. (Department of Computer Science, Virginia Polytechnic Institute & State University, 2008-11-01)
  For biochemical systems, where some chemical species are represented by small numbers of molecules, discrete and stochastic approaches are more appropriate than continuous and deterministic approaches. The continuous deterministic approach using ordinary differential equations is adequate for understanding the average behavior of cells, while the discrete stochastic approach accurately captures noisy events in the growth-division cycle. Since the emergence of the stochastic simulation algorithm (SSA) by Gillespie, alternative algorithms have been developed whose goal is to improve the computational efficiency of the SSA. This paper explains and empirically compares the performance of some of these SSA alternatives on a realistic model. The budding yeast cell cycle provides an excellent example of the need for modeling stochastic effects in mathematical modeling of biochemical reactions. This paper presents a stochastic approximation of the cell cycle for budding yeast using Gillespie's stochastic simulation algorithm. To compare the stochastic results with the average behavior, the simulation must be run thousands of times. Many of the proposed techniques to accelerate the SSA are not effective on the budding yeast problem, because of the scale of the problem or because underlying assumptions are not satisfied. A load balancing algorithm improved overall performance on a parallel supercomputer.
- Circuit Design Methods with Emerging Nanotechnologies
  Zheng, Yexin (Virginia Tech, 2009-12-08)
  As complementary metal-oxide semiconductor (CMOS) technology faces more and more severe physical barriers down the path of continuous feature-size scaling, innovative nano-scale devices and other post-CMOS technologies have been developed to enhance future circuit design and computation. These nanotechnologies have shown promising potential to achieve magnitude improvements in performance and integration density. The substitution of CMOS transistors with nano-devices is expected not only to continue along the exponential projection of Moore's Law, but also to raise significant challenges and opportunities, especially in the field of electronic design automation. The major obstacles that designers are experiencing with emerging nanotechnology design include: i) the existing computer-aided design (CAD) approaches in the context of conventional CMOS Boolean design cannot be directly employed in the nanoelectronic design process, because the intrinsic electrical characteristics of many nano-devices are not best suited for Boolean implementations but demonstrate strong capability for implementing non-conventional logic such as threshold logic and reversible logic; ii) due to the density and size factors of nano-devices, the defect rate of nanoelectronic systems is much higher than that of conventional CMOS systems, so existing design paradigms cannot guarantee design quality and may even lead to worse results with high failure ratios. Motivated by the compelling potential and design challenges of emerging post-CMOS technologies, this dissertation focuses on fundamental design methodologies to effectively and efficiently achieve high-quality nanoscale designs. A novel programmable logic element (PLE) is first proposed to explore the versatile functionalities of threshold gates (TGs) and multi-threshold threshold gates (MTTGs). This PLE structure can realize all three- or four-variable logic functions through configuring binary control bits. This is the first single threshold logic structure that provides complete Boolean logic implementation. Based on the PLEs, a reconfigurable architecture is constructed to offer dynamic reconfigurability with little or no reconfiguration overhead, thanks to the intrinsic self-latching property of nanopipelining. Our reconfiguration data generation algorithm can further reduce the reconfiguration cost. To fully take advantage of such threshold logic design using emerging nanotechnologies, we also developed a combinational equivalence checking (CEC) framework for threshold logic design. Based on the features of threshold logic gates and circuits, different techniques for formulating a given threshold logic circuit in conjunctive normal form (CNF) are introduced to facilitate efficient SAT-based verification. Evaluated on mainstream benchmarks, our hybrid algorithm, which takes into account both input symmetry and input weight order of threshold gates, can efficiently generate CNF formulas in terms of both SAT solving time and CNF generating time. Then the reversible logic synthesis problem is considered, with a focus on efficient synthesis heuristics that provide high-quality synthesis results within a reasonable computation time. We have developed a weighted directed graph model for function representation and complexity measurement. An atomic transformation is constructed to associate the function complexity variation with reversible gates.
  The efficiency of our heuristic lies in maximally decreasing the function complexity during synthesis steps as well as in the capability to climb out of local optima. Thereafter, swarm intelligence, a machine learning technique, is employed in the search space exploration for reversible logic synthesis, which achieves further performance improvement. To tackle the high defect rate of the emerging nanotechnology manufacturing process, we have developed a novel defect-aware logic mapping framework for nanowire-based PLA architectures via Boolean satisfiability (SAT). PLA defects of various types are formulated as covering and closure constraints. The defect-aware logic mapping is then solved efficiently using available SAT solvers. This approach can generate valid logic mappings with defect rates as high as 20%. The proposed method is universally suitable for various nanoscale PLAs, including AND/OR, NOR/NOR structures, etc. In summary, this work provides some initial attempts to address two major problems confronting future nanoelectronic system designs: the development of electronic design automation tools and reliability issues. However, many challenging open questions remain in this emerging and promising area. We hope our work can lay stepping stones for nano-scale circuit design optimization through exploiting the distinctive characteristics of emerging nanotechnologies.
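The building block behind the PLE, a threshold gate, fires when a weighted sum of Boolean inputs reaches a threshold. The sketch below shows only this basic gate model with a few classic weight/threshold choices; it does not reproduce the PLE structure, the MTTG variant, or the CNF encoding developed in the dissertation.

```python
from itertools import product

def threshold_gate(weights, threshold, inputs):
    """Threshold gate: outputs 1 when the weighted input sum reaches the threshold."""
    return int(sum(w * x for w, x in zip(weights, inputs)) >= threshold)

def truth_table(weights, threshold, n):
    """Enumerate the gate's Boolean function over all n-bit input combinations."""
    return {bits: threshold_gate(weights, threshold, bits)
            for bits in product((0, 1), repeat=n)}

# Classic examples: AND, OR, and 2-of-3 majority realized as single threshold gates
print(truth_table((1, 1), 2, 2))      # AND(a, b)
print(truth_table((1, 1), 1, 2))      # OR(a, b)
print(truth_table((1, 1, 1), 2, 3))   # MAJ(a, b, c)
```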
- Computational Framework for Uncertainty Quantification, Sensitivity Analysis and Experimental Design of Network-based Computer Simulation Models
  Wu, Sichao (Virginia Tech, 2017-08-29)
  When capturing a real-world, networked system in a simulation model, features are usually omitted or represented by probability distributions. Verification and validation (V and V) of such models is an inherent and fundamental challenge. Central to V and V, but also to model analysis and prediction, are uncertainty quantification (UQ), sensitivity analysis (SA), and design of experiments (DOE). In addition, network-based computer simulation models, as compared with models based on ordinary and partial differential equations (ODEs and PDEs), typically involve a significantly larger volume of more complex data. Efficient use of such models is challenging since it requires a broad set of skills ranging from domain expertise to in-depth knowledge of modeling, programming, algorithmics, high-performance computing, statistical analysis, and optimization. On top of this, the need to support reproducible experiments necessitates complete data tracking and management. Finally, the lack of standardization of simulation model configuration formats presents an extra challenge when developing technology intended to work across models. While there are tools and frameworks that address parts of the challenges above, to the best of our knowledge, none of them accomplishes all this in a model-independent and scientifically reproducible manner. In this dissertation, we present a computational framework called GENEUS that addresses these challenges. Specifically, it incorporates (i) a standardized model configuration format, (ii) a data flow management system with digital library functions helping to ensure scientific reproducibility, and (iii) a model-independent, expandable plugin-type library for efficiently conducting UQ/SA/DOE for network-based simulation models. This framework has been applied to systems ranging from fundamental graph dynamical systems (GDSs) to large-scale socio-technical simulation models, with a broad range of analyses such as UQ and parameter studies for various scenarios. Graph dynamical systems provide a theoretical framework for network-based simulation models and have been studied theoretically in this dissertation. This includes a broad range of stability and sensitivity analyses offering insights into how GDSs respond to perturbations of their key components. This stability-focused, structure-to-function theory was a motivator for the design and implementation of GENEUS. GENEUS, rooted in the framework of GDSs, provides modelers, experimentalists, and research groups access to a variety of UQ/SA/DOE methods with robust and tested implementations, without requiring them to have detailed expertise in statistics, data management, and computing. Even for research teams having all the skills, GENEUS can significantly increase research productivity.
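A design-of-experiments pass of the kind such a plugin library might expose can be illustrated with plain Latin hypercube sampling over a simulation model's parameter box. The parameter names and ranges below are hypothetical, and the code is a generic sketch, not GENEUS's API.

```python
import numpy as np

def latin_hypercube(n_samples, bounds, rng=None):
    """Latin hypercube design: one sample per equal-probability stratum in each dimension."""
    rng = rng or np.random.default_rng(0)
    bounds = np.asarray(bounds, dtype=float)     # shape (d, 2): [low, high] per parameter
    d = len(bounds)
    u = np.empty((n_samples, d))
    for j in range(d):
        # Shuffle the strata, then place one uniform draw inside each stratum
        u[:, j] = (rng.permutation(n_samples) + rng.uniform(size=n_samples)) / n_samples
    return bounds[:, 0] + u * (bounds[:, 1] - bounds[:, 0])

# Hypothetical 3-parameter study: transmission rate, recovery rate, seeding fraction
design = latin_hypercube(8, [[0.1, 0.9], [0.05, 0.5], [0.001, 0.01]])
for row in design:
    print(row)    # each row is one simulation configuration to run
```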
- Computational modeling of unphosphorylated CtrA:Cori binding in the Caulobacter cell cycle
  Weston, Bronson R.; Tyson, John J.; Cao, Yang (Cell Press, 2021-12-17)
  In the alphaproteobacterium Caulobacter crescentus, phosphorylated CtrA (CtrA~P), a master regulatory protein, binds directly to the chromosome origin (Cori) to inhibit DNA replication. Using a mathematical model of CtrA binding at Cori site [d], we provide computational evidence that CtrA(U) can displace CtrA~P from Cori at the G1-S transition. Investigation of this interaction within a detailed model of the C. crescentus cell cycle suggests that CckA phosphatase may clear Cori of CtrA~P by altering the [CtrA(U)]/[CtrA~P] ratio rather than by completely depleting CtrA~P. Model analysis reveals that the mechanism allows for a speedier transition into S phase, stabilizes the timing of chromosome replication under fluctuating rates of CtrA proteolysis, and may contribute to the viability of numerous mutant strains. Overall, these results suggest that CtrA(U) enhances the robustness of chromosome replication. More generally, our proposed regulation of CtrA:Cori dynamics may represent a novel motif for molecular signaling in cell physiology.
- Computational Tools for Chemical Data Assimilation with CMAQ
  Gou, Tianyi (Virginia Tech, 2010-01-11)
  The Community Multiscale Air Quality (CMAQ) system is the Environmental Protection Agency's main modeling tool for atmospheric pollution studies. CMAQ-ADJ, the adjoint model of CMAQ, offers new analysis capabilities such as receptor-oriented sensitivity analysis and chemical data assimilation. This thesis presents the construction, validation, and properties of new adjoint modules in CMAQ, and illustrates their use in sensitivity analyses and data assimilation experiments. The new discrete adjoint module for advection is implemented with the aid of the automatic differentiation tool TAMC and is fully validated by comparing the adjoint sensitivities with finite difference values. In addition, adjoint sensitivities with respect to boundary conditions and boundary condition scaling factors are developed and validated in CMAQ. To investigate numerically the impact of the continuous and discrete advection adjoints on data assimilation, various four-dimensional variational (4D-Var) data assimilation experiments are carried out with the 1D advection PDE, and with CMAQ advection using synthetic and real observation data. The results show that the optimization procedure gives better estimates of the reference initial condition and converges faster when using gradients computed by the continuous adjoint approach. This counter-intuitive result is explained using the nonlinearity properties of the piecewise parabolic method (the numerical discretization of advection in CMAQ). Data assimilation experiments are carried out using real observation data. The simulation domain encompasses Texas and the simulation period is August 30 to September 1, 2006. Data assimilation is used to improve both initial and boundary conditions. These experiments further validate the tools developed in this thesis.
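The validation strategy mentioned above, comparing adjoint sensitivities against finite differences, can be sketched generically: for random directions, the directional derivative implied by the adjoint gradient should agree with a central finite difference of the cost functional. The quadratic toy cost below stands in for a CMAQ-type cost functional.

```python
import numpy as np

def check_adjoint_gradient(forward, gradient, x, n_dirs=5, eps=1e-6, rng=None):
    """Compare an adjoint-computed gradient with central finite differences.

    forward  : scalar cost functional J(x)
    gradient : adjoint-based gradient of J at x
    Returns the relative error for each random direction; small values
    indicate a consistent adjoint.
    """
    rng = rng or np.random.default_rng(0)
    g = gradient(x)
    errs = []
    for _ in range(n_dirs):
        d = rng.standard_normal(x.shape)
        d /= np.linalg.norm(d)
        fd = (forward(x + eps * d) - forward(x - eps * d)) / (2.0 * eps)
        adj = g @ d
        errs.append(abs(fd - adj) / max(abs(fd), 1e-30))
    return np.array(errs)

# Toy quadratic cost standing in for a chemical-transport cost functional
A = np.diag([1.0, 2.0, 3.0])
forward = lambda x: 0.5 * x @ A @ x
gradient = lambda x: A @ x
print(check_adjoint_gradient(forward, gradient, np.ones(3)))
```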
- Effective and Efficient Methodologies for Social Network Analysis
  Pan, Long (Virginia Tech, 2007-12-11)
  Performing social network analysis (SNA) requires a set of powerful techniques to analyze structural information contained in interactions between social entities. Many SNA technologies and methodologies have been developed and have successfully provided significant insights for small-scale interactions. However, these techniques are not suitable for analyzing large social networks, which are very popular and important in various fields and have special structural properties that cannot be obtained from small networks or their analyses. A number of issues need to be further studied in the design of current SNA techniques, and the key issues can be embodied in three fundamental and critical challenges: long processing time, large computational resource requirements, and network dynamism. In order to address these challenges, we discuss an anytime-anywhere methodology based on a parallel/distributed computational framework to effectively and efficiently analyze large and dynamic social networks. In our methodology, large social networks are decomposed into intra-related smaller parts. A coarse level of network analysis is built by comprehensively analyzing each part. The partial analysis results are incrementally refined over time. Also, during the analysis process, dynamic network changes are accommodated effectively and efficiently based on the obtained results. In order to evaluate and validate our methodology, we implement it for a set of SNA metrics which are significant for SNA applications and cover a wide range of difficulties. Through rigorous theoretical and experimental analyses, we demonstrate that our anytime-anywhere methodology is effective and efficient for analyzing large and dynamic social networks.
- Efficient Computational Tools for Variational Data Assimilation and Information Content Estimation
  Singh, Kumaresh (Virginia Tech, 2010-08-10)
  The overall goals of this dissertation are to advance the field of chemical data assimilation and to develop efficient computational tools that allow the atmospheric science community to benefit from state-of-the-art assimilation methodologies. Data assimilation is the procedure of combining data from observations with model predictions to obtain a more accurate representation of the state of the atmosphere. As models become more complex, determining the relationships between pollutants and their sources and sinks becomes computationally more challenging. The construction of an adjoint model (capable of efficiently computing sensitivities of a few model outputs with respect to many input parameters) is a difficult, labor intensive, and error prone task. This work develops adjoint systems for two of the most widely used chemical transport models: Harvard's GEOS-Chem global model and the Environmental Protection Agency's CMAQ regional air quality model. Both the GEOS-Chem and CMAQ adjoint models are now used by the atmospheric science community to perform sensitivity analysis and data assimilation studies. Despite the continuous increase in capabilities, models remain imperfect and models alone cannot provide accurate long term forecasts. Observations of the atmospheric composition are now routinely taken from sondes, ground stations, aircraft, satellites, etc. This work develops three and four dimensional variational data assimilation capabilities for GEOS-Chem and CMAQ which allow the estimation of chemical states that best fit the observed reality. Most data assimilation systems to date use diagonal approximations of the background covariance matrix, which ignore error correlations and may lead to inaccurate estimates. This dissertation develops computationally efficient representations of covariance matrices that make it possible to capture spatial error correlations in data assimilation. Not all observations used in data assimilation are of equal importance. Erroneous and redundant observations not only affect the quality of an estimate but also add unnecessary computational expense to the assimilation system. This work proposes techniques to quantify the information content of observations used in assimilation; information-theoretic metrics are used. The four dimensional variational approach to data assimilation provides accurate estimates but requires an adjoint construction and considerable computational resources. This work studies versions of the four dimensional variational method (quasi 4D-Var) that use approximate gradients and are less expensive to develop and run. Variational and Kalman filter approaches are both used in data assimilation, but their relative merits and disadvantages in the context of chemical data assimilation have not been assessed. This work provides a careful comparison on a chemical assimilation problem with real data sets. The assimilation experiments performed here demonstrate for the first time the benefit of using satellite data to improve estimates of tropospheric ozone.
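Information-theoretic metrics of the kind referred to above are commonly stated in terms of the background and analysis error covariances. Two standard examples are shown below in generic notation ($\mathbf{B}$: background error covariance, $\mathbf{A}$: analysis error covariance, $n$: state dimension), without claiming these are exactly the metrics adopted in the dissertation.

```latex
\[
d_s \;=\; \operatorname{trace}\!\left(\mathbf{I}_n - \mathbf{A}\,\mathbf{B}^{-1}\right)
\qquad\text{(degrees of freedom for signal)},
\]
\[
\mathcal{I} \;=\; \tfrac{1}{2}\,\ln \det\!\left(\mathbf{B}\,\mathbf{A}^{-1}\right)
\qquad\text{(Shannon information content, i.e., entropy reduction)}.
\]
```

Both quantities are zero when the observations leave the analysis uncertainty equal to the background uncertainty, and they grow as the observations constrain more directions of the state space, which is what makes them useful for ranking or pruning observations.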
- Efficient formulation and implementation of ensemble based methods in data assimilation
  Nino Ruiz, Elias David (Virginia Tech, 2016-01-11)
  Ensemble-based methods have gained widespread popularity in the field of data assimilation. An ensemble of model realizations encapsulates information about the error correlations driven by the physics and the dynamics of the numerical model. This information can be used to obtain improved estimates of the state of non-linear dynamical systems such as the atmosphere and/or the ocean. This work develops efficient ensemble-based methods for data assimilation. A major bottleneck in ensemble Kalman filter (EnKF) implementations is the solution of a linear system at each analysis step. To alleviate it, an EnKF implementation based on an iterative Sherman-Morrison formula is proposed. The rank deficiency of the ensemble covariance matrix is exploited in order to efficiently compute the analysis increments during the assimilation process. The computational effort of the proposed method is comparable to that of the best EnKF implementations found in the current literature. The stability of the new algorithm is proven theoretically based on the positiveness of the data error covariance matrix. In order to improve the background error covariance matrices in ensemble-based data assimilation, we explore the use of shrinkage covariance matrix estimators computed from ensembles. The resulting filter has attractive features in terms of both memory usage and computational complexity. Numerical results show that it performs better than traditional EnKF formulations. In geophysical applications the correlations between errors corresponding to distant model components decrease rapidly with distance. We propose a new and efficient implementation of the EnKF based on a modified Cholesky decomposition for inverse covariance matrix estimation. This approach exploits the conditional independence of background errors between distant model components with regard to a predefined radius of influence. Consequently, sparse estimators of the inverse background error covariance matrix can be obtained. This implies huge memory savings during the assimilation process under realistic weather forecast scenarios. Rigorous error bounds for the resulting estimator in the context of data assimilation are proven theoretically. The conclusion is that the resulting estimator converges to the true inverse background error covariance matrix when the ensemble size is of the order of the logarithm of the number of model components. We explore high-performance implementations of the proposed EnKF algorithms. When the observational operator can be locally approximated for different regions of the domain, efficient parallel implementations of the EnKF formulations presented in this dissertation can be obtained. The parallel computation of the analysis increments is performed making use of domain decomposition. Local analysis increments are computed on (possibly) different processors. Once all local analysis increments have been computed, they are mapped back onto the global domain to recover the global analysis.
  Tests performed with an atmospheric general circulation model at T-63 resolution, varying the number of processors from 96 to 2,048, reveal that the assimilation time can be decreased multiple fold for all the proposed EnKF formulations. Ensemble-based methods can also be used to reformulate strong-constraint four dimensional variational data assimilation so as to avoid the construction of adjoint models, which can be complicated for operational models. We propose a trust region approach based on ensembles in which the analysis increments are computed in the space spanned by an ensemble of snapshots. The quality of the resulting increments in the ensemble space is compared against the gains in the full space. Decisions on whether to accept or reject solutions rely on trust region updating formulas. Results based on an atmospheric general circulation model at T-42 resolution reveal that this methodology can improve the analysis accuracy.
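For orientation, a plain stochastic (perturbed-observations) EnKF analysis step is sketched below. The dissertation's contributions (the iterative Sherman-Morrison solver, shrinkage and modified-Cholesky covariance estimators, and domain-decomposed parallelism) would replace the direct covariance construction and linear solve shown here. All dimensions and values are hypothetical.

```python
import numpy as np

def enkf_analysis(X, y, H, R, rng=None):
    """Stochastic (perturbed-observations) EnKF analysis step.

    X : n x N ensemble of forecast states (columns are members)
    y : m observation vector, H : m x n observation operator, R : m x m obs covariance
    The Kalman gain is applied through a direct solve here; faster,
    structure-exploiting formulations replace this step in practice.
    """
    rng = rng or np.random.default_rng(0)
    n, N = X.shape
    A = X - X.mean(axis=1, keepdims=True)              # ensemble anomalies
    Pf = A @ A.T / (N - 1)                             # sample covariance (rank <= N-1)
    S = H @ Pf @ H.T + R                               # innovation covariance
    K = Pf @ H.T @ np.linalg.solve(S, np.eye(len(y)))  # Kalman gain
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, size=N).T
    return X + K @ (Y - H @ X)                         # analysis ensemble

# Tiny example: 4-variable state, 2 observations, 20 members (hypothetical values)
rng = np.random.default_rng(1)
X = rng.standard_normal((4, 20)) + 2.0
H = np.array([[1.0, 0, 0, 0], [0, 0, 1.0, 0]])
R = 0.25 * np.eye(2)
Xa = enkf_analysis(X, np.array([1.5, 2.5]), H, R, rng)
print(Xa.mean(axis=1))
```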