Computational Science Laboratory

Permanent URI for this community

https://hdl.handle.net/10919/102074

The mission of the Computational Science Laboratory (CSL) is to develop innovative computational solutions for complex real-world problems, and to foster a productive research and education environment emphasizing collaboration and innovation.

Advanced Sampling Methods for Solving Large-Scale Inverse Problems
Attia, Ahmed Mohamed Mohamed (Virginia Tech, 2016-09-19)
Ensemble and variational techniques have gained wide popularity as the two main approaches for solving data assimilation and inverse problems. The majority of the methods in these two approaches are derived (at least implicitly) under the assumption that the underlying probability distributions are Gaussian. It is well accepted, however, that the Gaussianity assumption is too restrictive when applied to large nonlinear models, nonlinear observation operators, and large levels of uncertainty. This work develops a family of fully non-Gaussian data assimilation algorithms that work by directly sampling the posterior distribution. The sampling strategy is based on a Hybrid/Hamiltonian Monte Carlo (HMC) approach that can handle non-normal probability distributions. The first algorithm proposed in this work is the "HMC sampling filter", an ensemble-based data assimilation algorithm for solving the sequential filtering problem. Unlike traditional ensemble-based filters, such as the ensemble Kalman filter and the maximum likelihood ensemble filter, the proposed sampling filter naturally accommodates non-Gaussian errors and nonlinear model dynamics, as well as nonlinear observations. To test the capabilities of the HMC sampling filter, numerical experiments are carried out using the Lorenz-96 model and observation operators with different levels of nonlinearity and differentiability. The filter is also tested with a shallow water model on the sphere with a linear observation operator. Numerical results show that the sampling filter performs well even in highly nonlinear situations where the traditional filters diverge. Next, the HMC sampling approach is extended to the four-dimensional case, where several observations are assimilated simultaneously, resulting in the second member of the proposed family of algorithms.
The new algorithm, named "HMC sampling smoother", is an ensemble-based smoother for four-dimensional data assimilation that works by sampling from the posterior probability density of the solution at the initial time. The sampling smoother naturally accommodates non-Gaussian errors and nonlinear model dynamics and observation operators, and provides a full description of the posterior distribution. Numerical experiments for this algorithm are carried out using a shallow water model on the sphere with observation operators of different levels of nonlinearity. The numerical results demonstrate the advantages of the proposed method compared to the traditional variational and ensemble-based smoothing methods. The HMC sampling smoother, in its original formulation, is computationally expensive due to the innate requirement of running the forward and adjoint models repeatedly. The proposed family of algorithms is therefore extended with computationally efficient versions of the HMC sampling smoother based on reduced-order approximations of the underlying model dynamics. The reduced-order HMC sampling smoothers, developed as extensions to the original HMC smoother, are tested numerically using the shallow-water equations model in Cartesian coordinates. The results reveal that the reduced-order versions of the smoother are capable of accurately capturing the posterior probability density, while being significantly faster than the original full-order formulation. In the presence of nonlinear model dynamics, a nonlinear observation operator, or non-Gaussian errors, the prior distribution in the sequential data assimilation framework is not analytically tractable. In the original formulation of the HMC sampling filter, the prior distribution is approximated by a Gaussian distribution whose parameters are inferred from the ensemble of forecasts. This work relaxes the Gaussian prior assumption of the original HMC filter.
Specifically, a clustering step is introduced after the forecast phase of the filter, and the prior density function is estimated by fitting a Gaussian Mixture Model (GMM) to the prior ensemble. The base filter developed following this strategy is named cluster HMC sampling filter (ClHMC). A multi-chain version of the ClHMC filter, namely MC-ClHMC, is also proposed to guarantee that samples are taken from the vicinities of all probability modes of the formulated posterior. These methodologies are tested using a quasi-geostrophic (QG) model with double-gyre wind forcing and bi-harmonic friction. Numerical results demonstrate the usefulness of using GMMs to relax the Gaussian prior assumption in the HMC filtering paradigm. To provide a unified platform for data assimilation research, a flexible and highly extensible testing suite, named DATeS, is developed and described in this work. The core of DATeS is implemented in Python to enable object-oriented capabilities. The main components, such as the models, the data assimilation algorithms, the linear algebra solvers, and the time discretization routines, are independent of each other, so as to offer maximum flexibility to configure data assimilation studies.
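As an illustration of the sampling idea described in the abstract above, the following is a minimal sketch of a Hamiltonian Monte Carlo sampler for a scalar state with a Gaussian prior and a possibly nonlinear observation operator. It is not taken from the dissertation or from DATeS: the function names (`make_posterior`, `hmc_sample`), the unit mass, the step size, the number of leapfrog steps, and the `tanh` observation operator in the demo are all assumptions chosen for the example.

```python
import math
import random


def make_posterior(xb, b_var, y, r_var, h, dh):
    """Build the negative log-posterior and its gradient (hypothetical helper).

    Prior: x ~ N(xb, b_var). Observation: y = h(x) + noise, noise ~ N(0, r_var).
    """
    def neg_log_post(x):
        return 0.5 * (x - xb) ** 2 / b_var + 0.5 * (y - h(x)) ** 2 / r_var

    def grad(x):
        return (x - xb) / b_var - dh(x) * (y - h(x)) / r_var

    return neg_log_post, grad


def hmc_sample(neg_log_post, grad, x0, n_samples, step=0.1, n_leapfrog=10, seed=0):
    """Draw samples from exp(-neg_log_post) with a basic HMC chain (unit mass)."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)  # resample the auxiliary momentum
        h_old = neg_log_post(x) + 0.5 * p * p
        x_new, p_new = x, p
        # Leapfrog integration: half momentum step, full steps, half momentum step.
        p_new -= 0.5 * step * grad(x_new)
        for i in range(n_leapfrog):
            x_new += step * p_new
            if i < n_leapfrog - 1:
                p_new -= step * grad(x_new)
        p_new -= 0.5 * step * grad(x_new)
        h_new = neg_log_post(x_new) + 0.5 * p_new * p_new
        # Metropolis test on the change in total energy.
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            x = x_new
        samples.append(x)
    return samples


if __name__ == "__main__":
    # Demo with an assumed nonlinear observation operator h(x) = tanh(x).
    nlp, g = make_posterior(
        xb=0.0, b_var=1.0, y=0.8, r_var=0.1,
        h=math.tanh, dh=lambda x: 1.0 - math.tanh(x) ** 2,
    )
    chain = hmc_sample(nlp, g, x0=0.0, n_samples=2000)[200:]
    print("posterior mean estimate:", sum(chain) / len(chain))
```

The samples play the role of an analysis ensemble. In a realistic setting the state is high-dimensional, the gradient requires an adjoint model, and the step size and trajectory length need tuning.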
Efficient formulation and implementation of ensemble based methods in data assimilation
Nino Ruiz, Elias David (Virginia Tech, 2016-01-11)
Ensemble-based methods have gained widespread popularity in the field of data assimilation. An ensemble of model realizations encapsulates information about the error correlations driven by the physics and the dynamics of the numerical model. This information can be used to obtain improved estimates of the state of non-linear dynamical systems such as the atmosphere and/or the ocean. This work develops efficient ensemble-based methods for data assimilation. A major bottleneck in ensemble Kalman filter (EnKF) implementations is the solution of a linear system at each analysis step. To alleviate it, an EnKF implementation based on an iterative Sherman-Morrison formula is proposed. The rank deficiency of the ensemble covariance matrix is exploited in order to efficiently compute the analysis increments during the assimilation process. The computational effort of the proposed method is comparable to that of the best EnKF implementations found in the current literature. The stability of the new algorithm is proven theoretically based on the positiveness of the data error covariance matrix. In order to improve the background error covariance matrices in ensemble-based data assimilation, we explore the use of shrinkage covariance matrix estimators from ensembles. The resulting filter has attractive features in terms of both memory usage and computational complexity. Numerical results show that it performs better than traditional EnKF formulations. In geophysical applications, the correlations between errors corresponding to distant model components decrease rapidly with the distance. We propose a new and efficient implementation of the EnKF based on a modified Cholesky decomposition for inverse covariance matrix estimation. This approach exploits the conditional independence of background errors between distant model components with regard to a predefined radius of influence.
Consequently, sparse estimators of the inverse background error covariance matrix can be obtained. This implies huge memory savings during the assimilation process under realistic weather forecast scenarios. Rigorous error bounds for the resulting estimator in the context of data assimilation are theoretically proved. The conclusion is that the resulting estimator converges to the true inverse background error covariance matrix when the ensemble size is of the order of the logarithm of the number of model components. We explore high-performance implementations of the proposed EnKF algorithms. When the observational operator can be locally approximated for different regions of the domain, efficient parallel implementations of the EnKF formulations presented in this dissertation can be obtained. The parallel computation of the analysis increments is performed making use of domain decomposition. Local analysis increments are computed on (possibly) different processors. Once all local analysis increments have been computed, they are mapped back onto the global domain to recover the global analysis. Tests performed with an atmospheric general circulation model at a T-63 resolution, and varying the number of processors from 96 to 2,048, reveal that the assimilation time can be decreased multiple fold for all the proposed EnKF formulations. Ensemble-based methods can be used to reformulate strong-constraint four-dimensional variational data assimilation so as to avoid the construction of adjoint models, which can be complicated for operational models. We propose a trust region approach based on ensembles in which the analysis increments are computed in the space of an ensemble of snapshots. The quality of the resulting increments in the ensemble space is compared against the gains in the full space. Decisions on whether to accept or reject solutions rely on trust region updating formulas.
Results based on an atmospheric general circulation model with a T-42 resolution reveal that this methodology can improve the analysis accuracy.
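The linear system solved at each EnKF analysis step typically has the form (R + sum_k v_k v_k^T) z = d, where R is the observation error covariance and each v_k is a scaled ensemble anomaly mapped to observation space. The sketch below shows one way such a system can be solved by applying the Sherman-Morrison formula one rank-one term at a time. It is an illustrative reconstruction under stated assumptions (diagonal R, plain Python lists, hypothetical function names), not the dissertation's implementation.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def sherman_morrison_solve(r_diag, vs, d):
    """Solve (diag(r_diag) + sum_k v_k v_k^T) z = d by successive rank-one updates.

    r_diag : positive diagonal of R (assumed diagonal for this sketch)
    vs     : list of vectors v_k
    d      : right-hand side
    """
    # Start from W_0 = R: z = R^{-1} d and u_j = R^{-1} v_j.
    z = [di / ri for di, ri in zip(d, r_diag)]
    us = [[vi / ri for vi, ri in zip(v, r_diag)] for v in vs]
    for k, v in enumerate(vs):
        u_k = us[k]  # equals W_{k-1}^{-1} v_k at this point
        denom = 1.0 + dot(v, u_k)  # at least 1 when R is positive definite
        coeff = dot(v, z) / denom
        z = [zi - coeff * ui for zi, ui in zip(z, u_k)]
        # Update the remaining u_j so that they equal W_k^{-1} v_j.
        for j in range(k + 1, len(vs)):
            c = dot(v, us[j]) / denom
            us[j] = [uj - c * ui for uj, ui in zip(us[j], u_k)]
    return z


if __name__ == "__main__":
    # Small made-up example: 3 observations, 2 ensemble anomaly vectors.
    z = sherman_morrison_solve([1.0, 2.0, 4.0], [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]], [1.0, 2.0, 3.0])
    print(z)
```

With N ensemble members and m observations, this loop costs on the order of N^2 m operations and never forms the m-by-m matrix explicitly. Every denominator is at least 1 whenever R is positive definite, which is the kind of property a stability argument based on the positiveness of R can rely on.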
