Efficient computer experiment designs for Gaussian process surrogates

dc.contributor.author: Cole, David Austin
dc.contributor.committeechair: Gramacy, Robert B.
dc.contributor.committeemember: Deng, Xinwei
dc.contributor.committeemember: House, Leanna L.
dc.contributor.committeemember: Higdon, David
dc.description.abstract: Due to advancements in supercomputing and algorithms for finite element analysis, today's computer simulation models often contain complex calculations that can yield a wealth of knowledge. Gaussian processes (GPs) are highly desirable models for computer experiments because of their predictive accuracy and uncertainty quantification. This dissertation addresses GP modeling when data abound, as well as GP adaptive design when simulator expense severely limits the amount of data that can be collected. For data-rich problems, I introduce a localized sparse-covariance GP that preserves the flexibility and predictive accuracy of a GP's predictive surface while saving computational time. This locally induced Gaussian process (LIGP) incorporates latent design points, called inducing points, into a local Gaussian process built from a subset of the data. Various methods are introduced for the design of the inducing points. LIGP is then extended to handle stochastic data with replicates, estimating noise while relying only on the unique design locations for computation. I also address the goal of identifying a contour when data-collection resources are limited, through entropy-based adaptive design. Unlike existing methods, the entropy-based contour locator (ECL) adaptive design promotes exploration of the design space, performing well in higher dimensions and when the contour corresponds to a high or low quantile. ECL adaptive design can be combined with importance sampling to reduce uncertainty in reliability estimation.
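The locally induced GP idea described in the abstract (a local data subset combined with a small set of inducing points) can be illustrated with a minimal subset-of-regressors sketch. This is hypothetical code, not the dissertation's LIGP implementation; the neighborhood size, inducing-point placement, kernel, and nugget are all assumptions for illustration:

```python
import numpy as np

def local_inducing_gp_mean(Xstar, X, y, n_local=50, n_induce=10, theta=0.5, g=1e-4):
    """Predictive mean at Xstar from a local data subset plus inducing points.

    A subset-of-regressors style sketch in the spirit of LIGP; a hypothetical
    simplification, not the dissertation's actual method or code.
    """
    # Local neighborhood: the n_local nearest design points to Xstar.
    d = np.sum((X - Xstar) ** 2, axis=1)
    idx = np.argsort(d)[:n_local]
    Xn, yn = X[idx], y[idx]
    # Inducing points: evenly strided neighbors, a crude stand-in for the
    # inducing-point design methods the dissertation develops.
    Xm = Xn[:: max(1, n_local // n_induce)][:n_induce]
    # Squared-exponential kernel with lengthscale theta (an assumption).
    k = lambda A, B: np.exp(-np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=2) / theta)
    Kmm = k(Xm, Xm) + g * np.eye(len(Xm))
    Knm = k(Xn, Xm)
    ksm = k(Xstar[None, :], Xm)
    # Subset-of-regressors predictive mean: k*m (g Kmm + Kmn Knm)^{-1} Kmn y.
    A = g * Kmm + Knm.T @ Knm
    sol, *_ = np.linalg.lstsq(A, Knm.T @ yn, rcond=None)  # robust to ill-conditioning
    return (ksm @ sol).item()
```

The payoff is computational: the linear solve is only n_induce by n_induce, rather than cubic in the full data size, while the local subset keeps the fit flexible.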
dc.description.abstractgeneral: Due to advancements in supercomputing and physics-based algorithms, today's computer simulation models often contain complex calculations that can produce far larger amounts of data than physical experiments can. Computer experiments conducted with simulation models are sought-after ways to gather knowledge about physical problems, but they come with design and modeling challenges. In this dissertation, I address both data-size extremes: building prediction models with large data sets, and designing computer experiments when scarce resources limit the amount of data. For the former, I introduce a strategy of constructing a series of models, each built from a small subset of the observed data along with a set of unobserved data locations (inducing points). This methodology can also perform its calculations using only the unique data locations when replicates exist in the data. The locally induced model produces accurate predictions while saving computing time. Various methods are introduced to decide the locations of these inducing points. The focus then shifts to designing an experiment for the purpose of accurate prediction around a particular output quantity of interest (a contour). An experimental design approach is detailed that selects new sample locations one at a time via a function that maximizes the information gained about the contour region for the overall model. This work is combined with an existing method to estimate the true volume of the contour.
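The one-at-a-time, information-gain selection described above can be made concrete with a simplified entropy criterion. The snippet below is a stand-in in the spirit of ECL, not the dissertation's exact formulation: it assumes a GP posterior summarized by a mean and standard deviation at each candidate, and scores candidates by the binary entropy of the posterior probability of exceeding the contour threshold (function names and the precise criterion are assumptions):

```python
import numpy as np
from math import erf, sqrt

def contour_entropy(mu, sd, threshold):
    """Binary entropy of the posterior probability of exceeding `threshold`.

    A simplified entropy-based contour criterion (hypothetical sketch).
    Entropy peaks where the GP is most uncertain about which side of the
    contour the response falls on, i.e. where p is near 0.5.
    """
    mu, sd = np.asarray(mu, float), np.asarray(sd, float)
    # Gaussian CDF via math.erf, applied elementwise.
    p = np.array([0.5 * (1.0 + erf((m - threshold) / (s * sqrt(2.0))))
                  for m, s in zip(mu, sd)])
    p = np.clip(p, 1e-12, 1.0 - 1e-12)  # guard log(0)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))

def next_design_point(Xcand, mu, sd, threshold):
    """One-at-a-time acquisition: the candidate with maximal entropy."""
    return Xcand[np.argmax(contour_entropy(mu, sd, threshold))]
```

After each acquisition the GP would be refit with the new run and the criterion re-evaluated; because the posterior standard deviation enters the probability, unexplored regions retain nonzero entropy, which is one way exploration can be encouraged.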
dc.description.degree: Doctor of Philosophy
dc.publisher: Virginia Tech
dc.rights: In Copyright
dc.subject: inducing points
dc.subject: active learning
dc.subject: big data
dc.title: Efficient computer experiment designs for Gaussian process surrogates
thesis.degree.grantor: Virginia Polytechnic Institute and State University
thesis.degree.name: Doctor of Philosophy


Original bundle: 1 file (Adobe Portable Document Format, 4.4 MB)