Browsing by Author "Ferreira, Marco A. R."
- Applying an Intrinsic Conditional Autoregressive Reference Prior for Areal Data. Porter, Erica May (Virginia Tech, 2019-07-09). Bayesian hierarchical models are useful for modeling spatial data because they have flexibility to accommodate complicated dependencies that are common to spatial data. In particular, intrinsic conditional autoregressive (ICAR) models are commonly assigned as priors for spatial random effects in hierarchical models for areal data corresponding to spatial partitions of a region. However, selection of prior distributions for these spatial parameters presents a challenge to researchers. We present and describe ref.ICAR, an R package that implements an objective Bayes intrinsic conditional autoregressive prior on a vector of spatial random effects. This model provides an objective Bayesian approach for modeling spatially correlated areal data. ref.ICAR enables analysis of spatial areal data for a specified region, given user-provided data and information about the structure of the study region. The ref.ICAR package performs Markov Chain Monte Carlo (MCMC) sampling and outputs posterior medians, intervals, and trace plots for fixed effect and spatial parameters. Finally, the functions provide regional summaries, including medians and credible intervals for fitted values by subregion.
- Bayesian Analysis of Temporal and Spatio-temporal Multivariate Environmental Data. El Khouly, Mohamed Ibrahim (Virginia Tech, 2019-05-09). High dimensional space-time datasets are available nowadays in various aspects of life such as economy, agriculture, health, environment, etc. Meanwhile, it is challenging to reveal possible connections between climate change and extreme weather events such as hurricanes or tornadoes. In particular, the relationship between tornado occurrence and climate change has remained elusive. Moreover, modeling multivariate spatio-temporal data is computationally expensive. There is a great need for computationally feasible models that account for temporal, spatial, and inter-variable dependence. Our research focuses on those areas in two ways. First, we investigate connections between changes in tornado risk and the increase in atmospheric instability over Oklahoma. Second, we propose two multiscale spatio-temporal models, one for multivariate Gaussian data, and the other for matrix-variate Gaussian data. Those frameworks are novel additions to the existing literature on Bayesian multiscale models. In addition, we have proposed parallelizable MCMC algorithms to sample from the posterior distributions of the model parameters with enhanced computations.
- Bayesian Model Selection for Spatial Data and Cost-constrained Applications. Porter, Erica May (Virginia Tech, 2023-07-03). Bayesian model selection is a useful tool for identifying an appropriate model class, dependence structure, and valuable predictors for a wide variety of applications. In this work we consider objective Bayesian model selection where no subjective information is available to inform priors on model parameters a priori, specifically in the case of hierarchical models for spatial data, which can have complex dependence structures. We develop an approach using trained priors via fractional Bayes factors where standard Bayesian model selection methods fail to produce valid probabilities under improper reference priors. This enables researchers to concurrently determine whether spatial dependence between observations is apparent and identify important predictors for modeling the response. In addition to model selection with objective priors on model parameters, we also consider the case where the priors on the model space are used to penalize individual predictors a priori based on their costs. We propose a flexible approach that introduces a tuning parameter to cost-penalizing model priors that allows researchers to control the level of cost penalization to meet budget constraints and accommodate increasing sample sizes.
- Bayesian Uncertainty Quantification while Leveraging Multiple Computer Model Runs. Walsh, Stephen A. (Virginia Tech, 2023-06-22). In the face of spatially correlated data, Gaussian process regression is a very common modeling approach. Given observational data, kriging equations will provide the best linear unbiased predictor for the mean at unobserved locations. However, when a computer model provides a complete grid of forecasted values, kriging will not apply. To develop an approach to quantify uncertainty of computer model output in this setting, we leverage information from a collection of computer model runs (e.g., historical forecast and observation pairs for tropical cyclone precipitation totals) through a Bayesian hierarchical framework. This framework allows us to combine information and account for the spatial correlation within and across computer model output. Maximum likelihood estimates and the corresponding Hessian matrices for Gaussian process parameters are input to a Gibbs sampler, which provides posterior distributions for parameters of interest. These samples are used to generate predictions which provide uncertainty quantification for a given computer model run (e.g., tropical cyclone precipitation forecast). We then extend this framework using deep Gaussian processes to allow for nonstationary covariance structure, applied to multiple computer model runs from a cosmology application. We also perform sensitivity analyses to understand which parameter inputs most greatly impact cosmological computer model output.
- Bayesian variable selection for linear mixed models when p is much larger than n with applications in genome wide association studies. Williams, Jacob Robert Michael (Virginia Tech, 2023-06-05). Genome-wide association studies (GWAS) seek to identify single nucleotide polymorphisms (SNPs) causing phenotypic responses in individuals. Commonly, GWAS analyses are done by using single marker association testing (SMA) which investigates the effect of a single SNP at a time and selects a candidate set of SNPs using a strict multiple correction penalty. As SNPs are not independent but instead strongly correlated, SMA methods lead to such high false discovery rates (FDR) that the results are difficult to use by wet lab scientists. To address this, this dissertation proposes three different novel Bayesian methods: BICOSS, BGWAS, and IEB. From a Bayesian modeling point of view, SNP search can be seen as a variable selection problem in linear mixed models (LMMs) where $p$ is much larger than $n$. To deal with the $p \gg n$ issue, our three proposed methods use novel Bayesian approaches based on two steps: a screening step and a model selection step. To control false discoveries, we link the screening and model selection steps through a common probability of a null SNP. To deal with model selection, we propose novel priors that are extensions for LMMs of nonlocal priors, Zellner-g prior, unit information prior, and Zellner-Siow prior. For each method, extensive simulation studies and case studies show that these methods improve the recall of true causal SNPs and, more importantly, drastically decrease FDR. Because our Bayesian methods provide more focused and precise results, they may speed up discovery of important SNPs and significantly contribute to scientific progress in the areas of biology, agricultural productivity, and human health.
- BG2: Bayesian variable selection in generalized linear mixed models with nonlocal priors for non-Gaussian GWAS data. Xu, Shuangshuang; Williams, Jacob; Ferreira, Marco A. R. (2023-09-15). Background: Genome-wide association studies (GWASes) aim to identify single nucleotide polymorphisms (SNPs) associated with a given phenotype. A common approach for the analysis of GWAS is single marker analysis (SMA) based on linear mixed models (LMMs). However, LMM-based SMA usually yields a large number of false discoveries and cannot be directly applied to non-Gaussian phenotypes such as count data. Results: We present a novel Bayesian method to find SNPs associated with non-Gaussian phenotypes. To that end, we use generalized linear mixed models (GLMMs) and, thus, call our method Bayesian GLMMs for GWAS (BG2). To deal with the high dimensionality of GWAS analysis, we propose novel nonlocal priors specifically tailored for GLMMs. In addition, we develop related fast approximate Bayesian computations. BG2 uses a two-step procedure: first, BG2 screens for candidate SNPs; second, BG2 performs model selection that considers all screened candidate SNPs as possible regressors. A simulation study shows favorable performance of BG2 when compared to GLMM-based SMA. We illustrate the usefulness and flexibility of BG2 with three case studies on cocaine dependence (binary data), alcohol consumption (count data), and number of root-like structures in a model plant (count data).
- BGWAS: Bayesian variable selection in linear mixed models with nonlocal priors for genome-wide association studies. Williams, Jacob; Xu, Shuangshuang; Ferreira, Marco A. R. (2023-05-11). Background: Genome-wide association studies (GWAS) seek to identify single nucleotide polymorphisms (SNPs) that cause observed phenotypes. However, with highly correlated SNPs, correlated observations, and the number of SNPs being two orders of magnitude larger than the number of observations, GWAS procedures often suffer from high false positive rates. Results: We propose BGWAS, a novel Bayesian variable selection method based on nonlocal priors for linear mixed models specifically tailored for genome-wide association studies. Our proposed method BGWAS uses a novel nonlocal prior for linear mixed models (LMMs). BGWAS has two steps: screening and model selection. The screening step scans through all the SNPs fitting one LMM for each SNP and then uses Bayesian false discovery control to select a set of candidate SNPs. After that, a model selection step searches through the space of LMMs that may have any number of SNPs from the candidate set. A simulation study shows that, when compared to popular GWAS procedures, BGWAS greatly reduces false positives while maintaining the same ability to detect true positive SNPs. We show the utility and flexibility of BGWAS with two case studies: a case study on salt stress in plants, and a case study on alcohol use disorder. Conclusions: BGWAS maintains and in some cases increases the recall of true SNPs while drastically lowering the number of false positives compared to popular SMA procedures.
- BICOSS: Bayesian iterative conditional stochastic search for GWAS. Williams, Jacob; Ferreira, Marco A. R.; Ji, Tieming (2022-11-12). Background: Single marker analysis (SMA) with linear mixed models for genome wide association studies has uncovered the contribution of genetic variants to many observed phenotypes. However, SMA has weak false discovery control. In addition, when a few variants have large effect sizes, SMA has low statistical power to detect small and medium effect sizes, leading to low recall of true causal single nucleotide polymorphisms (SNPs). Results: We present the Bayesian Iterative Conditional Stochastic Search (BICOSS) method that controls false discovery rate and increases recall of variants with small and medium effect sizes. BICOSS iterates between a screening step and a Bayesian model selection step. A simulation study shows that, when compared to SMA, BICOSS dramatically reduces false discovery rate and allows for smaller effect sizes to be discovered. Finally, two real world applications show the utility and flexibility of BICOSS. Conclusions: When compared to widely used SMA, BICOSS provides higher recall of true SNPs while dramatically reducing false discovery rate.
- Deep Gaussian Process Surrogates for Computer Experiments. Sauer, Annie Elizabeth (Virginia Tech, 2023-04-27). Deep Gaussian processes (DGPs) upgrade ordinary GPs through functional composition, in which intermediate GP layers warp the original inputs, providing flexibility to model non-stationary dynamics. Recent applications in machine learning favor approximate, optimization-based inference for fast predictions, but applications to computer surrogate modeling - with an eye towards downstream tasks like Bayesian optimization and reliability analysis - demand broader uncertainty quantification (UQ). I prioritize UQ through full posterior integration in a Bayesian scheme, hinging on elliptical slice sampling of latent layers. I demonstrate how my DGP's non-stationary flexibility, combined with appropriate UQ, allows for active learning: a virtuous cycle of data acquisition and model updating that departs from traditional space-filling designs and yields more accurate surrogates for fixed simulation effort. I propose new sequential design schemes that rely on optimization of acquisition criteria through evaluation of strategically allocated candidates instead of numerical optimizations, with a motivating application to contour location in an aeronautics simulation. Alternatively, when simulation runs are cheap and readily available, large datasets present a challenge for full DGP posterior integration due to cubic scaling bottlenecks. For this case I introduce the Vecchia approximation, popular for ordinary GPs in spatial data settings. I show that Vecchia-induced sparsity of Cholesky factors allows for linear computational scaling without compromising DGP accuracy or UQ. I vet both active learning and Vecchia-approximated DGPs on numerous illustrative examples and real computer experiments. I provide open-source implementations in the "deepgp" package for R on CRAN.
- Detection of Latent Heteroscedasticity and Group-Based Regression Effects in Linear Models via Bayesian Model Selection. Metzger, Thomas Anthony (Virginia Tech, 2019-08-22). Standard linear modeling approaches make potentially simplistic assumptions regarding the structure of categorical effects that may obfuscate more complex relationships governing data. For example, recent work focused on the two-way unreplicated layout has shown that hidden groupings among the levels of one categorical predictor frequently interact with the ungrouped factor. We extend the notion of a "latent grouping factor" to linear models in general. The proposed work allows researchers to determine whether an apparent grouping of the levels of a categorical predictor reveals a plausible hidden structure given the observed data. Specifically, we offer Bayesian model selection-based approaches to reveal latent group-based heteroscedasticity, regression effects, and/or interactions. Failure to account for such structures can produce misleading conclusions. Since the presence of latent group structures is frequently unknown a priori to the researcher, we use fractional Bayes factor methods and mixture g-priors to overcome lack of prior information. We provide an R package, slgf, that implements our methodology, and demonstrate its usage in practice.
- Long term temporal trends in synoptic-scale weather conditions favoring significant tornado occurrence over the central United States. Elkhouly, Mohamed; Zick, Stephanie E.; Ferreira, Marco A. R. (PLOS, 2023-02-22). We perform a statistical climatological study of the synoptic- to mesoscale weather conditions favoring significant tornado occurrence to empirically investigate the existence of long term temporal trends. To identify environments that favor tornadoes, we apply an empirical orthogonal function (EOF) analysis to temperature, relative humidity, and winds from the Modern-Era Retrospective analysis for Research and Applications Version 2 (MERRA-2) dataset. We consider MERRA-2 data and tornado data from 1980 to 2017 over four adjacent study regions that span the Central, Midwestern, and Southeastern United States. To identify which EOFs are related to significant tornado occurrence, we fit two separate groups of logistic regression models. The first group (LEOF models) estimates the probability of occurrence of a significant tornado day (EF2-EF5) within each region. The second group (IEOF models) classifies the intensity of tornadic days either as strong (EF3-EF5) or weak (EF1-EF2). When compared to approaches using proxies such as convective available potential energy, our EOF approach is advantageous for two main reasons: first, the EOF approach allows for the discovery of important synoptic- to mesoscale variables previously not considered in the tornado science literature; second, proxy-based analyses may not capture important aspects of three-dimensional atmospheric conditions represented by the EOFs. Indeed, one of our main novel findings is the importance of a stratospheric forcing mode on occurrence of significant tornadoes. Other important novel findings are the existence of long-term temporal trends in the stratospheric forcing mode, in a dry line mode, and in an ageostrophic circulation mode related to the jet stream configuration.
A relative risk analysis also indicates that changes in stratospheric forcings are partially or completely offsetting increased tornado risk associated with the dry line mode, except in the eastern Midwest region where tornado risk is increasing.
- Long-term recovery from opioid use disorder: recovery subgroups, transition states and their association with substance use, treatment and quality of life. Craft, William H.; Shin, Hwasoo; Tegge, Allison N.; Keith, Diana R.; Athamneh, Liqa N.; Stein, Jeffrey S.; Ferreira, Marco A. R.; Chilcoat, Howard D.; Le Moigne, Anne; DeVeaugh-Geiss, Angela; Bickel, Warren K. (Wiley, 2022-12). Background and Aims: Limited information exists regarding individual subgroups of recovery from opioid use disorder (OUD) following treatment and how these subgroups may relate to recovery trajectories. We used multi-dimensional criteria to identify OUD recovery subgroups and longitudinal transitions across subgroups. Design, Setting and Participants: In a national longitudinal observational study in the United States, individuals who previously participated in a clinical trial for subcutaneous buprenorphine injections for treatment of OUD were enrolled and followed for an average of 4.2 years after participation in the clinical trial. Measurements: We identified recovery subgroups based on psychosocial outcomes including depression, opioid withdrawal and pain. We compared opioid use, treatment utilization and quality of life among these subgroups. Findings: Three dimensions of the recovery process were identified: depression, opioid withdrawal and pain. Using these three dimensions, participants were classified into four recovery subgroups: high-functioning (minimal depression, mild withdrawal and no/mild pain), pain/physical health (minimal depression, mild withdrawal and moderate pain), depression (moderate depression, mild withdrawal and mild/moderate pain) and low-functioning (moderate/severe withdrawal, moderate depression and moderate/severe pain).
Significant differences among subgroups were observed for DSM-5 criteria (P < 0.001) and remission status (P < 0.001), as well as with opioid use (P < 0.001), treatment utilization (P < 0.001) and quality of life domains (physical health, psychological, environment and social relationships; Ps < 0.001, Cohen's fs >= 0.62). Recovery subgroup assignments were dynamic, with individuals transitioning across subgroups during the observational period. Moreover, the initial recovery subgroup assignment was minimally predictive of long-term outcomes. Conclusions: There appear to be four distinct subgroups among individuals in recovery from OUD. Recovery subgroup assignments are dynamic and predictive of contemporaneous, but not long-term, substance use, substance use treatment utilization or quality of life outcomes.
- Mapping Genetic Variation in Arabidopsis in Response to Plant Growth-Promoting Bacterium Azoarcus olearius DQS-4T. Plucani do Amaral, Fernanda; Wang, Juexin; Williams, Jacob; Tuleski, Thalita R.; Joshi, Trupti; Ferreira, Marco A. R.; Stacey, Gary (MDPI, 2023-01-28). Plant growth-promoting bacteria (PGPB) can enhance plant health by facilitating nutrient uptake, nitrogen fixation, protection from pathogens, stress tolerance and/or boosting plant productivity. The genetic determinants that drive the plant–bacteria association remain understudied. To identify genetic loci highly correlated with traits responsive to PGPB, we performed a genome-wide association study (GWAS) using an Arabidopsis thaliana population treated with Azoarcus olearius DQS-4T. Phenotypically, the 305 Arabidopsis accessions tested responded differently to bacterial treatment by improving, inhibiting, or not affecting root system or shoot traits. GWA mapping analysis identified several predicted loci associated with primary root length or root fresh weight. Two statistical analyses were performed to narrow down potential gene candidates followed by haplotype block analysis, resulting in the identification of 11 loci associated with the responsiveness of Arabidopsis root fresh weight to bacterial inoculation. Our results showed considerable variation in the ability of plants to respond to inoculation by A. olearius DQS-4T while revealing considerable complexity regarding statistically associated loci with the growth traits measured. This investigation is a promising starting point for sustainable breeding strategies for future cropping practices that may employ beneficial microbes and/or modifications of the root microbiome.
- Modeling allele-specific expression at the gene and SNP levels simultaneously by a Bayesian logistic mixed regression model. Xie, Jing; Ji, Tieming; Ferreira, Marco A. R.; Li, Yahan; Patel, Bhaumik N.; Rivera, Rocio M. (2019-10-28). Background: High-throughput sequencing experiments, which can determine allele origins, have been used to assess genome-wide allele-specific expression. Despite the amount of data generated from high-throughput experiments, statistical methods are often too simplistic to understand the complexity of gene expression. Specifically, existing methods do not test allele-specific expression (ASE) of a gene as a whole and variation in ASE within a gene across exons separately and simultaneously. Results: We propose a generalized linear mixed model to close these gaps, incorporating variations due to genes, single nucleotide polymorphisms (SNPs), and biological replicates. To improve reliability of statistical inferences, we assign priors on each effect in the model so that information is shared across genes in the entire genome. We utilize Bayesian model selection to test the hypothesis of ASE for each gene and variations across SNPs within a gene. We apply our method to four tissue types in a bovine study to de novo detect ASE genes in the bovine genome, and uncover intriguing predictions of regulatory ASEs across gene exons and across tissue types. We compared our method to competing approaches through simulation studies that mimicked the real datasets. The R package, BLMRM, that implements our proposed algorithm, is publicly available for download at https://github.com/JingXieMIZZOU/BLMRM. Conclusions: We show that the proposed method exhibits improved control of the false discovery rate and improved power over existing methods when SNP variation and biological variation are present. In addition, our method maintains low computational requirements that allow for whole genome analysis.
- Musashi1 Contribution to Glioblastoma Development via Regulation of a Network of DNA Replication, Cell Cycle and Division Genes. Baroni, Mirella; Yi, Caihong; Choudhary, Saket; Lei, Xiufen; Kosti, Adam; Grieshober, Denise; Velasco, Mitzli; Qiao, Mei; Burns, Suzanne S.; Araujo, Patricia R.; DeLambre, Talia; Son, Mi Young; Plateroti, Michelina; Ferreira, Marco A. R.; Hasty, Paul; Penalva, Luiz O. F. (MDPI, 2021-03-24). RNA-binding proteins (RBPs) function as master regulators of gene expression. Alterations in their levels are often observed in tumors with numerous oncogenic RBPs identified in recent years. Musashi1 (Msi1) is an RBP and stem cell gene that controls the balance between self-renewal and differentiation. High Msi1 levels have been observed in multiple tumors including glioblastoma and are often associated with poor patient outcomes and tumor growth. A comprehensive genomic analysis identified a network of cell cycle/division and DNA replication genes and established these processes as Msi1’s core regulatory functions in glioblastoma. Msi1 controls this gene network via two mechanisms: direct interaction and indirect regulation mediated by the transcription factors E2F2 and E2F8. Moreover, glioblastoma lines with Msi1 knockout (KO) displayed increased sensitivity to cell cycle and DNA replication inhibitors. Our results suggest that a drug combination strategy (Msi1 + cell cycle/DNA replication inhibitors) could be a viable route to treat glioblastoma.
- Objective Bayesian Analysis for Gaussian Hierarchical Models with Intrinsic Conditional Autoregressive Priors. Keefe, Matthew J.; Ferreira, Marco A. R.; Franck, Christopher T. (2019-03). Bayesian hierarchical models are commonly used for modeling spatially correlated areal data. However, choosing appropriate prior distributions for the parameters in these models is necessary and sometimes challenging. In particular, an intrinsic conditional autoregressive (CAR) hierarchical component is often used to account for spatial association. Vague proper prior distributions have frequently been used for this type of model, but this requires the careful selection of suitable hyperparameters. In this paper, we derive several objective priors for the Gaussian hierarchical model with an intrinsic CAR component and discuss their properties. We show that the independence Jeffreys and Jeffreys-rule priors result in improper posterior distributions, while the reference prior results in a proper posterior distribution. We present results from a simulation study that compares frequentist properties of Bayesian procedures that use several competing priors, including the derived reference prior. We demonstrate that using the reference prior results in favorable coverage, interval length, and mean squared error. Finally, we illustrate our methodology with an application to 2012 housing foreclosure rates in the 88 counties of Ohio.
- On a Selection of Advanced Markov Chain Monte Carlo Algorithms for Everyday Use: Weighted Particle Tempering, Practical Reversible Jump, and Extensions. Carzolio, Marcos Arantes (Virginia Tech, 2016-07-08). We are entering an exciting era, rich in the availability of data via sources such as the Internet, satellites, particle colliders, telecommunication networks, computer simulations, and the like. The confluence of increasing computational resources, volumes of data, and variety of statistical procedures has brought us to a modern enlightenment. Within the next century, these tools will combine to reveal unforeseeable insights into the social and natural sciences. Perhaps the largest headwind we now face is our collectively slow-moving imagination. Like a car on an open road, learning is limited by its own rate. Historically, slow information dissemination and the unavailability of experimental resources limited our learning. To that point, any methodological contribution that helps in the conversion of data into knowledge will accelerate us along this open road. Furthermore, if that contribution is accessible to others, the speedup in knowledge discovery scales exponentially. Markov chain Monte Carlo (MCMC) is a broad class of powerful algorithms, typically used for Bayesian inference. Despite their variety and versatility, these algorithms rarely become mainstream workhorses because they can be difficult to implement. The humble goal of this work is to bring to the table a few more highly versatile and robust, yet easily-tuned algorithms. Specifically, we introduce weighted particle tempering, a parallelizable MCMC procedure that is adaptable to large computational resources. We also explore and develop a highly practical implementation of reversible jump, the most generalized form of Metropolis-Hastings.
Finally, we combine these two algorithms into reversible jump weighted particle tempering, and apply it to a model and dataset that was partially collected by the author and his collaborators, halfway around the world. It is our hope that by introducing, developing, and exhibiting these algorithms, we can make a reasonable contribution to the ever-growing body of MCMC research.
- Predictive Model Fusion: A Modular Approach to Big, Unstructured Data. Hoegh, Andrew B. (Virginia Tech, 2016-05-05). Data sets of increasing size and complexity require new approaches for prediction as the sheer volume of data from disparate sources inhibits joint processing and modeling. Rather, modular segmentation is required, in which a set of models process (potentially overlapping) partitions of the data to independently construct predictions. This framework enables individual models to be tailored for specific selective superiorities without concern for existing models, which provides utility in cases of segmented expertise. However, a method for fusing predictions from the collection of models is required as models may be correlated. This work details optimal principles for fusing binary predictions from a collection of models to issue a joint prediction. An efficient algorithm is introduced and compared with off-the-shelf methods for binary prediction. This framework is then implemented in an applied setting to predict instances of civil unrest in Central and South America. Finally, model fusion principles of a spatiotemporal nature are developed to predict civil unrest. A novel multiscale modeling approach is used for efficient, scalable computation for combining a set of spatiotemporal predictions.
- Some Advances in Local Approximate Gaussian Processes. Sun, Furong (Virginia Tech, 2019-10-03). The Gaussian process (GP) has been recognized as an indispensable statistical tool in computer experiments. Due to its computational complexity and storage demand, its application in real-world problems, especially in "big data" settings, is quite limited. Among many strategies to tailor GP to such settings, Gramacy and Apley (2015) proposed local approximate GP (laGP), which constructs approximate predictive equations by building small local designs around the predictive location under a certain criterion. In this dissertation, several methodological extensions based upon laGP are proposed. One methodological contribution is the multilevel global/local modeling, which deploys global hyper-parameter estimates to perform local prediction. The second contribution comes from extending the laGP notion of "locale" to a set of predictive locations, along paths in the input space. These two contributions have been applied to satellite drag emulation, which is illustrated in Chapter 3. Furthermore, the multilevel GP modeling strategy has also been applied to synthesize field data and computer model outputs of solar irradiance across the continental United States, combined with inverse-variance weighting, which is detailed in Chapter 4. Last but not least, in Chapter 5, laGP's performance has been tested on emulating daytime land surface temperatures estimated via satellites, in the setting of irregular grid locations.
- Statistical Monitoring and Modeling for Spatial Processes. Keefe, Matthew James (Virginia Tech, 2017-03-17). Statistical process monitoring and hierarchical Bayesian modeling are two ways to learn more about processes of interest. In this work, we consider two main components: risk-adjusted monitoring and Bayesian hierarchical models for spatial data. Usually, if prior information about a process is known, it is important to incorporate this into the monitoring scheme. For example, when monitoring 30-day mortality rates after surgery, the pre-operative risk of patients based on health characteristics is often an indicator of how likely the surgery is to succeed. In these cases, risk-adjusted monitoring techniques are used. In this work, the practical limitations of the traditional implementation of risk-adjusted monitoring methods are discussed and an improved implementation is proposed. A method to perform spatial risk-adjustment based on exact locations of concurrent observations to account for spatial dependence is also described. Furthermore, the development of objective priors for fully Bayesian hierarchical models for areal data is explored for Gaussian responses. Collectively, these statistical methods serve as analytic tools to better monitor and model spatial processes.