Scholarly Works, Statistics
Research articles, presentations, and other scholarship
Browsing Scholarly Works, Statistics by Content Type "Article - Refereed"
Now showing 1 - 20 of 209
- Accurate human microsatellite genotypes from high-throughput resequencing data using informed error profiles
  Highnam, Gareth; Franck, Christopher T.; Martin, Andy; Stephens, Calvin; Puthige, Ashwin; Mittelman, David (Oxford University Press, 2013-01)
  Repetitive sequences are biologically and clinically important because they can influence traits and disease, but repeats are challenging to analyse using short-read sequencing technology. We present a tool for genotyping microsatellite repeats called RepeatSeq, which uses Bayesian model selection guided by an empirically derived error model that incorporates sequence and read properties. Next, we apply RepeatSeq to high-coverage genomes from the 1000 Genomes Project to evaluate performance and accuracy. The software uses common formats, such as VCF, for compatibility with existing genome analysis pipelines. Source code and binaries are available at http://github.com/adaptivegenome/repeatseq.
- Age influences the thermal suitability of Plasmodium falciparum transmission in the Asian malaria vector Anopheles stephensi
  Miazgowicz, K. L.; Shocket, M. S.; Ryan, Sadie J.; Villena, O. C.; Hall, R. J.; Owen, J.; Adanlawo, T.; Balaji, K.; Johnson, Leah R.; Mordecai, Erin A.; Murdock, Courtney C. (2020-07-29)
  Models predicting disease transmission are vital tools for long-term planning of malaria reduction efforts, particularly for mitigating impacts of climate change. We compared temperature-dependent malaria transmission models when mosquito life-history traits were estimated from a truncated portion of the lifespan (a common practice) versus traits measured across the full lifespan. We conducted an experiment on adult female Anopheles stephensi, the Asian urban malaria mosquito, to generate daily per capita values for mortality, egg production and biting rate at six constant temperatures. Both temperature and age significantly affected trait values. Further, we found quantitative and qualitative differences between temperature-trait relationships estimated from truncated data versus observed lifetime values. Incorporating these temperature-trait relationships into an expression governing the thermal suitability of transmission, relative R0(T), resulted in minor differences in the breadth of suitable temperatures for Plasmodium falciparum transmission between the two models constructed from only An. stephensi trait data. However, we found a substantial increase in thermal niche breadth compared with a previously published model consisting of trait data from multiple Anopheles mosquito species. Overall, this work highlights the importance of considering how mosquito trait values vary with mosquito age and mosquito species when generating temperature-based suitability predictions of transmission.
- Age-dependent ventilator-induced lung injury: Mathematical modeling, experimental data, and statistical analysis
  Hay, Quintessa; Grubb, Christopher; Minucci, Sarah; Valentine, Michael S.; Van Mullekom, Jennifer H.; Heise, Rebecca L.; Reynolds, Angela M. (PLOS, 2024-02-22)
  A variety of pulmonary insults can prompt the need for life-saving mechanical ventilation; however, misuse, prolonged use, or an excessive inflammatory response can result in ventilator-induced lung injury. Past research has observed an increased instance of respiratory distress in older patients and differences in the inflammatory response. To address this, we performed high pressure ventilation on young (2-3 months) and old (20-25 months) mice for 2 hours and collected data for macrophage phenotypes and lung tissue integrity. Large differences in macrophage activation at baseline and airspace enlargement after ventilation were observed in the old mice. The experimental data was used to determine plausible trajectories for a mathematical model of the inflammatory response to lung injury which includes variables for the innate inflammatory cells and mediators, epithelial cells in varying states, and repair mediators. Classification methods were used to identify influential parameters separating the parameter sets associated with the young or old data and separating the response to ventilation, which was measured by changes in the epithelial state variables. Classification methods ranked parameters involved in repair and damage to the epithelial cells and those associated with classically activated macrophages to be influential. Sensitivity results were used to determine candidate in-silico interventions, and these interventions were most impactful for transients associated with the old data, specifically those with poorer lung health prior to ventilation.
  Model results identified dynamics involved in M1 macrophages as a focus for further research, potentially driving the age-dependent differences in all macrophage phenotypes. The model also supported the pro-inflammatory response as a potential indicator of age-dependent differences in response to ventilation. This mathematical model can serve as a baseline model for incorporating other pulmonary injuries.
- Age-related variations in the methylome associated with gene expression in human monocytes and T cells
  Reynolds, Lindsay M.; Taylor, Jackson R.; Ding, Jingzhong; Lohman, Kurt; Johnson, Craig; Siscovick, David; Burke, Gregory L.; Post, Wendy; Shea, Steven; Jacobs, David R. Jr.; Stunnenberg, Hendrik G.; Kritchevsky, Stephen B.; Hoeschele, Ina; McCall, Charles E.; Herrington, David M.; Tracy, Russell P.; Liu, Yongmei (Springer Nature, 2014-11)
  Age-related variations in DNA methylation have been reported; however, the functional relevance of these differentially methylated sites (age-dMS) is unclear. Here we report potentially functional age-dMS, defined as age- and cis-gene expression-associated methylation sites (age-eMS), identified by integrating genome-wide CpG methylation and gene expression profiles collected ex vivo from circulating T cells (227 CD4+ samples) and monocytes (1,264 CD14+ samples, age range: 55-94 years). None of the age-eMS detected in 227 T-cell samples are detectable in 1,264 monocyte samples, in contrast to the majority of age-dMS detected in T cells that replicated in monocytes. Age-eMS tend to be hypomethylated with older age, located in predicted enhancers and preferentially linked to expression of antigen processing and presentation genes. These results identify and characterize potentially functional age-related methylation in human T cells and monocytes, and provide novel insights into the role age-dMS may have in the aging process.
- AHRQ series on complex intervention systematic reviews-paper 4: selecting analytic approaches
  Viswanathan, Meera; McPheeters, Melissa L.; Hassan-Murad, M.; Butler, Mary E.; Devin, Emily E. (Beth); Dyson, Michele P.; Guise, Jeanne-Marie; Kahwati, Leila C.; Miles, Jeremy N. V.; Morton, Sally C. (2017-10)
  Background: Systematic reviews of complex interventions can vary widely in purpose, data availability and heterogeneity, and stakeholder expectations. Rationale: This article addresses the uncertainty that systematic reviewers face in selecting methods for reviews of complex interventions. Specifically, it lays out parameters for systematic reviewers to consider when selecting analytic approaches that best answer the questions at hand and suggests analytic techniques that may be appropriate in different circumstances. Discussion: Systematic reviews of complex interventions comprising multiple questions may use multiple analytic approaches. Parameters to consider when choosing analytic methods for complex interventions include nature and timing of the decision (clinical practice guideline, policy, or other); purpose of the review; extent of existing evidence; logistic factors such as the timeline, process, and resources for deciding the scope of the review; and value of information to be obtained from choosing specific systematic review methods. Reviewers may elect to revise their analytic approach based on new or changing considerations during the course of the review but should guard against bias through transparency of reporting. (C) 2017 The Authors. Published by Elsevier Inc.
- AHRQ series on complex intervention systematic reviews-paper 5: advanced analytic methods
  Pigott, Terri; Noyes, Jane; Umscheid, Craig A.; Myers, Evan; Morton, Sally C.; Fu, Rongwei; Sanders-Schmidler, Gillian D.; Devine, Beth; Murad, M. Hassan; Kelly, Michael P.; Fonnesbeck, Christopher; Kahwati, Leila C.; Beretvas, S. Natasha (2017-10)
  Background and Objective: Advanced analytic methods for synthesizing evidence about complex interventions continue to be developed. In this paper, we emphasize that the specific research question posed in the review should be used as a guide for choosing the appropriate analytic method. Methods: We present advanced analytic approaches that address four common questions that guide reviews of complex interventions: (1) How effective is the intervention? (2) For whom does the intervention work and in what contexts? (3) What happens when the intervention is implemented? and (4) What decisions are possible given the results of the synthesis? Conclusion: The analytic approaches presented in this paper are particularly useful when each primary study differs in components, mechanisms of action, context, implementation, timing, and many other domains. (C) 2017 The Author(s). Published by Elsevier Inc.
- Alterations in the molecular composition of COVID-19 patient urine, detected using Raman spectroscopic/computational analysis
  Robertson, John L.; Senger, Ryan S.; Talty, Janine; Du, Pang; Sayed-Issa, Amr; Avellar, Maggie L.; Ngo, Lacy T.; Gomez de la Espriella, Mariana; Fazili, Tasaduq N.; Jackson-Akers, Jasmine Y.; Guruli, Georgi; Orlando, Giuseppe (PLOS, 2022-07-01)
  We developed and tested a method to detect COVID-19 disease, using urine specimens. The technology is based on Raman spectroscopy and computational analysis. It does not detect SARS-CoV-2 virus or viral components, but rather a urine ‘molecular fingerprint’, representing systemic metabolic, inflammatory, and immunologic reactions to infection. We analyzed voided urine specimens from 46 symptomatic COVID-19 patients with positive real-time polymerase chain reaction (RT-PCR) tests for infection or household contact with test-positive patients. We compared their urine Raman spectra with urine Raman spectra from healthy individuals (n = 185), peritoneal dialysis patients (n = 20), and patients with active bladder cancer (n = 17), collected between 2016 and 2018 (i.e., pre-COVID-19). We also compared all urine Raman spectra with urine specimens collected from healthy, fully vaccinated volunteers (n = 19) from July to September 2021. Disease severity (primarily respiratory) ranged among mild (n = 25), moderate (n = 14), and severe (n = 7). Seventy percent of patients sought evaluation within 14 days of onset. One severely affected patient was hospitalized, the remainder being managed with home/ambulatory care. Twenty patients had clinical pathology profiling. Seven of 20 patients had mildly elevated serum creatinine values (>0.9 mg/dl; range 0.9–1.34 mg/dl) and 6/7 of these patients also had estimated glomerular filtration rates (eGFR) <90 mL/min/1.73 m² (range 59–84 mL/min/1.73 m²). We could not determine if any of these patients had antecedent clinical pathology abnormalities.
  Our technology (Raman Chemometric Urinalysis, Rametrix®) had an overall prediction accuracy of 97.6% for detecting complex, multimolecular fingerprints in urine associated with COVID-19 disease. The sensitivity of this model for detecting COVID-19 was 90.9%. The specificity was 98.8%, the positive predictive value was 93.0%, and the negative predictive value was 98.4%. In assessing severity, the method proved accurate in identifying symptoms as mild, moderate, or severe (random chance = 33%) based on the urine multimolecular fingerprint. Finally, a fingerprint of ‘Long COVID-19’ symptoms (defined as lasting longer than 30 days) was located in urine. Our methods were able to locate the presence of this fingerprint with 70.0% sensitivity and 98.7% specificity in leave-one-out cross-validation analysis. Further validation testing will include sampling more patients, examining correlations of disease severity and/or duration, and employing metabolomic analysis (Gas Chromatography–Mass Spectrometry [GC-MS], High Performance Liquid Chromatography [HPLC]) to identify individual components contributing to COVID-19 molecular fingerprints.
- Alternative approaches for creating a wealth index: the case of Mozambique
  Xie, Kexin; Marathe, Achla; Deng, Xinwei; Ruiz-Castillo, Paula; Imputiua, Saimado; Elobolobo, Eldo; Mutepa, Victor; Sale, Mussa; Nicolas, Patricia; Montana, Julia; Jamisse, Edgar; Munguambe, Humberto; Materrula, Felisbela; Casellas, Aina; Rabinovich, Regina; Saute, Francisco; Chaccour, Carlos J.; Sacoor, Charfudin; Rist, Cassidy (BMJ, 2023-08)
  Introduction: The wealth index is widely used as a proxy for a household's socioeconomic position (SEP) and living standard. This work constructs a wealth index for the Mopeia district in Mozambique using data collected in 2021 under the BOHEMIA (Broad One Health Endectocide-based Malaria Intervention in Africa) project. Methods: We evaluate the performance of three alternative approaches against the Demographic and Health Survey (DHS) method-based wealth index: feature selection principal components analysis (PCA), sparse PCA and robust PCA. The internal coherence between the four wealth indices is investigated through statistical testing. Validation and an evaluation of the stability of the wealth index are performed with additional household income data from the BOHEMIA Health Economics Survey and the 2018 Malaria Indicator Survey data in Mozambique. Results: The Spearman's rank correlation between wealth index ventiles from the four methods is over 0.98, indicating a high consistency in results across methods. Wealth rankings and households' income show a strong concordance, with an area under the curve value of ∼0.7 in the receiver operating characteristic analysis. The agreement between the alternative wealth indices and the DHS wealth index demonstrates the stability in rankings from the alternative methods. Conclusions: This study creates a wealth index for Mopeia, Mozambique, and shows that the DHS method-based wealth index is an appropriate proxy for the SEP in low-income regions.
  However, this research recommends feature selection PCA over the DHS method since it uses fewer asset indicators and constructs a high-quality wealth index.
- Antibiotic exposure is associated with decreased risk of psychiatric disorders
  Kerman, Ilan A.; Glover, Matthew E.; Lin, Yezhe; West, Jennifer L.; Hanlon, Alexandra L.; Kablinger, Anita S.; Clinton, Sarah M. (Frontiers, 2024-01-08)
  Objective: This study sought to investigate the relationship between antibiotic exposure and subsequent risk of psychiatric disorders. Methods: This retrospective cohort study used a national database of 69 million patients from 54 large healthcare organizations. We identified a cohort of 20,214 (42.5% male; 57.9 ± 15.1 years old [mean ± SD]) adults without prior neuropsychiatric diagnoses who received antibiotics during hospitalization. Matched controls included 41,555 (39.6% male; 57.3 ± 15.5 years old) hospitalized adults without antibiotic exposure. The two cohorts were balanced for potential confounders, including demographics and variables with potential to affect the microbiome, mental health, medical comorbidity, and overall health status. Data were stratified by age and by sex, and outcome measures were assessed starting 6 months after hospital discharge. Results: Antibiotic exposure was consistently associated with a significant decrease in the risk of novel mood disorders and anxiety and stressor-related disorders in men (mood (OR 0.84, 95% CI 0.77, 0.91), anxiety (OR 0.88, 95% CI 0.82, 0.95)), women (mood (OR 0.94, 95% CI 0.89, 1.00), anxiety (OR 0.93, 95% CI 0.88, 0.98)), those who are 26–49 years old (mood (OR 0.87, 95% CI 0.80, 0.94), anxiety (OR 0.90, 95% CI 0.84, 0.97)), and in those ≥50 years old (mood (OR 0.91, 95% CI 0.86, 0.97), anxiety (OR 0.92, 95% CI 0.87, 0.97)). Risk of intentional harm and suicidality was decreased in men (OR 0.73, 95% CI 0.55, 0.98) and in those ≥50 years old (OR 0.67, 95% CI 0.49, 0.92). Risk of psychotic disorders was also decreased in subjects ≥50 years old (OR 0.83, 95% CI 0.69, 0.99).
  Conclusion: Use of antibiotics in the inpatient setting is associated with protective effects against multiple psychiatric outcomes in an age- and sex-dependent manner.
- Applications of Different Weighting Schemes to Improve Pathway-Based Analysis
  Ha, Sook S.; Kim, Inyoung; Wang, Yue; Xuan, Jianhua (Hindawi, 2011-05-22)
  Conventionally, pathway-based analysis assumes that genes in a pathway equally contribute to a biological function, thus assigning uniform weight to genes. However, this assumption has been proved incorrect, and applying uniform weight in the pathway analysis may not be an appropriate approach for tasks such as molecular classification of diseases, as genes in a functional group may have different predicting power. Hence, we propose to assign different weights to genes in pathway-based analysis and devise four weighting schemes. We applied them in two existing pathway analysis methods using both real and simulated gene expression data for pathways. Among all schemes, the random weighting scheme, which generates random weights and selects optimal weights minimizing an objective function, performs best in terms of P value or error rate reduction. Weighting changes pathway scoring and brings up some new significant pathways, leading to the detection of disease-related genes that are missed under uniform weight.
- Arm-specific dynamics of chromosome evolution in malaria mosquitoes
  Sharakhova, Maria V.; Xia, Ai; Leman, Scotland C.; Sharakhov, Igor V. (BioMed Central, 2011-04-07)
  Background: The malaria mosquito species of subgenus Cellia have rich inversion polymorphisms that correlate with environmental variables. Polymorphic inversions tend to cluster on the chromosomal arms 2R and 2L but not on X, 3R and 3L in Anopheles gambiae and homologous arms in other species. However, it is unknown whether polymorphic inversions on homologous chromosomal arms of distantly related species from subgenus Cellia nonrandomly share similar sets of genes. It is also unclear if the evolutionary breakage of inversion-poor chromosomal arms is under constraints. Results: To gain a better understanding of the arm-specific differences in the rates of genome rearrangements, we compared gene orders and established syntenic relationships among Anopheles gambiae, Anopheles funestus, and Anopheles stephensi. We provided evidence that polymorphic inversions on the 2R arms in these three species nonrandomly captured similar sets of genes. This nonrandom distribution of genes was not only a result of preservation of ancestral gene order but also an outcome of extensive reshuffling of gene orders that created new combinations of homologous genes within independently originated polymorphic inversions. The statistical analysis of distribution of conserved gene orders demonstrated that the autosomal arms differ in their tolerance to generating evolutionary breakpoints. The fastest evolving 2R autosomal arm was enriched with gene blocks conserved between only a pair of species. In contrast, all identified syntenic blocks were preserved on the slowly evolving 3R arm of An. gambiae and on the homologous arms of An. funestus and An. stephensi.
  Conclusions: Our results suggest that natural selection favors specific gene combinations within polymorphic inversions when distant species are exposed to similar environmental pressures. This knowledge could be useful for the discovery of genes responsible for an association of inversion polymorphisms with phenotypic variations in multiple species. Our data support the chromosomal arm specificity in rates of gene order disruption during mosquito evolution. We conclude that the distribution of breakpoint regions is evolutionarily conserved on slowly evolving arms and tends to be lineage-specific on rapidly evolving arms.
- Assessing Ecosystem State Space Models: Identifiability and Estimation
  Smith, John W.; Johnson, Leah R.; Thomas, R. Quinn (Springer, 2023-03)
  Hierarchical probability models are being used more often than non-hierarchical deterministic process models in environmental prediction and forecasting, and Bayesian approaches to fitting such models are becoming increasingly popular. In particular, models describing ecosystem dynamics with multiple states that are autoregressive at each step in time can be treated as statistical state space models (SSMs). In this paper, we examine this subset of ecosystem models, embed a process-based ecosystem model into an SSM, and give closed form Gibbs sampling updates for latent states and process precision parameters when process and observation errors are normally distributed. Here, we use simulated data from an example model (DALECev) and study the effects of changing the temporal resolution of observations on the states (observation data gaps), the temporal resolution of the state process (model time step), and the level of aggregation of observations on fluxes (measurements of transfer rates on the state process). We show that parameter estimates become unreliable as temporal gaps between observed state data increase. To improve parameter estimates, we introduce a method of tuning the time resolution of the latent states while still using higher-frequency driver information and show that this helps to improve estimates. Further, we show that data cloning is a suitable method for assessing parameter identifiability in this class of models. Overall, our study helps inform the application of state space models to ecological forecasting applications where (1) data are not available for all states and transfers at the operational time step for the ecosystem model and (2) process uncertainty estimation is desired.
- Assessing the Early Aberration Reporting System's Ability to Locally Detect the 2009 Influenza Pandemic
  Hagen, K. S.; Fricker, Ronald D. Jr.; Hanni, K. D.; Barnes, S.; Michie, K. (2011-05)
  The Early Aberration Reporting System (EARS) is used by some local health departments (LHDs) to monitor emergency room and clinic data for disease outbreaks. Using actual chief complaint data from local public health clinics, we evaluate how EARS (both the baseline system distributed by the CDC and two variants implemented by one LHD) performs at locally detecting the 2009 influenza A H1N1 pandemic. We also compare the EARS methods to a CUSUM-based method. We find that the baseline EARS system performed poorly in comparison to one of the LHD variants and the CUSUM-based method. These results suggest that changes in how syndromes are defined can substantially improve EARS performance. The results also show that incorporating algorithms that use more historical data will improve EARS performance for routine surveillance by local health departments.
- Assessing Urban Landscape Variables’ Contributions to Microclimates
  Parece, Tammy E.; Li, Jie; Campbell, James B. Jr.; Carroll, David F. (Hindawi, 2015-12-24)
  The well-known urban heat island (UHI) effect recognizes prevailing patterns of warmer urban temperatures relative to surrounding rural landscapes. Although UHIs are often visualized as single features, internal variations within urban landscapes create distinctive microclimates. Evaluating intraurban microclimate variability presents an opportunity to assess spatial dimensions of urban environments and identify locations that heat or cool faster than other locales. Our study employs mobile weather units and fixed weather stations to collect air temperatures across Roanoke, Virginia, USA, on selected dates over a two-year interval. Using this temperature data, together with six landscape variables, we interpolated (using Kriging and Random Forest) air temperatures across the city for each collection period. Our results estimated temperatures with small mean square errors (ranging from 0.03 to 0.14); landscape metrics explained between 60 and 91% of temperature variations (higher when the previous day’s average temperatures were included as a variable). For all days, similar spatial patterns appeared for cooler and warmer areas in mornings, with distinctive patterns as landscapes warmed during the day and over successive days. Our results revealed that the most potent landscape variables vary according to season and time of day. Our analysis contributes new dimensions and new levels of spatial and temporal detail to urban microclimate research.
- Assessing What Distinguishes Highly Cited from Less-Cited Papers Published in Interfaces
  Fricker, Ronald D. Jr.; Hamrick, TA; Brown, G. G. (2010)
- Association testing for binary trees-A Markov branching process approach
  Wu, Xiaowei; Zhu, Hongxiao (Wiley, 2022-03-09)
  We propose a new approach to test associations between binary trees and covariates. In this approach, binary-tree structured data are treated as sample paths of binary fission Markov branching processes (bMBP). We propose a generalized linear regression model and develop inference procedures for association testing, including variable selection and estimation of covariate effects. Simulation studies show that these procedures are able to accurately identify covariates that are associated with the binary tree structure by impacting the rate parameter of the bMBP. The problem of association testing on binary trees is motivated by modeling hierarchical clustering dendrograms of pixel intensities in biomedical images. By using semi-synthetic data generated from a real brain-tumor image, our simulation studies show that the bMBP model is able to capture the characteristics of dendrogram trees in brain-tumor images. Our final analysis of the glioblastoma multiforme brain-tumor data from The Cancer Imaging Archive identified multiple clinical and genetic variables that are potentially associated with brain-tumor heterogeneity.
- Attrition Models of the Ardennes Campaign
  Fricker, Ronald D. Jr. (1998)
- A Bayesian Analysis of Copy Number Variations in Array Comparative Genomic Hybridization Data
  Wu, Xiaowei; Zhu, Hongxiao (OMICS International, 2015-09-25)
  Array Comparative Genomic Hybridization (CGH) has been widely used for detecting genomic copy number variations (CNVs). The central goal of array CGH data analysis is to accurately detect homogeneous regions of log intensity ratios which represent relative changes in DNA copy number. Various methods have been proposed in recent years. Most methods, however, do not consider correlations of neighboring probe measurements, and are usually designed for analysis at single sample level rather than detecting common or recurrent CNVs among multiple samples. We propose a Bayesian segment-based approach for efficient analysis of array CGH data. The proposed method is based on simple assumptions but is general enough to accommodate various spatial correlations among probe measurements. It also allows for multiple samples with recurrent CNVs, and is therefore able to borrow strength across samples. In contrast to another probe-based approach developed in the same Bayesian framework, the segment-based approach parameterizes the mean log intensity ratios in a more appropriate way, which leads to a posterior sampling scheme based on reversible-jump Markov chain Monte Carlo. We perform a simulation study to compare these two approaches and the commonly-used circular binary segmentation method and Bayesian hidden Markov model method. The segment-based approach achieves better estimation accuracy and higher computational efficiency compared to the probe-based approach, and also provides improved results compared to the other two methods, especially for data with relatively low signal to noise ratio and high correlation. The segment-based approach is further applied to the Coriell cell lines data and Pancreatic Adenocarcinoma data.
- Bayesian Dynamical Systems Modelling in the Social Sciences
  Ranganathan, Shyam; Spaiser, Viktoria; Mann, Richard P.; Sumpter, David J. T. (PLOS, 2014-01-20)
  Data arising from social systems is often highly complex, involving non-linear relationships between the macro-level variables that characterize these systems. We present a method for analyzing this type of longitudinal or panel data using differential equations. We identify the best non-linear functions that capture interactions between variables, employing Bayes factor to decide how many interaction terms should be included in the model. This method punishes overly complicated models and identifies models with the most explanatory power. We illustrate our approach on the classic example of relating democracy and economic growth, identifying non-linear relationships between these two variables. We show how multiple variables and variable lags can be accounted for and provide a toolbox in R to implement our approach.
- Bayesian Graphical Models for Multivariate Functional Data
  Zhu, Hongxiao; Strawn, Nate; Dunson, David B. (2016-11-28)
  Graphical models express conditional independence relationships among variables. Although methods for vector-valued data are well established, functional data graphical models remain underdeveloped. By functional data, we refer to data that are realizations of random functions varying over a continuum (e.g., images, signals). We introduce a notion of conditional independence between random functions, and construct a framework for Bayesian inference of undirected, decomposable graphs in the multivariate functional data context. This framework is based on extending Markov distributions and hyper Markov laws from random variables to random processes, providing a principled alternative to naive application of multivariate methods to discretized functional data. Markov properties facilitate the composition of likelihoods and priors according to the decomposition of a graph. Our focus is on Gaussian process graphical models using orthogonal basis expansions. We propose a hyper-inverse-Wishart-process prior for the covariance kernels of the infinite coefficient sequences of the basis expansion, and establish its existence and uniqueness. We also prove the strong hyper Markov property and the conjugacy of this prior under a finite rank condition of the prior kernel parameter. Stochastic search Markov chain Monte Carlo algorithms are developed for posterior inference, assessed through simulations, and applied to a study of brain activity and alcoholism.