Scholarly Works, Statistics

Permanent URI for this collection

https://hdl.handle.net/10919/24334

Research articles, presentations, and other scholarship

Synthesizing data products, mathematical models, and observational measurements for lake temperature forecasting
Holthuijzen, Maike; Gramacy, Robert; Carey, Cayelan; Higdon, Dave; Thomas, R. Quinn (2025-06-01)
We present a novel forecasting framework for lake water temperature, which is crucial for managing lake ecosystems and drinking water resources. The General Lake Model (GLM) has been previously used for this purpose, but, similar to many process-based simulation models, it: requires a large number of inputs, many of which are stochastic; presents challenges for uncertainty quantification (UQ); and can exhibit model bias. To address these issues, we propose a Gaussian process (GP) surrogate-based forecasting approach that efficiently handles large, high-dimensional data and accounts for input-dependent variability and systematic GLM bias. We validate the proposed approach and compare it with other forecasting methods, including a climatological model and raw GLM simulations. Our results demonstrate that our bias-corrected GP surrogate (GPBC) can outperform competing approaches in terms of forecast accuracy and UQ up to two weeks into the future.
Deep-Neural-Network-Aided Genetic Association Testing in Samples with Related Individuals
Wu, Xiaowei (MDPI, 2026-03-04)
Genome-wide association studies (GWAS) have successfully identified thousands of genetic loci associated with complex traits and diseases, providing critical insights into genetic architecture, biological pathways, and disease mechanisms. With the advance of machine learning, the analytical scope of GWAS can be substantially expanded by enabling joint modeling, nonlinear effects, and integrative analysis. However, deep learning approaches remain underutilized in augmenting traditional GWAS frameworks, particularly in the presence of cryptic relatedness among sampled individuals. In this paper, we propose a deep neural network (DNN)-based machine learning method to assist genetic association testing in samples with related individuals. By approximating the phenotype–genotype relationships in classical association tests and combining approximations across multiple tests, the proposed method aims to improve predictive performance in the identification of associated variants. Simulation studies demonstrate that our approach effectively complements conventional statistical methods and generally achieves increased power for detecting genetic associations. We further apply the method to data from the Framingham Heart Study, illustrating how DNN-based machine learning can facilitate the identification of genome-wide SNPs associated with average systolic blood pressure.
Robust Estimation and Inference for Semiparametric and Nonparametric Regression Models
Mahmoud, Hamdy F. F.; Ali, Ahmed AbdelWahab A.; Mohamed, Wael Mahmoud A. (MDPI, 2026-03-11)
Parametric regression methods are efficient when correctly specified but are sensitive to model misspecification and outliers. Nonparametric regression offers greater flexibility at the cost of reduced interpretability and susceptibility to the curse of dimensionality. Semiparametric models provide a compromise between these approaches by combining structural interpretability with functional flexibility. A key limitation of many classical semiparametric and nonparametric methods, however, is their lack of robustness to heavy-tailed errors and contaminated data. In this paper, we develop robust kernel, spline, and single-index regression estimators based on robust loss functions. To facilitate inference, we propose bootstrap-based procedures that remain valid in settings where classical assumptions may be violated. Through extensive simulation studies under normal, heavy-tailed, and contaminated error distributions, we demonstrate that the proposed robust methods achieve comparable performance to classical approaches in clean settings while providing substantial gains in stability and inferential reliability under contamination. Unlike existing works that study these robust estimators in isolation, the proposed approach provides a unified framework that integrates robust kernel regression, robust spline regression, and robust single-index modeling with a coherent bootstrap-based inference procedure. Application to Boston housing data further illustrates the practical usefulness of the proposed methodology.
Homework Software Access Code Replacements and Strategies: A Roundtable Discussion
Walz, Anita R.; Russell, J. Morgan; Hart, Heath David; Lord, James K.; Grohs, Jacob R. (2025-02-13)
Homework software systems save time for instructors, particularly in large-enrollment courses. However, student-paid access codes have limited functionality and are expensive: between $50 and $150 per course per semester for the 30% of courses that require them. Functionality affects learning, and costs disproportionately affect historically underserved students and student academic performance. Virginia Tech’s Open Education Initiative is working to establish a variety of options for instructors. Join this Roundtable for a discussion with instructors from STEM and non-STEM disciplines who use university-approved, no-fee-to-students alternatives, including WeBWorK, PressbooksResults, peer-reviewed test banks for LMS import, and a problem set environment for engineering. Downloadable files include slides and the submitted proposal.
Impact of obesity on the perinatal vaginal environment and bacterial microbiome: effects on birth outcomes
Ingram, Kelly; Eko, Embelle Ngalame; Nunziato, Jaclyn; Ahrens, Monica; Howell, Brittany (Microbiology Society, 2024-08)
Introduction. Lactobacillus species predominate in the human vagina and are associated with positive vaginal health, including an acidic pH (<4.5). The prevalence of vaginal Lactobacilli increases with increased oestrogen due to increased glycogen production within the vagina. Lactobacilli produce lactic acid, thereby lowering vaginal pH, preventing growth of other bacteria, and lowering microbial diversity. Lower placental oestrogen levels in obese pregnant women could dampen the mechanism to initiate this process, which may be associated with vaginal dysbiosis and unfavourable pregnancy outcomes. Hypothesis. We hypothesize that oestrogen and glycogen levels will be lower, vaginal pH will be higher, and vaginal microbiome diversity will be greater during pregnancy in obese and overweight women compared to healthy weight women. Aim. Pregnancy complications (e.g. preterm birth) are more common in overweight and obese women. If vaginal dysbiosis plays a role, and quantifiable predictors of this increased risk can be determined, these measures could be used to prospectively identify women at risk for pregnancy complications early in pregnancy. Methodology. Vaginal samples were collected at 10–14, 18–24, 26–30, and 34–37 weeks gestation and at delivery from 67 pregnant participants (23 healthy weight, 22 overweight, 22 obese). A blood sample to quantify serum oestrogen was collected at 10–14 weeks. Vaginal samples were collected to test vaginal pH using pH paper, glycogen abundance using fluorometry, and the vaginal microbiome using 16S rRNA amplicon sequencing. Results. Vaginal pH was higher in obese participants compared to healthy weight participants (P<0.001). Vaginal glycogen levels increased over time in obese participants (P=0.033). The vaginal bacterial alpha diversity was higher in obese participants compared to healthy weight participants (P=0.033). The relative abundances of Peptoniphilus and Anaerococcus were increased in overweight and obese participants, as well as in complicated pregnancies, at 10–14 weeks gestation. Conclusion. The relative abundance of specific vaginal bacteria, like Peptoniphilus and Anaerococcus, in early pregnancy could predict pregnancy outcomes. Our goal is to use the information gathered in this pilot study to further determine the feasibility of assessing the vaginal environment during pregnancy to identify women at risk for negative pregnancy and birth outcomes in the context of a larger study.
Temperature impacts the environmental suitability for malaria transmission by Anopheles gambiae and Anopheles stephensi
Villena, Oswaldo C.; Ryan, Sadie J.; Murdock, Courtney C.; Johnson, Leah R. (Wiley, 2022-08)
Extrinsic environmental factors influence the spatiotemporal dynamics of many organisms, including insects that transmit the pathogens responsible for vector-borne diseases (VBDs). Temperature is an especially important constraint on the fitness of a wide variety of ectothermic insects. A mechanistic understanding of how temperature impacts traits of ectotherms, and thus the distribution of ectotherms and vector-borne infections, is key to predicting the consequences of climate change on transmission of VBDs like malaria. However, the response of transmission to temperature and other drivers is complex, as thermal traits of ectotherms are typically nonlinear, and they interact to determine transmission constraints. In this study, we assess and compare the effect of temperature on the transmission of two malaria parasites, Plasmodium falciparum and Plasmodium vivax, by two malaria vector species, Anopheles gambiae and Anopheles stephensi. We model the nonlinear responses of temperature-dependent mosquito and parasite traits (mosquito development rate, bite rate, fecundity, proportion of eggs surviving to adulthood, vector competence, mortality rate, and parasite development rate) and incorporate these traits into a suitability metric based on a model for the basic reproductive number across temperatures. Our model predicts that the optimum temperature for transmission suitability is similar for the four mosquito–parasite combinations assessed in this study, but may differ at the thermal limits. More specifically, we found significant differences in the upper thermal limit between parasites spread by the same mosquito (A. stephensi) and between mosquitoes carrying P. falciparum. In contrast, at the lower thermal limit the significant differences were primarily between the mosquito species that both carried the same pathogen (e.g., A. stephensi and A. gambiae both with P. falciparum). Using prevalence data, we show that the transmission suitability metric calculated from our mechanistic model is consistent with observed P. falciparum prevalence in Africa and Asia but is equivocal for P. vivax prevalence in Asia, and inconsistent with P. vivax prevalence in Africa. We mapped risk to illustrate the number of months for which various areas in Africa and Asia are predicted to be suitable for malaria transmission based on this suitability metric. This mapping provides spatially explicit predictions for suitability and transmission risk.
Humidity - The overlooked variable in the thermal biology of mosquito-borne disease
Brown, Joel J.; Pascual, Mercedes; Wimberly, Michael C.; Johnson, Leah R.; Murdock, Courtney C. (Wiley, 2023-07)
Vector-borne diseases cause significant financial and human loss, with billions of dollars spent on control. Arthropod vectors experience a complex suite of environmental factors that affect fitness, population growth and species interactions across multiple spatial and temporal scales. Temperature and water availability are two of the most important abiotic variables influencing their distributions and abundances. While extensive research on temperature exists, the influence of humidity on vector and pathogen parameters affecting disease dynamics is less understood. Humidity is often underemphasized, and when considered, is often treated as independent of temperature even though desiccation likely contributes to declines in trait performance at warmer temperatures. This Perspective explores how humidity shapes the thermal performance of mosquito-borne pathogen transmission. We summarize what is known about its effects and propose a conceptual model for how temperature and humidity interact to shape the range of temperatures across which mosquitoes persist and achieve high transmission potential. We discuss how failing to account for these interactions hinders efforts to forecast transmission dynamics and respond to epidemics of mosquito-borne infections. We outline future research areas that will ground the effects of humidity on the thermal biology of pathogen transmission in a theoretical and empirical framework to improve spatial and temporal prediction of vector-borne pathogen transmission.
Skin bacterial community differences among three species of co-occurring Ranid frogs
Gajewski, Zachary; Johnson, Leah R.; Medina, Daniel; Crainer, William W.; Nagy, Christopher M.; Belden, Lisa K. (PeerJ, 2023-07-14)
Skin microbial communities are an essential part of host health and can play a role in mitigating disease. Host and environmental factors can shape and alter these microbial communities and, therefore, we need to understand to what extent these factors influence microbial communities and how this can impact disease dynamics. Microbial communities have been studied in amphibian systems due to skin microbial communities providing some resistance to the amphibian chytrid fungus, Batrachochytrium dendrobatidis. However, we are only starting to understand how host and environmental factors shape these communities for amphibians. In this study, we examined whether amphibian skin bacterial communities differ among host species, host infection status, host developmental stage, and host habitat. We collected skin swabs from tadpoles and adults of three Ranid frog species (Lithobates spp.) at the Mianus River Gorge Preserve in Bedford, New York, USA, and used 16S rRNA gene amplicon sequencing to determine bacterial community composition. Our analysis suggests amphibian skin bacterial communities change across host developmental stages, as has been documented previously. Additionally, we found that skin bacterial communities differed among Ranid species, with skin communities on the host species captured in streams or bogs differing from the communities of the species captured on land. Thus, habitat use of different species may drive differences in host-associated microbial communities for closely-related host species.
Variation in temperature of peak trait performance constrains adaptation of arthropod populations to climatic warming
Pawar, Samraat; Huxley, Paul J.; Smallwood, Thomas R. C.; Nesbit, Miles L.; Chan, Alex H. H.; Shocket, Marta S.; Johnson, Leah R.; Kontopoulos, Dimitrios-Georgios; Cator, Lauren J. (Nature Portfolio, 2024-03)
The capacity of arthropod populations to adapt to long-term climatic warming is currently uncertain. Here we combine theory and extensive data to show that the rate of their thermal adaptation to climatic warming will be constrained in two fundamental ways. First, the rate of thermal adaptation of an arthropod population is predicted to be limited by changes in the temperatures at which the performance of four key life-history traits can peak, in a specific order of declining importance: juvenile development, adult fecundity, juvenile mortality and adult mortality. Second, directional thermal adaptation is constrained due to differences in the temperature of the peak performance of these four traits, with these differences expected to persist because of energetic allocation and life-history trade-offs. We compile a new global dataset of 61 diverse arthropod species which provides strong empirical evidence to support these predictions, demonstrating that contemporary populations have indeed evolved under these constraints. Our results provide a basis for using relatively feasible trait measurements to predict the adaptive capacity of diverse arthropod populations to geographic temperature gradients, as well as ongoing and future climatic warming.
Influence of environmental, geographic, socio-demographic, and epidemiological factors on presence of malaria at the community level in two continents
Villena, Oswaldo C.; Arab, Ali; Lippi, Catherine A.; Ryan, Sadie J.; Johnson, Leah R. (Nature Portfolio, 2024-07-20)
The interactions of environmental, geographic, socio-demographic, and epidemiological factors in shaping mosquito-borne disease transmission dynamics are complex and changeable, influencing the abundance and distribution of vectors and the pathogens they transmit. In this study, 27 years of cross-sectional malaria survey data (1990–2017) were used to examine the effects of these factors on Plasmodium falciparum and Plasmodium vivax malaria presence at the community level in Africa and Asia. Monthly long-term, open-source data for each factor were compiled and analyzed using generalized linear models and classification and regression trees. Both temperature and precipitation exhibited unimodal relationships with malaria, with a positive effect up to a point after which a negative effect was observed as temperature and precipitation increased. Overall decline in malaria from 2000 to 2012 was well captured by the models, as was the resurgence after that. The models also indicated higher malaria in regions with lower economic and development indicators. Malaria is driven by a combination of environmental, geographic, socioeconomic, and epidemiological factors, and in this study, we demonstrated two approaches to capturing this complexity of drivers within models. Identifying these key drivers, and describing their associations with malaria, provides key information to inform planning and prevention strategies and interventions to reduce malaria burden.
AedesTraits: A global dataset of temperature-dependent trait responses in Aedes mosquitoes
Da Re, Daniele; Andreo, Veronica; San Miguel, Tomas Valentin; Blaha, Margo; Rosa, Roberto; Rizzoli, Annapaola; Harrison, Joe; Sorek, Sean; Johnson, Leah R.; Huxley, Paul J. (Nature Portfolio, 2025-12-22)
Invasive Aedes mosquitoes are major vectors of arboviral diseases such as dengue, Zika, and chikungunya, posing an increasing threat to global public health. Their recent geographic expansion calls for predictive models to simulate population dynamics and transmission risk. Temperature is a key driver in these models, influencing traits that affect vector competence. Numerous datasets on temperature-dependent traits exist for Aedes aegypti and Aedes albopictus, though they are scattered, inconsistent, and difficult to synthesise. For emerging species like Aedes japonicus and Aedes koreicus, such datasets are scarce. To address these gaps, we developed AedesTraits, an open-access, machine-readable dataset aligned with VecTraits standards. It compiles and systematises experimental data on temperature-dependent traits across these four Aedes species, covering life-history, morphological, physiological, and behavioural traits. Our synthesis highlights existing knowledge gaps and identifies under-studied species and traits. By promoting data systematisation and accessibility, AedesTraits supports Aedes–borne disease modelling and fosters international collaboration in the development of forecasting tools for arbovirus outbreaks.
Impact of maternal obesity and mode of delivery on the newborn skin and oral microbiomes
Seifert, Allison; Ingram, Kelly; Eko, Embelle Ngalame; Nunziato, Jaclyn; Ahrens, Monica; Howell, Brittany R. (Microbiology Society, 2025-04-10)
Introduction. Previous studies have shown vast differences in the skin and oral microbiomes of newborns based on delivery method [Caesarean section (C-section) vs vaginal]. Exposure to or absence of certain bacteria during delivery can impact the neonate’s future susceptibility to infections, allergies or autoimmunity by altering immune functions. Few studies have focused on the impact of maternal obesity on the variations of newborn skin and oral microbiomes. Obese pregnant women typically have a higher vaginal microbiome diversity, and their pregnancies are at higher risk for adverse outcomes and complications. Hypothesis. We hypothesized that the skin and oral microbiomes of newborns born to obese mothers would include more diverse, potentially pathogenic bacteria and that the skin and oral microbiomes of C-section-delivered newborns would be less diverse than those of vaginally delivered newborns. Aim. We aim to begin to establish maternal obesity and mode of delivery as factors contributing to increased risk for negative newborn outcomes through impacts on newborn bacterial dysbiosis. Methodology. A skin swab was collected immediately following delivery of 39 newborns from 13 healthy weight (body mass index [BMI] 18.50–24.99), 11 overweight (BMI 25.0–29.99) and 15 obese (BMI ≥30.00) pregnant participants. An oral swab was collected immediately following delivery for 38 of these newborns from 13 healthy weight, 10 overweight and 15 obese pregnant participants. Bacterial genera were identified via 16S rRNA amplicon sequencing. Results. The newborn skin microbiome was comprised of typical skin bacteria (i.e. Corynebacterium). Newborns of obese participants had a higher relative abundance of Peptoniphilus in their skin microbiome compared to newborns of healthy weight participants (P=0.007). Neonates born via C-section had a higher relative abundance of Ureaplasma in their oral microbiome compared to neonates delivered vaginally (P=0.046). Conclusion. We identified differences in the newborn skin and oral microbiomes based on pre-pregnancy BMI and method of delivery. These differences could be linked to an increased risk of allergies, autoimmune disease and infections. Future longitudinal studies will be crucial in determining the long-term impact of these specific genera on newborn outcomes. Understanding these connections could lead to targeted interventions that reduce the risk of adverse outcomes and improve overall health trajectory.
Spatial Hyperspheric Models for Compositional Data
Schwob, Michael R.; Hooten, Mevin B.; Calzada, Nicholas M.; Keitt, Timothy H. (Institute of Mathematical Statistics, 2025-12)
Compositional observations are an increasingly prevalent data source in spatial statistics. Analysis of such data is typically done on log-ratio transformations or via Dirichlet regression. However, these approaches often make unnecessarily strong assumptions (e.g., strictly positive components, exclusively negative correlations). An alternative approach uses square-root transformed compositions and directional distributions. Such distributions naturally allow for zero-valued components and positive correlations, yet they may include support outside the nonnegative orthant and are not generative for compositional data. To overcome this challenge, we truncate the elliptically symmetric angular Gaussian (ESAG) distribution to the nonnegative orthant. Additionally, we propose a spatial hyperspheric regression model that contains fixed and random multivariate spatial effects. The proposed model also contains a term that can be used to propagate uncertainty that may arise from precursory stochastic models (i.e., machine learning classification). We used our model in a simulation study and for a spatial analysis of classified bioacoustic signals of the Dryobates pubescens (downy woodpecker).
LLM-Based Multi-Agent System and Simplicial Self-Supervised Learning Model for Regional Cancer Prevalence Estimation Using Satellite Imagery
Yang, Jiue-An; Chen, Yuzhou; Tribby, Calvin; Lee, Huikyo; Erhunmwunsee, Loretta; Benmarhnia, Tarik; Thompson, Caroline; Gel, Yulia; Jankowska, Marta (ACM, 2025-11-03)
Traditional cancer rate estimations are often limited in spatial resolution and lack consideration of environmental factors. Satellite imagery has become a vital data source for monitoring diverse urban environments, supporting applications across environmental, socio-demographic, and public health domains. However, while deep learning (DL) tools, particularly convolutional neural networks, have demonstrated strong performance in extracting features from high-resolution imagery, their reliance on local spatial cues often limits their ability to capture complex, non-local, and higher-order structural information. To overcome this limitation, we propose a novel LLM-based multi-agent coordination system for satellite image analysis, which integrates visual and contextual reasoning through a simplicial contrastive learning framework (Agent-SNN). Our Agent-SNN contains two augmented superpixel-based graphs and maximizes mutual information between their latent simplicial complex representations, thereby enabling the system to learn both local and global topological features. The LLM-based agents generate structured prompts that guide the alignment of these representations across modalities. Experiments with satellite imagery of Los Angeles and San Diego demonstrate that Agent-SNN achieves significant improvements over state-of-the-art baselines in regional cancer prevalence estimation tasks.
Revisiting the Convective Like Boundary Layer Assumption in the Urban Option of AERMOD
Retter, Jonathan; Owen, Robert Christopher; Leske, Annamarie; Snyder, Michelle; Sargent, Rhett; Heist, David (MDPI, 2025-11-27)
Urban areas and their surroundings feature unique, horizontally inhomogeneous spatial distributions of land use and land cover, leading to urban heat islands (UHIs) for both air and land surface temperature that complicate the estimation of urban sensible heat flux. The urban dispersion option in AERMOD, the American Meteorological Society (AMS)/Environmental Protection Agency (EPA) Regulatory Model, incorporates this effect at night through a “convective like boundary layer” that modifies the single column meteorology based on a population number representative of the urban area. The model produces positive nighttime sensible heat flux values that often significantly overestimate observed values from the literature. This study re-examines the formulation of the AERMOD urban option assumptions, methodology, and original evaluation against a field study of a power plant in Indianapolis. We investigate replacing the population-based parameterizations of urban–surrounding temperature differences (ΔT) with observations of remotely sensed land surface temperature (LST) data from the Advanced Baseline Imager on the GOES-16/R/East geostationary satellite. We generated a monthly averaged, hourly, wind direction-dependent, clear sky land surface urban heat island ΔT database for 480 continental United States (CONUS) urban areas, as defined by the 2010 US Census. These ΔT values are used to advise city-specific horizontal advection corrections to sensible heat flux estimates that are neglected from simple energy balance models. The four cities of Cleveland, Amarillo, Atlanta, and Baltimore are highlighted, showing that the AERMOD predicted nighttime ΔT values are 794%, 416%, 1048%, and 758% higher, respectively, than the GOES-16 observations. These overestimated ΔT values in AERMOD lead to nighttime sensible heat flux values > 100 W/m² that rival daytime values. However, using the GOES-16 observations as horizontal advection corrections to sensible heat flux results in trends that match the expected neutral to slightly positive nighttime values from observations recorded in the literature. The annual nighttime average in 2021 was −0.8 W/m², 8.6 W/m², 3.0 W/m², and 3.1 W/m² in Cleveland, Amarillo, Atlanta, and Baltimore, respectively, using this approach. Finally, reviewing the initial evaluation with the Indianapolis database against independent studies from the literature suggests that the AERMOD urban option inadvertently implements an urban heat island modeling approach to account for what was a low-level jet during the field study.
Quantile Importance Sampling
Datta, Jyotishka; Polson, Nicholas G. (Institute of Mathematical Statistics, 2023-06-06)
In Bayesian inference, the approximation of integrals of the form $\psi = E_F[l(X)] = \int_{\chi} l(x)\,dF(x)$ is a fundamental challenge. Such integrals are crucial for evidence estimation, which is important for various purposes, including model selection and numerical analysis. The existing strategies for evidence estimation are classified into four categories: deterministic approximation, density estimation, importance sampling, and vertical representation (Llorente et al., 2023). In this paper, we show that the Riemann sum estimator due to Yakowitz, Krimmel and Szidarovszky (1978) can be used in the context of nested sampling (Skilling, 2006) to achieve an $O(n^{-4})$ rate of convergence, faster than the usual Ergodic Central Limit Theorem, under certain regularity conditions. We provide a brief overview of the literature on the Riemann sum estimators and the nested sampling algorithm and its connections to vertical likelihood Monte Carlo. We provide theoretical and numerical arguments to show how merging these two ideas may result in improved and more robust estimators for evidence estimation, especially in higher dimensional spaces. We also briefly discuss the idea of simulating the Lorenz curve that avoids the problem of intractable Λ functions, essential for the vertical representation and nested sampling.
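As a rough illustration of the Riemann sum idea mentioned in this abstract, the sketch below integrates a function on [0, 1] by sorting uniform random draws and applying a trapezoid rule on the resulting uneven grid, then compares it with a plain Monte Carlo average. This is a one-dimensional toy under stated assumptions, not the authors' nested sampling procedure; the function names `riemann_sum_estimate` and `plain_monte_carlo`, the integrand, and the sample size are all illustrative choices.

```python
import random


def riemann_sum_estimate(f, n, seed=0):
    """Trapezoid-style Riemann sum over n sorted uniform draws on [0, 1]."""
    rng = random.Random(seed)
    # Sorted random nodes, padded with the interval endpoints.
    u = [0.0] + sorted(rng.random() for _ in range(n)) + [1.0]
    # Trapezoid rule on the random, unevenly spaced grid.
    return sum(
        (u[i + 1] - u[i]) * (f(u[i]) + f(u[i + 1])) / 2.0
        for i in range(len(u) - 1)
    )


def plain_monte_carlo(f, n, seed=0):
    """Ordinary Monte Carlo average of f over n uniform draws."""
    rng = random.Random(seed)
    return sum(f(rng.random()) for _ in range(n)) / n


if __name__ == "__main__":
    def square(x):
        return x * x  # exact integral over [0, 1] is 1/3

    print("Riemann sum :", riemann_sum_estimate(square, 1000))
    print("Plain MC    :", plain_monte_carlo(square, 1000))
```

For a smooth integrand, the sorted-node estimate is typically much closer to the exact value than the plain average at the same sample size, which is the intuition behind the faster convergence rate quoted in the abstract.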
The phenotype of recovery XI: associations of sleep quality and perceived stress with discounting and quality of life in substance use recovery
Yeh, Yu-Hua; Zheng, Michelle H.; Tegge, Allison N.; Athamneh, Liqa N.; Freitas-Lemos, Roberta; Dwyer, Candice L.; Bickel, Warren K. (Springer, 2024-06-01)
Purpose: Sleep and stress show an interdependent relationship in physiology, and both are known risk factors for relapse in substance use disorder (SUD) recovery. However, sleep and stress are often investigated independently in addiction research. In this exploratory study, the associations of sleep quality and perceived stress with delay discounting (DD), effort discounting (ED), and quality of life (QOL) were examined concomitantly to determine their role in addiction recovery. DD has been proposed as a prognostic indicator of SUD treatment response, ED is hypothesized to be relevant to the effort to overcome addiction, and QOL is an important component in addiction recovery. Method: An online sample of 118 individuals recovering from SUDs was collected through the International Quit and Recovery Registry. Exhaustive model selection, using the Bayesian Information Criterion to determine the optimal multiple linear model, was conducted to identify variables (i.e., sleep quality, perceived stress, and demographics) contributing to the total variance in DD, ED, and QOL. Results: After model selection, sleep was found to be significantly associated with DD. Stress was found to be significantly associated with psychological health, social relationships, and environment QOL. Both sleep and stress were found to be significantly associated with physical health QOL. Neither sleep nor stress was supported as an explanatory variable of ED. Conclusion: Together, these findings suggest sleep and stress contribute uniquely to the process of addiction recovery. Considering both factors when designing interventions and planning for future research is recommended.
Boldness-Recalibration for Binary Event Predictions
Guthrie, Adeline P.; Franck, Christopher T. (Taylor & Francis, 2024-10-01)
Probability predictions are essential to inform decision making across many fields. Ideally, probability predictions are (i) well calibrated, (ii) accurate, and (iii) bold, that is, spread out enough to be informative for decision making. However, there is a fundamental tension between calibration and boldness, since calibration metrics can be high when predictions are overly cautious, that is, non-bold. The purpose of this work is to develop a Bayesian model selection-based approach to assess calibration, and a strategy for boldness-recalibration that enables practitioners to responsibly embolden predictions subject to their required level of calibration. Specifically, we allow the user to pre-specify their desired posterior probability of calibration, then maximally embolden predictions subject to this constraint. We demonstrate the method with a case study on hockey home team win probabilities and then verify the performance of our procedures via simulation. We find that very slight relaxation of calibration probability (e.g., from 0.99 to 0.95) can often substantially embolden predictions when they are well calibrated and accurate (e.g., widening hockey predictions' range from 26%-78% to 10%-91%).
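The tension between calibration and boldness described in this abstract can be seen in a small simulation. The sketch below is a toy under stated assumptions, not the authors' Bayesian procedure: boldness is measured here simply as the standard deviation of the predictions, calibration as the largest gap between a predicted probability and the observed event frequency among cases receiving that prediction, and the helper names `simulate` and `calibration_gap` are hypothetical.

```python
import random
import statistics


def simulate(n=20000, seed=1):
    """Draw true event probabilities and binary outcomes (illustrative setup)."""
    rng = random.Random(seed)
    true_p = [rng.choice([0.1, 0.5, 0.9]) for _ in range(n)]
    outcomes = [1 if rng.random() < p else 0 for p in true_p]
    return true_p, outcomes


def calibration_gap(preds, outcomes):
    """Largest |prediction - observed frequency| over groups sharing a prediction."""
    groups = {}
    for p, y in zip(preds, outcomes):
        groups.setdefault(p, []).append(y)
    return max(abs(p - sum(ys) / len(ys)) for p, ys in groups.items())


if __name__ == "__main__":
    true_p, outcomes = simulate()
    bold = true_p                      # forecasts spread across 0.1, 0.5, 0.9
    cautious = [0.5] * len(outcomes)   # always predicts the overall base rate

    for name, preds in [("bold", bold), ("cautious", cautious)]:
        print(name,
              "calibration gap:", round(calibration_gap(preds, outcomes), 4),
              "spread:", round(statistics.pstdev(preds), 4))
```

Both forecasters come out well calibrated, but the cautious one has zero spread, so it carries no information for distinguishing likely wins from likely losses.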
Flexible cost-penalized Bayesian model selection: Developing inclusion paths with an application to diagnosis of heart disease
Porter, Erica M.; Franck, Christopher T.; Adams, Stephen C. (Wiley, 2024-07-20)
We propose a Bayesian model selection approach that allows medical practitioners to select among predictor variables while taking their respective costs into account. Medical procedures almost always incur costs in time and/or money. These costs might exceed their usefulness for modeling the outcome of interest. We develop Bayesian model selection that uses flexible model priors to penalize costly predictors a priori and select a subset of predictors useful relative to their costs. Our approach (i) gives the practitioner control over the magnitude of cost penalization, (ii) enables the prior to scale well with sample size, and (iii) enables the creation of our proposed inclusion path visualization, which can be used to make decisions about individual candidate predictors using both probabilistic and visual tools. We demonstrate the effectiveness of our inclusion path approach and the importance of being able to adjust the magnitude of the prior's cost penalization through a dataset pertaining to heart disease diagnosis in patients at the Cleveland Clinic Foundation, where several candidate predictors with various costs were recorded for patients, and through simulated data.
The Influence of Ultraprocessed Food Consumption on Energy Intake in Emerging Adulthood: A Controlled Feeding Trial
Rego, Maria L. M.; Leslie, Emma; Schmall, Emily; Capra, Bailey; Hudson, Summer; Ahrens, Monica L.; Katz, Benjamin; Davy, Kevin P.; Hedrick, Valisa E.; DiFeliceantonio, Alexandra G.; Davy, Brenda M. (Wiley, 2025-11-19)
OBJECTIVE: This study examined the impact of a 2-week eucaloric diet high in ultraprocessed foods (UPF) compared to a diet without UPF (non-UPF) on ad libitum energy intake (EI) and food selection in individuals aged 18–25. METHODS: In a randomized, crossover, proof-of-concept trial, participants completed two 14-day controlled feeding periods (81% UPF vs. 0% UPF), with a 4-week washout. Diets were matched for macronutrients, fiber, added sugar, diet quality, and energy density. Following each condition, participants consumed an ad libitum buffet meal including UPF and non-UPF. Energy and food grams consumed were quantified. Statistical analyses were conducted for the full sample, late adolescents (aged 18–21), and young adults (aged 22–25). RESULTS: Twenty-seven individuals aged 22 ± 2 years (mean BMI = 24 ± 3 kg/m²) were included. Diet compliance was ~99% overall. There was no effect of diet condition on meal total kcal or grams consumed or UPF or non-UPF consumption in the full sample (all p > 0.05). In the exploratory age subgroup analysis, an interaction between diet and age was observed for total EI (p < 0.001), where total EI increased among adolescents following the UPF diet (p = 0.03, d = 0.79), but not in young adults. CONCLUSIONS: Late adolescents may be susceptible to increased EI following a UPF diet. Future trials are warranted to evaluate this possibility. TRIAL REGISTRATION: ClinicalTrials.gov: NCT05550818.
