Browsing by Author "Datta, Jyotishka"
- Advancing Emergency Department Efficiency, Infectious Disease Management at Mass Gatherings, and Self-Efficacy Through Data Science and Dynamic Modeling
  Ba-Aoum, Mohammed Hassan (Virginia Tech, 2024-04-09)
  This dissertation employs management systems engineering principles, data science, and industrial systems engineering techniques to address pressing challenges in emergency department (ED) efficiency, infectious disease management at mass gatherings, and student self-efficacy. It is structured into three essays, each contributing to a distinct domain of research, and utilizes industrial and systems engineering approaches to provide data-driven insights and recommend solutions. The first essay used data analytics and regression analysis to understand how patient length of stay (LOS) in EDs could be influenced by multi-level variables integrating patient, service, and organizational factors. The findings suggested that specific demographic variables, the complexity of service provided, and staff-related variables significantly impacted LOS, offering guidance for operational improvements and better resource allocation. The second essay utilized system dynamics simulations to develop a modified SEIR model for modeling infectious diseases during mass gatherings and assessing the effectiveness of commonly implemented policies. The results demonstrated the significant collective impact of interventions such as visitor limits, vaccination mandates, and mask wearing, emphasizing their role in preventing health crises. The third essay applied machine learning methods to predict student self-efficacy in Muslim societies, revealing the importance of socio-emotional traits, cognitive abilities, and regulatory competencies. It provided a basis for identifying students with varying levels of self-efficacy and developing tailored strategies to enhance their academic and personal success.
  Collectively, these essays underscore the value of data-driven and evidence-based decision-making. The dissertation's broader impact lies in its contribution to optimizing healthcare operations, informing public health policy, and shaping educational strategies to be more culturally sensitive and psychologically informed. It provides a roadmap for future research and practical applications across the healthcare, public health, and education sectors, fostering advancements that could significantly benefit society.
- Empirical Investigation of Lean Management and Lean Six Sigma Success in Local Government Organizations
  Al rezq, Mohammed Shjea (Virginia Tech, 2024-05-29)
  Lean Management and Lean Six Sigma (LM/LSS) are improvement methodologies that have been utilized to achieve better performance outcomes at organizational and operational levels. Although there has been evidence of breakthrough improvement across diverse organizational settings, LM/LSS remains an early-stage improvement methodology in public sector organizations, specifically within local government organizations (LGOs). Some LGOs have benefited from LM/LSS and reported significant improvements, such as reducing process time by up to 90% and increasing financial savings by up to 57%. While the success of LM/LSS can lead to satisfactory outcomes, the risk of failure can also result in a tremendous waste of financial and non-financial resources. Evidence from the literature indicates that the failure to achieve the expected outcomes is likely due to the lack of attention paid to critical success factors (CSFs) that are crucial for LM/LSS success. Furthermore, research in this area on characterizing and statistically examining the CSFs associated with LM/LSS in such organizational settings has been limited. Hence, the aim of this research is to provide a comprehensive investigation of the success factors for LM/LSS in LGOs. The initial stage of this dissertation involved analyzing the scientific literature to identify and characterize the CSFs associated with LM/LSS in LGOs through a systematic literature review (SLR). This effort identified a total of 47 unique factors, which were grouped into 5 categories, including organization, process, workforce knowledge, communications, task design, and team design. The next stage of this investigation focused on identifying a more focused set of CSFs. This involved evaluating the strength of the effect (or importance) of the factors using two integrated approaches: meta-synthesis and expert assessment. This process concluded with a total of 29 factors being selected for the empirical field study. The final stage included designing and implementing an online survey questionnaire to solicit LGOs' experience with the presence of factors during the development and/or implementation of LM/LSS and their impact on social-technical system outcomes. Once the survey was concluded, an exploratory factor analysis (EFA) was conducted to identify the underlying latent variables, followed by a partial least squares structural equation model (PLS-SEM) to determine the significance of the factors on outcomes. The EFA identified three endogenous and five exogenous latent variables. The PLS-SEM results identified four significant positive relationships. Based on the results from the structural paths, the antecedents Improvement Readiness (IR) and Change Awareness (CA) were significant and had a positive influence on Transformation Success (TS). For the outcome Deployment Success (DS), Sustainable Improvement Infrastructure (SII) was the only significant exogenous variable and had the highest positive impact among all significant predictor constructs. Furthermore, Measurement-Based Improvement (MBI) was significant and positively influenced Improvement Project Success (IPS). Findings from this dissertation could serve as a foundation for researchers looking to further advance the maturity of this research area based on the evidence presented in this work. Additionally, this work could be used as guidelines for practitioners in developing implementation processes by considering the essential factors to maximize the success of LM/LSS implementation. Given the diversity of functional areas and processes within LGO contexts, it is also possible that other public sector organizations could benefit from these findings.
- Extending the susceptible-exposed-infected-removed (SEIR) model to handle the false negative rate and symptom-based administration of COVID-19 diagnostic tests: SEIR-fansy
  Bhaduri, Ritwik; Kundu, Ritoban; Purkayastha, Soumik; Kleinsasser, Michael; Beesley, Lauren J.; Mukherjee, Bhramar; Datta, Jyotishka (Wiley, 2022-06-15)
  False negative rates of severe acute respiratory syndrome coronavirus 2 diagnostic tests, together with selection bias due to prioritized testing, can result in inaccurate modeling of COVID-19 transmission dynamics based on reported "case" counts. We propose an extension of the widely used Susceptible-Exposed-Infected-Removed (SEIR) model that accounts for misclassification error and selection bias, and derive an analytic expression for the basic reproduction number R0 as a function of false negative rates of the diagnostic tests and selection probabilities for getting tested. Analyzing data from the first two waves of the pandemic in India, we show that correcting for misclassification and selection leads to more accurate prediction in a test sample. We provide estimates of undetected infections and deaths between April 1, 2020 and August 31, 2021. At the end of the first wave in India, as of February 1, 2021, the estimated under-reporting factor was 11.1 (95% CI: 10.7, 11.5) for cases and 3.58 (95% CI: 3.5, 3.66) for deaths; as of July 1, 2021, these factors were 19.2 (95% CI: 17.9, 19.9) and 4.55 (95% CI: 4.32, 4.68). Equivalently, 9.0% (95% CI: 8.7%, 9.3%) and 5.2% (95% CI: 5.0%, 5.6%) of total estimated infections were reported on these two dates, while 27.9% (95% CI: 27.3%, 28.6%) and 22% (95% CI: 21.4%, 23.1%) of estimated total deaths were reported. Extensive simulation studies demonstrate the effect of misclassification and selection on estimation of R0 and prediction of future infections. An R package, SEIRfansy, is developed for broader dissemination.
- Inference for Populations: Uncertainty Propagation via Bayesian Population Synthesis
  Grubb, Christopher Thomas (Virginia Tech, 2023-08-16)
  In this dissertation, we develop a new type of prior distribution, specifically for populations themselves, which we denote the Dirichlet Spacing prior. This prior solves a specific problem that arises when attempting to create synthetic populations from a known subset: the unfortunate reality that assuming independence between population members means that every synthetic population will be essentially the same. This is a problem because any model that yields only one result (or several very similar results) when we have very incomplete information is fundamentally flawed. We motivate our need for this new class of priors using Agent-based Models, though this prior could be used in any situation requiring synthetic populations.
- Merging Two Cultures: Deep and Statistical Learning
  Bhadra, Anindya; Datta, Jyotishka; Polson, Nick; Sokolov, Vadim; Xu, Jianeng (2021-10-21)
  Merging the two cultures of deep and statistical learning provides insights into structured high-dimensional data. Traditional statistical modeling is still a dominant strategy for structured tabular data. Deep learning can be viewed through the lens of generalized linear models (GLMs) with composite link functions. Sufficient dimensionality reduction (SDR) and sparsity perform nonlinear feature engineering. We show that prediction, interpolation and uncertainty quantification can be achieved using probabilistic methods at the output layer of the model. Thus a general framework for machine learning arises that first generates nonlinear features (a.k.a. factors) via sparse regularization and stochastic gradient optimisation and second uses a stochastic output layer for predictive uncertainty. Rather than using shallow additive architectures as in many statistical models, deep learning uses layers of semi-affine input transformations to provide a predictive rule. Applying these layers of transformations leads to a set of attributes (a.k.a. features) to which predictive statistical methods can be applied. Thus we achieve the best of both worlds: scalability and fast predictive rule construction together with uncertainty quantification. Sparse regularisation with unsupervised or supervised learning finds the features. We clarify the duality between shallow and wide models such as PCA, PPR, RRR and deep but skinny architectures such as autoencoders, MLPs, CNNs, and LSTMs. The connection with data transformations is of practical importance for finding good network architectures. By incorporating probabilistic components at the output level we allow for predictive uncertainty. We use deep Gaussian processes for interpolation and ReLU trees for classification. We provide applications to regression, classification and interpolation. Finally, we conclude with directions for future research.
- A Meta-Analysis of the Protein Components in Rattlesnake Venom
  Deshwal, Anant; Phan, Phuc; Datta, Jyotishka; Kannan, Ragupathy; Thallapuranam, Suresh Kumar (MDPI, 2021-05-23)
  The specificity and potency of venom components give them a unique advantage in developing various pharmaceutical drugs. Though venom is a cocktail of proteins, rarely are the synergy and association between various venom components studied. Understanding the relationship between various components of venom is critical in medical research. Using meta-analysis, we observed underlying patterns and associations in the appearance of the toxin families. For Crotalus, Dis has the most associations with the following toxins: PDE; BPP; CRL; CRiSP; LAAO; SVMP P-I and LAAO; SVMP P-III and LAAO. In Sistrurus venom, CTL and NGF have the most associations. These associations can be used to predict the presence of proteins in novel venom and to understand synergies between venom components for enhanced bioactivity. This approach highlights the need to revisit the classification of proteins as major or minor components. The revised classification of venom components is based on ubiquity, bioactivity, the number of associations, and synergies. The revised classification can be expected to trigger increased research on venom components, such as NGF, which have high biomedical significance. Using hierarchical clustering, we observed that the genera’s venom compositions were similar, based on functional characteristics rather than phylogenetic relationships.
- Nonparametric Bayes multiresolution testing for high-dimensional rare events
  Datta, Jyotishka; Banerjee, Sayantan; Dunson, David B. (2024-01)
  In a variety of application areas, there is interest in assessing evidence of differences in the intensity of event realizations between groups. For example, in cancer genomic studies collecting data on rare variants, the focus is on assessing whether and how the variant profile changes with the disease subtype. Motivated by this application, we develop multiresolution nonparametric Bayes tests for differential mutation rates across groups. The multiresolution approach yields fast and accurate detection of spatial clusters of rare variants, and our nonparametric Bayes framework provides great flexibility for modeling the intensities of rare variants. Some theoretical properties are also assessed, including weak consistency of our Dirichlet Process-Poisson-Gamma mixture over multiple resolutions. Simulation studies illustrate excellent small sample properties relative to competitors, and we apply the method to detect rare variants related to common variable immunodeficiency from whole exome sequencing data on 215 patients and over 60,027 control subjects.
- Paradoxes
  Datta, Jyotishka (International Indian Statistical Association, 2023-12-29)
- Quantifying Changes in Social Polarization Over Time and Region
  Edwards, David Linville (Virginia Tech, 2024-07-29)
  Recent studies indicate that Americans have grown increasingly divided and polarized in recent years \cite{boxell2022cross}, \cite{hawdon2020social}. This research aims to describe and measure polarization trends across a historical archive of US-based, primarily regional, newspapers. The newspapers chosen are from various US markets to capture any regional differences in the discussion of issues/topics. Our modeling approach employs the Structural Topic Model (STM) to identify topics within a given corpus and measure the tonal differences of articles discussing the same topic. Specifically, we use the STM to infer potentially related articles and a sentiment analyzer called VADER to identify topics with a high level of semantic disparity. Using this method, we assess the polarization of developing and evolving topics, such as sports, politics, and entertainment, and compare how polarization between and within these topics has changed over time. Through this, we create topic-specific sentiment distributions, referred to as polarization distributions. We conclude by demonstrating the usefulness of these distributions in identifying polarization and showing how high polarization aligns with significant social events.
- Quantifying the Effect of Socio-Economic Predictors and the Built Environment on Mental Health Events in Little Rock, AR
  Ek, Alfieri; Drawve, Grant; Robinson, Samantha; Datta, Jyotishka (MDPI, 2023-05-18)
  Law enforcement agencies increasingly use spatial analysis to assist in identifying patterns of outcomes. Despite the critical nature of proper resource allocation for mental health incidents, there has been little progress in statistical modeling of the geo-spatial nature of mental health events in Little Rock, Arkansas. In this article, we provide insights into the spatial nature of mental health data from Little Rock, Arkansas between 2015 and 2018, under a supervised spatial modeling framework. We provide evidence of spatial clustering and identify the important features influencing such heterogeneity via a spatially informed hierarchy of generalized linear, tree-based, and spatial regression models, viz. the Poisson regression model, the random forest model, the spatial Durbin error model, and the Manski model. The insights obtained from these different models are presented here along with their relative predictive performances. The inferential tools developed here can be used in a broad variety of spatial modeling contexts and have the potential to aid both law enforcement agencies and the city in properly allocating resources. We were able to identify several built-environment and socio-demographic measures related to mental health calls, while the results also indicate that unmeasured factors contribute to the number of events.
- Quantile Importance Sampling
  Datta, Jyotishka; Polson, Nicholas G. (2023-05-04)
- Time-to-Event Modeling with Bayesian Perspectives and Applications in Reliability of Artificial Intelligence Systems
  Min, Jie (Virginia Tech, 2024-07-02)
- Understanding racial disparities in severe maternal morbidity using Bayesian network analysis
  Rezaeiahari, Mandana; Brown, Clare C.; Ali, Mir M.; Datta, Jyotishka; Tilford, J. Mick (PLoS, 2021-10-01)
  Previous studies have evaluated the marginal effect of various factors on the risk of severe maternal morbidity (SMM) using regression approaches. We add to this literature by utilizing a Bayesian network (BN) approach to understand the joint effects of clinical, demographic, and area-level factors. We conducted a retrospective observational study using linked birth certificate and insurance claims data from the Arkansas All-Payer Claims Database (APCD), for the years 2013 through 2017. We used various learning algorithms and measures of arc strength to choose the most robust network structure. We then performed various conditional probabilistic queries using Monte Carlo simulation to understand disparities in SMM. We found that anemia and hypertensive disorder of pregnancy may be important clinical comorbidities to target in order to reduce SMM overall as well as racial disparities in SMM.