Browsing by Author "Chen, Jiangzhuo"
Now showing 1 - 14 of 14
- Comparing Effectiveness of Top-Down and Bottom-Up Strategies in Containing Influenza
Marathe, Achla; Lewis, Bryan L.; Barrett, Christopher L.; Chen, Jiangzhuo; Marathe, Madhav V.; Eubank, Stephen; Ma, Yifei (Public Library of Science, 2011-09-22)
This research compares the performance of bottom-up, self-motivated behavioral interventions with top-down interventions targeted at controlling an "Influenza-like-illness". Both types of interventions use a variant of the ring strategy. In the first case, when the fraction of a person's direct contacts who are diagnosed exceeds a threshold, that person decides to seek prophylaxis, e.g. vaccine or antivirals. In the second case, we consider two intervention protocols, denoted Block and School: when the fraction of people diagnosed in a Census Block (resp., School) exceeds the threshold, the entire Block (resp., School) is given prophylaxis. Results show that the bottom-up strategy outperforms the top-down strategies under our parameter settings. Even in situations where the Block strategy reduces the overall attack rate well, it incurs a much higher cost. These findings lend credence to the notion that if people used antivirals effectively, making them available quickly on demand to private citizens could be a very effective way to control an outbreak.
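The two threshold rules above can be sketched as follows. The function names, threshold value, and sample data are hypothetical illustrations; the actual study applies these rules inside a large-scale agent-based simulation rather than in isolation.

```python
def seeks_prophylaxis(contacts, diagnosed, threshold):
    """Bottom-up rule: a person seeks prophylaxis once the fraction of
    their direct contacts who are diagnosed exceeds the threshold."""
    if not contacts:
        return False
    diagnosed_frac = sum(1 for c in contacts if c in diagnosed) / len(contacts)
    return diagnosed_frac > threshold

def block_triggered(block_members, diagnosed, threshold):
    """Top-down Block rule: once the diagnosed fraction within a Census
    Block (or School) exceeds the threshold, the whole unit is treated."""
    frac = sum(1 for p in block_members if p in diagnosed) / len(block_members)
    return frac > threshold

# Hypothetical example: 2 of 4 direct contacts diagnosed (0.5 > 0.3)
decision = seeks_prophylaxis(["ann", "bob", "cat", "dan"], {"ann", "bob"}, 0.3)
```

The key difference is the unit of decision-making: the bottom-up rule is evaluated per person over their own contacts, while the top-down rule is evaluated once per Block or School and applies to everyone in it.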
- Data Integration Methodologies and Services for Evaluation and Forecasting of Epidemics
Deodhar, Suruchi (Virginia Tech, 2016-05-31)
Most epidemiological systems described in the literature are built for evaluation and analysis of specific diseases, such as Influenza-like-illness. The modeling environments that support these systems are implemented for specific diseases and epidemiological models, and hence are not reusable or extendable. This thesis focuses on the design and development of an integrated analytical environment with flexible data integration methodologies and multi-level web services for evaluation and forecasting of various epidemics in different regions of the world. The environment supports analysis of epidemics based on any combination of disease, surveillance sources, epidemiological models, geographic regions and demographic factors. It also supports evaluation and forecasting of epidemics when various policy-level and behavioral interventions that may inhibit the spread of an epidemic are applied. First, we describe data integration methodologies and schema design for flexible experiment design, storage and query retrieval mechanisms related to large-scale epidemic data. We describe novel techniques for data transformation, optimization, pre-computation and automation that enable the flexibility, extendibility and efficiency required in different categories of query processing. Second, we describe the design and engineering of adaptable middleware platforms based on service-oriented paradigms for interactive workflow, communication, and decoupled integration. This supports large-scale multi-user applications with provision for online analysis of interventions as well as analytical processing of forecast computations. Using a service-oriented architecture, we have provided a platform-as-a-service representation for evaluation and forecasting of epidemics.
We demonstrate the applicability of our integrated environment through development of the applications, DISIMS and EpiCaster. DISIMS is an interactive web-based system for evaluating the effects of dynamic intervention strategies on epidemic propagation. EpiCaster is a situation assessment and forecasting tool for projecting the state of evolving epidemics such as flu and Ebola in different regions of the world. We discuss how our platform uses existing technologies to solve a novel problem in epidemiology, and provides a unique solution on which different applications can be built for analyzing epidemic containment strategies.
- A Database Supported Modeling Environment for Pandemic Planning and Course of Action Analysis
Ma, Yifei (Virginia Tech, 2013-06-24)
Pandemics can significantly impact public health and society; consider, for instance, the 2009 H1N1 pandemic and the 2003 SARS outbreak. In addition to analyzing historic epidemic data, computational simulation of epidemic propagation processes and disease control strategies can help us understand the spatio-temporal dynamics of epidemics in the laboratory. Consequently, the public can be better prepared and the government can control future epidemic outbreaks more effectively. Recently, epidemic propagation simulation systems, which use high performance computing technology, have been proposed and developed to understand disease propagation processes. However, run-time infection situation assessment and intervention adjustment, two important steps in modeling disease propagation, are not well supported in these simulation systems. In addition, these simulation systems are computationally efficient in their simulations, but most of them have limited capabilities in terms of modeling interventions in realistic scenarios.
In this dissertation, we focus on building a modeling and simulation environment for epidemic propagation and propagation control strategies. The objective of this work is to design a modeling environment that both supports the previously missing functions and performs well in terms of expected features such as modeling fidelity, computational efficiency, and modeling capability. Our proposed methodologies for building such a modeling environment are: 1) decoupled and co-evolving models for disease propagation, situation assessment, and propagation control strategy, and 2) assessing situations and simulating control strategies using relational databases. Our motivation for exploring these methodologies is as follows: 1) a decoupled and co-evolving model allows us to design modules for each function separately and makes this complex modeling system design simpler, and 2) simulating propagation control strategies using relational databases improves the modeling capability and human productivity of using this modeling environment. To evaluate our proposed methodologies, we have designed and built a loosely coupled and database supported epidemic modeling and simulation environment. With detailed experimental results and realistic case studies, we demonstrate that our modeling environment provides the missing functions and greatly enhances many expected features, such as modeling capability, without significantly sacrificing computational efficiency and scalability.
- A Distributed Approach to EpiFast using Apache Spark
Kannan, Vijayasarathy (Virginia Tech, 2015-08-04)
EpiFast is a parallel algorithm for large-scale epidemic simulations, based on an interpretation of stochastic disease propagation in a contact network. The original EpiFast implementation is based on a master-slave computation model with a focus on distributed memory using the message passing interface (MPI). However, it suffers from a few shortcomings with respect to the scale of networks being studied. This thesis addresses these shortcomings and provides two different implementations: Spark-EpiFast, based on the Apache Spark big data processing engine, and Charm-EpiFast, based on the Charm++ parallel programming framework. The study focuses on exploiting features of both systems that we believe could potentially benefit performance and scalability.
We present models of EpiFast specific to each system and relate algorithm specifics to several optimization techniques. We also provide a detailed analysis of these optimizations through a range of experiments that consider scale of networks and environment settings we used. Our analysis shows that the Spark-based version is more efficient than the Charm++ and MPI-based counterparts. To the best of our knowledge, ours is one of the preliminary efforts of using Apache Spark for epidemic simulations. We believe that our proposed model could act as a reference for similar large-scale epidemiological simulations exploring non-MPI or MapReduce-like approaches.
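The disease-propagation sweep that EpiFast-style systems distribute across workers can be sketched serially. The `propagate` function, the edge-list shape, and the per-edge transmission rule below are simplified, hypothetical stand-ins rather than EpiFast's actual algorithm, which also models incubation periods, durations of contact, and interventions.

```python
import random

def propagate(edges, seeds, transmissibility, steps, rng):
    """Serial sketch of stochastic disease propagation on a contact
    network: at each timestep, every contact between an infected and a
    susceptible person transmits independently with a fixed probability."""
    infected = set(seeds)
    for _ in range(steps):
        newly = set()
        for u, v, _duration in edges:
            for src, dst in ((u, v), (v, u)):
                if src in infected and dst not in infected and rng.random() < transmissibility:
                    newly.add(dst)
        infected |= newly
    return infected

# A 4-person chain contact network: (person, person, contact minutes)
edges = [(0, 1, 30), (1, 2, 15), (2, 3, 60)]
final = propagate(edges, {0}, transmissibility=0.5, steps=3, rng=random.Random(42))
```

The distributed versions partition the edge sweep: each worker owns a subset of the network and exchanges newly infected node IDs at the end of every timestep, which is what maps naturally onto both MPI message passing and Spark's per-iteration transformations.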
- Effect of modelling slum populations on influenza spread in Delhi
Chen, Jiangzhuo; Chu, Shuyu; Chungbaek, Youngyun; Khan, Maleq; Kuhlman, Christopher J.; Marathe, Achla; Mortveit, Henning; Vullikanti, Anil; Xie, Dawen (BMJ, 2016-01-01)
- Feedback Between Behavioral Adaptations and Disease Dynamics
Chen, Jiangzhuo; Marathe, Achla; Marathe, Madhav V. (Springer Nature, 2018-08-20)
We study the feedback processes between individual behavior, disease prevalence, interventions and social networks during an influenza pandemic when a limited stockpile of antivirals is shared between the private and the public sectors. An economic model that uses prevalence-elastic demand for interventions is combined with a detailed social network and a disease propagation model to understand the feedback mechanism between epidemic dynamics, market behavior, individual perceptions, and the social network. An urban and a rural region are simulated to assess the robustness of results. Results show that an optimal split between the private and public sectors can be reached to contain the disease, but access to antivirals from the private sector is skewed towards the richest income quartile. Also, larger allocations to the private sector result in wastage: individuals who do not need the antivirals are able to purchase them, while those who need them cannot afford them. Disease prevalence increases with household size and total contact time but not with degree in the social network, whereas wastage of antivirals decreases with degree and contact time. The best utilization of drugs is achieved when individuals with high contact time use them; these tend to be the school-aged children of large families.
- Forecasting influenza activity using machine-learned mobility map
Venkatramanan, Srinivasan; Sadilek, Adam; Fadikar, Arindam; Barrett, Christopher L.; Biggerstaff, Matthew; Chen, Jiangzhuo; Dotiwalla, Xerxes; Eastham, Paul; Gipson, Bryant; Higdon, Dave; Kucuktunc, Onur; Lieber, Allison; Lewis, Bryan L.; Reynolds, Zane; Vullikanti, Anil Kumar S.; Wang, Lijing; Marathe, Madhav V. (2021-02-09)
Human mobility is a primary driver of infectious disease spread. However, existing data is limited in availability, coverage, granularity, and timeliness. Data-driven forecasts of disease dynamics are crucial for decision-making by health officials and private citizens alike. In this work, we focus on a machine-learned anonymized mobility map (hereon referred to as AMM) aggregated over hundreds of millions of smartphones and evaluate its utility in forecasting epidemics. We factor AMM into a metapopulation model to retrospectively forecast influenza in the USA and Australia. We show that the AMM model performs on par with those based on commuter surveys, which are sparsely available and expensive. We also compare it with gravity and radiation based models of mobility, and find that the radiation model's performance is quite similar to AMM and commuter flows. Additionally, we demonstrate our model's ability to predict disease spread even across state boundaries. Our work contributes towards developing timely infectious disease forecasting at a global scale using human mobility datasets, expanding their applications in the area of infectious disease epidemiology.
Human mobility plays a central role in the spread of infectious diseases and can help in forecasting incidence. Here the authors show a comparison of multiple mobility benchmarks in forecasting influenza, and demonstrate the value of a machine-learned mobility map with global coverage at multiple spatial scales.
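A minimal sketch of how a mobility map can be factored into a metapopulation model: one discrete-time SIR step in which each patch's force of infection is weighted by row-normalized trip fractions between patches. This is an illustrative simplification under stated assumptions, not the paper's calibrated model; `metapop_step`, the mixing form, and all parameter values are hypothetical.

```python
def metapop_step(S, I, R, flows, beta, gamma):
    """One discrete-time step of a simple metapopulation SIR model: each
    patch's force of infection is a mobility-weighted average of the
    prevalence in the patches its residents visit."""
    n = len(S)
    newS, newI, newR = [], [], []
    for i in range(n):
        # effective prevalence experienced by residents of patch i
        exposure = sum(flows[i][j] * I[j] / (S[j] + I[j] + R[j]) for j in range(n))
        infections = beta * S[i] * exposure
        recoveries = gamma * I[i]
        newS.append(S[i] - infections)
        newI.append(I[i] + infections - recoveries)
        newR.append(R[i] + recoveries)
    return newS, newI, newR

# Two patches; rows of `flows` are trip fractions and sum to 1
flows = [[0.9, 0.1], [0.2, 0.8]]
S, I, R = metapop_step([990.0, 1000.0], [10.0, 0.0], [0.0, 0.0], flows, 0.3, 0.1)
```

Swapping the `flows` matrix between AMM, commuter-survey, gravity, or radiation estimates is what allows the paper's head-to-head comparison of mobility datasets within one model structure.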
- A framework for evaluating epidemic forecasts
Tabataba, Farzaneh Sadat; Chakraborty, Prithwish; Ramakrishnan, Naren; Venkatramanan, Srinivasan; Chen, Jiangzhuo; Lewis, Bryan L.; Marathe, Madhav V. (2017-05-15)
Background: Over the past few decades, numerous forecasting methods have been proposed in the field of epidemic forecasting. Such methods can be classified into different categories, such as deterministic vs. probabilistic, or comparative vs. generative. In some of the more popular comparative methods, researchers compare observed epidemiological data from the early stages of an outbreak with the output of proposed models to forecast the future trend and prevalence of the pandemic. A significant problem in this area is the lack of standard, well-defined evaluation measures for selecting the best algorithm among candidates, as well as for selecting the best possible configuration for a particular algorithm.
Results: In this paper we present an evaluation framework which allows for combining different features, error measures, and ranking schema to evaluate forecasts. We describe the various epidemic features (Epi-features) included to characterize the output of forecasting methods and provide suitable error measures that could be used to evaluate the accuracy of the methods with respect to these Epi-features. We focus on long-term predictions rather than short-term forecasting and demonstrate the utility of the framework by evaluating six forecasting methods for predicting influenza in the United States. Our results demonstrate that different error measures lead to different rankings even for a single Epi-feature. Further, our experimental analyses show that no single method dominates the rest in predicting all Epi-features when evaluated across error measures. As an alternative, we provide various consensus ranking schema that summarize individual rankings, thus accounting for different error measures.
Conclusions: Since each Epi-feature presents a different aspect of the epidemic, multiple methods need to be combined to provide a comprehensive forecast. Thus we call for a more nuanced approach to evaluating epidemic forecasts, and we believe that a comprehensive evaluation framework, as presented in this paper, will add value to the computational epidemiology community.
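One simple consensus-ranking schema of the kind described above can be sketched by averaging each method's rank across error measures. The method names and error scores below are hypothetical, and the paper's actual schemata may differ in how they aggregate ranks.

```python
def rank(scores):
    """Rank methods by score, best (lowest error) first; ranks start at 1."""
    ordered = sorted(scores, key=scores.get)
    return {method: i + 1 for i, method in enumerate(ordered)}

def consensus_ranking(error_tables):
    """Average each method's rank across error measures, then order by
    mean rank (one simple consensus schema, sketched)."""
    per_measure = [rank(table) for table in error_tables]
    methods = list(per_measure[0])
    mean_rank = {m: sum(r[m] for r in per_measure) / len(per_measure) for m in methods}
    return sorted(methods, key=mean_rank.get)

# Hypothetical RMSE and MAPE errors for three forecasting methods
rmse = {"A": 1.2, "B": 0.8, "C": 1.5}
mape = {"A": 0.10, "B": 0.25, "C": 0.15}
order = consensus_ranking([rmse, mape])  # -> ["A", "B", "C"]
```

Note how the consensus differs from either individual ranking: method B wins on RMSE and method A on MAPE, illustrating the paper's observation that different error measures yield different rankings for the same Epi-feature.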
- Modeling, Analysis and Comparison of Large Scale Social Contact Networks on Epidemic Studies
Xia, Huadong (Virginia Tech, 2015-04-07)
Social contact networks represent proximity relationships between individual agents. Such networks are useful in diverse applications, including epidemiology, wireless networking and urban resilience. The vertices of a social contact network represent individual agents (e.g. people). Time-varying edges represent time-varying proximity relationships. The networks are relational: node and edge labels represent important demographic, spatial and temporal attributes. Synthesizing social contact networks that span large urban regions is challenging for several reasons, including the spatial, temporal and relational variety of data sources, noisy and incomplete data, and privacy and confidentiality requirements. Moreover, the synthesized networks differ due to the data and methods used to synthesize them. This dissertation undertakes a systematic study of synthesizing urban-scale social contact networks within the specific application context of computational epidemiology. It is motivated by three important questions: (i) How does one construct a realistic social contact network that is adaptable to different levels of data availability? (ii) How does one compare different versions of the network for a given region, and what are appropriate metrics for comparing the relational networks? (iii) When does a network have adequate structural detail for the specific application at hand? We study these questions by synthesizing three social contact networks for Delhi, India. Our case study suggests that we can iteratively improve the quality of a network by adapting to the best data sources available within a framework. The networks differ by the data and the models used. We carry out detailed comparative analyses of the networks.
The analysis has three components: (i) structure analysis, which compares the structural properties of the networks; (ii) dynamics analysis, which compares the epidemic dynamics on these networks; and (iii) policy analysis, which compares the efficacy of various interventions. We have proposed a framework to systematically analyze how details in networks impact epidemic dynamics over these networks. The results suggest that a combination of multi-level metrics, rather than any individual one, should be used to compare two networks. We further investigate the sensitivity of these models. The study reveals the details necessary for a particular class of control policies. Our methods are entirely general and can be applied to other areas of network science.
- My4Sight: A Human Computation Platform for Improving Flu Predictions
Akupatni, Vivek Bharath (Virginia Tech, 2015-09-17)
While many human computation (human-in-the-loop) systems exist in the field of Artificial Intelligence (AI) to solve problems that can't be solved by computers alone, comparatively few platforms exist for collecting human knowledge and for evaluating techniques that harness human insight to improve forecasting models for infectious diseases such as Influenza and Ebola. In this thesis, we present the design and implementation of My4Sight, a human computation system developed to harness human insight and intelligence to improve forecasting models. This web-accessible system simplifies the collection of human insights through the careful design of the following two tasks: (i) asking users to rank system-generated forecasts in order of likelihood; and (ii) allowing users to improve upon an existing system-generated prediction. The structured output collected from querying human computers can then be used to build better forecasting models. My4Sight is designed to be a complete end-to-end analytical platform, and provides access to data collection features and statistical tools that are applied to the collected data. The results are communicated to the user, wherever applicable, in the form of visualizations for easier data comprehension. With My4Sight, this thesis makes a valuable contribution to the field of epidemiology by providing the data and infrastructure platform needed to improve forecasts in real time by harnessing the wisdom of the crowd.
- Relational Computing Using HPC Resources: Services and Optimizations
Soundarapandian, Manikandan (Virginia Tech, 2015-09-15)
Computational epidemiology involves processing, analysing and managing large volumes of data. Such massive datasets cannot be handled efficiently by traditional standalone database management systems, owing to their limited computational efficiency and bandwidth when scaling to large volumes of data. In this thesis, we address the management and processing of large volumes of data for modeling, simulation and analysis in epidemiological studies. Traditionally, compute-intensive tasks are processed using high performance computing resources and supercomputers, whereas data-intensive tasks are delegated to standalone databases and custom programs. The DiceX framework is a one-stop solution for distributed database management and processing; its main mission is to leverage supercomputing resources for data-intensive computing, in particular relational data processing. While standalone databases are always on and a user can submit queries at any time, supercomputing resources must be acquired and are available only for a limited time period. These resources are relinquished either upon completion of execution or at the expiration of the allocated time period. This reservation-based usage style poses critical challenges, including building and launching a distributed data engine on the supercomputer, saving the engine and resuming from the saved image, devising efficient optimization upgrades to the data engine, and enabling other applications to seamlessly access the engine. These challenges and requirements lead us to align our approach with the cloud computing paradigms of Infrastructure as a Service (IaaS) and Platform as a Service (PaaS).
In this thesis, we propose cloud-computing-like workflows that use supercomputing resources to manage and process relational data-intensive tasks. We propose and implement several services, including database freeze/migrate/resume, ad-hoc resource addition, and table redistribution, which support the defined workflows. We also propose an optimization upgrade to the query planning module of Postgres-XC, the core relational data processing engine of the DiceX framework. Using knowledge of domain semantics, we have devised a more robust data distribution strategy that forces the most time-consuming SQL operations to be pushed down to the Postgres-XC data nodes, bypassing the query planner's default shippability criteria without compromising correctness. Forcing query push-down reduces query processing time by roughly 40%-60% for certain complex spatio-temporal queries on our epidemiology datasets. As part of this work, a generic broker service has also been implemented, which acts as an interface to the DiceX framework by exposing RESTful APIs that applications can use to query and retrieve results irrespective of the programming language or environment.
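The correctness argument behind push-down on a shared distribution key can be illustrated in miniature: if both tables are hash-partitioned on the join key, each data node can join its own partitions independently, and the union of the local results equals the global join. The sketch below is a toy model of that idea in plain Python, not Postgres-XC code; the table shapes and column names are hypothetical.

```python
def hash_partition(rows, key, n_nodes):
    """Distribute rows across data nodes by hashing the distribution key,
    so rows sharing a key value land on the same node."""
    parts = [[] for _ in range(n_nodes)]
    for row in rows:
        parts[hash(row[key]) % n_nodes].append(row)
    return parts

def local_join(parts_a, parts_b, key):
    """The pushed-down plan: join each node's partitions independently.
    Correct only because both tables are partitioned on the join key."""
    joined = []
    for pa, pb in zip(parts_a, parts_b):
        for ra in pa:
            for rb in pb:
                if ra[key] == rb[key]:
                    joined.append({**ra, **rb})
    return joined

# Hypothetical person and visit tables, co-partitioned on "pid"
people = [{"pid": i, "age": 20 + i} for i in range(6)]
visits = [{"pid": i, "location": i % 2} for i in range(6)]
result = local_join(hash_partition(people, "pid", 3),
                    hash_partition(visits, "pid", 3), "pid")
```

Because no matching pair of rows can ever sit on different nodes, the coordinator only needs to concatenate the per-node results, which is why forcing this plan past a conservative shippability check is safe for such queries.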
- Sensitivity of Household Transmission to Household Contact Structure and Size
Marathe, Achla; Lewis, Bryan L.; Chen, Jiangzhuo; Eubank, Stephen (PLOS, 2011-08-01)
Objective: Study the influence of household contact structure on the spread of an influenza-like illness, and examine whether changes to in-home caregiving arrangements can significantly affect household transmission counts.
Method: We simulate two different behaviors for the symptomatic person: either s/he remains at home in contact with everyone else in the household, or s/he remains at home in contact with only the primary caregiver. The two cases are referred to as full mixing and single caregiver, respectively.
Results: The results show that the household's cumulative transmission count is lower in the single-caregiver configuration than in the full-mixing case. Household transmissions vary almost linearly with household size in both cases. However, the difference in household transmissions due to the difference in household structure grows with household size, especially in the case of moderate flu.
Conclusions: These results suggest that details about human behavior and household structure do matter in epidemiological models. The policy of home isolation of the sick has a significant effect on the household transmission count depending upon the household size.
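The contrast between the two configurations can be sketched with a back-of-the-envelope expectation for first-generation in-home transmissions. The function and the per-contact transmission probability are hypothetical; the study itself uses detailed stochastic simulations rather than this closed form.

```python
def expected_household_transmissions(size, p_transmit, single_caregiver):
    """Expected first-generation in-home transmissions from one sick member:
    full mixing exposes all (size - 1) others, a single caregiver exposes one."""
    contacts = 1 if single_caregiver else size - 1
    return contacts * p_transmit

# The gap between the two configurations widens with household size,
# consistent with the near-linear growth reported in the Results
for size in (2, 4, 6):
    full = expected_household_transmissions(size, 0.25, False)
    single = expected_household_transmissions(size, 0.25, True)
    assert single <= full
```

Even this crude expectation reproduces the qualitative findings: transmissions grow linearly with household size under full mixing, while the single-caregiver count stays flat, so the benefit of caregiver isolation grows with household size.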
- Spatio-temporal Event Detection and Forecasting in Social Media
Zhao, Liang (Virginia Tech, 2016-08-01)
Nowadays, knowledge discovery on social media is attracting growing interest. Social media has become more than a communication tool, effectively functioning as a social sensor for our society. This dissertation focuses on the development of methods for social media-based spatiotemporal event detection and forecasting for a variety of event topics and assumptions. Five methods are proposed, namely dynamic query expansion for event detection, a generative framework for event forecasting, multi-task learning for spatiotemporal event forecasting, multi-source spatiotemporal event forecasting, and deep learning based epidemic modeling for forecasting influenza outbreaks. For the first of these methods, existing solutions for spatiotemporal event detection are mostly supervised and lack the flexibility to handle the dynamic keywords used in social media. The contributions of this work are: (1) Develop an unsupervised framework; (2) Design a novel dynamic query expansion method; and (3) Propose an innovative local modularity spatial scan algorithm. For the second of these methods, traditional solutions are unable to capture the spatiotemporal context, model mixed-type observations, or utilize prior geographical knowledge. The contributions of this work include: (1) Propose a novel generative model for spatial event forecasting; (2) Design an effective algorithm for model parameter inference; and (3) Develop a new sequence likelihood calculation method. For the third method, traditional solutions cannot deal with spatial heterogeneity or handle the dynamics of social media data effectively. This work's contributions include: (1) Formulate a multi-task learning framework for event forecasting; (2) Simultaneously model static and dynamic terms; and (3) Develop efficient parameter optimization algorithms.
For the fourth method, traditional multi-source solutions typically fail to consider the geographical hierarchy or cope with incomplete data blocks among different sources. The contributions here are: (1) Design a framework for event forecasting based on hierarchical multi-source indicators; (2) Propose a robust model for geo-hierarchical feature selection; and (3) Develop an efficient algorithm for model parameter optimization. For the last method, existing work on epidemic modeling either cannot ensure timeliness, or cannot characterize the underlying epidemic propagation mechanisms. The contributions of this work include: (1) Propose a novel integrated framework for computational epidemiology and social media mining; (2) Develop a semi-supervised multilayer perceptron for mining epidemic features; and (3) Design an online training algorithm.
- Using data-driven agent-based models for forecasting emerging infectious diseases
Venkatramanan, Srinivasan; Lewis, Bryan L.; Chen, Jiangzhuo; Higdon, Dave; Vullikanti, Anil Kumar S.; Marathe, Madhav V. (Elsevier, 2017-02-22)
Producing timely, well-informed and reliable forecasts for an ongoing epidemic of an emerging infectious disease is a huge challenge. Epidemiologists and policy makers have to deal with poor data quality, limited understanding of the disease dynamics, a rapidly changing social environment and uncertainty about the effects of various interventions in place. Under this setting, detailed computational models provide a comprehensive framework for integrating diverse data sources into a well-defined model of disease dynamics and social behavior, potentially leading to better understanding and actions. In this paper, we describe one such agent-based model framework developed for forecasting the 2014–2015 Ebola epidemic in Liberia, and subsequently used during the Ebola forecasting challenge. We describe the various components of the model and the calibration process, and summarize the forecast performance across scenarios of the challenge. We conclude by highlighting how such a data-driven approach can be refined and adapted for future epidemics, and share the lessons learned over the course of the challenge.