Data Integration Methodologies and Services for Evaluation and Forecasting of Epidemics

TR Number
Journal Title
Journal ISSN
Volume Title
Virginia Tech

Most epidemiological systems described in the literature are built for evaluation and analysis of specific diseases, such as Influenza-like-illness. The modeling environments that support these systems are implemented for specific diseases and epidemiological models. Hence they are not reusable or extendable.

This thesis focuses on the design and development of an integrated analytical environment with flexible data integration methodologies and multi-level web services for evaluation and forecasting of various epidemics in different regions of the world. The environment supports analysis of epidemics based on any combination of disease, surveillance sources, epidemiological models, geographic regions and demographic factors. The environment also supports evaluation and forecasting of epidemics when various policy-level and behavioral interventions are applied, that may inhibit the spread of an epidemic.

First, we describe data integration methodologies and schema design, for flexible experiment design, storage and query retrieval mechanisms related to large scale epidemic data. We describe novel techniques for data transformation, optimization, pre-computation and automation that enable flexibility, extendibility and efficiency required in different categories of query processing. Second, we describe the design and engineering of adaptable middleware platforms based on service-oriented paradigms for interactive workflow, communication, and decoupled integration. This supports large-scale multi-user applications with provision for online analysis of interventions as well as analytical processing of forecast computations. Using a service-oriented architecture, we have provided a platform-as-a-service representation for evaluation and forecasting of epidemics.

We demonstrate the applicability of our integrated environment through development of the applications, DISIMS and EpiCaster. DISIMS is an interactive web-based system for evaluating the effects of dynamic intervention strategies on epidemic propagation. EpiCaster is a situation assessment and forecasting tool for projecting the state of evolving epidemics such as flu and Ebola in different regions of the world. We discuss how our platform uses existing technologies to solve a novel problem in epidemiology, and provides a unique solution on which different applications can be built for analyzing epidemic containment strategies.

Computational Epidemiology, Data Integration, Big Data Analytics, Epidemic Forecasting, Web Services, Service Oriented Architecture, Predictive Analytics