Epidemiology Experimentation and Simulation Management through Scientific Digital Libraries

Virginia Tech


Advances in scientific data management, discovery, dissemination, and sharing are changing how scientific studies are conducted and repurposed. Data-intensive scientific practices increasingly require data management services that existing digital libraries do not provide. The issue is complicated by the diversity of functional requirements and content across scientific domains, and by scientists' lack of expertise in information and library sciences.

Researchers who use simulation and experimentation systems need digital libraries to maintain datasets, input configurations, results, analyses, and related documents. A digital library may be integrated with simulation infrastructures to provide automated support for research components, e.g., simulation interfaces to models, data warehouses, simulation applications, computational resources, and storage systems. Managing and provisioning simulation content streamlines experimentation, collaboration, discovery, and content reuse within a simulation community. Formal definitions of this class of digital libraries provide a foundation for producing a software toolkit and for the semi-automated generation of digital library instances.

We present a generic, component-based SIMulation-supporting Digital Library (SimDL) framework. The framework is formally described and provides a deployable set of domain-free services, schema-based domain knowledge representations, and extensible lower- and higher-level service abstractions. Services in SimDL are specialized for semi-structured simulation content and large-scale, data-producing infrastructures, as exemplified by the data storage, indexing, and retrieval service implementations. Contributions to the scientific community include previously unavailable simulation-specific services, e.g., incentivizing public contributions, semi-automated content curation, and memoizing simulation-generated data products. The practicality of SimDL is demonstrated through several case studies in computational epidemiology and network science, as well as through performance evaluations.
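To illustrate what memoizing simulation-generated data products could look like, the following is a minimal sketch and not SimDL's actual implementation. It assumes that a simulation is deterministic given its input configuration (or that any random seed is part of that configuration), so a stored result can be returned whenever an identical configuration is requested again. The names `SimulationCache`, `config_key`, `get_or_run`, and `toy_epidemic` are hypothetical and do not come from SimDL.

```python
import hashlib
import json


class SimulationCache:
    """Hypothetical in-memory memoization layer for simulation results."""

    def __init__(self, run_simulation):
        self._run = run_simulation   # callable: config dict -> result
        self._results = {}           # key -> previously computed result
        self.hits = 0
        self.misses = 0

    @staticmethod
    def config_key(config):
        # Serialize with sorted keys so that two configurations with the
        # same content map to the same key regardless of key order.
        canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get_or_run(self, config):
        key = self.config_key(config)
        if key in self._results:
            self.hits += 1           # reuse the stored data product
        else:
            self.misses += 1         # first request: run and store
            self._results[key] = self._run(config)
        return self._results[key]


def toy_epidemic(config):
    """Placeholder deterministic model, not a real epidemiological simulation."""
    infected = config["initial_infected"]
    for _ in range(config["days"]):
        infected = min(config["population"], infected * config["growth"])
    return {"final_infected": infected}


cache = SimulationCache(toy_epidemic)
first = cache.get_or_run(
    {"population": 1000, "initial_infected": 1, "growth": 2, "days": 5}
)
# Same configuration with a different key order: served from the cache.
second = cache.get_or_run(
    {"days": 5, "growth": 2, "initial_infected": 1, "population": 1000}
)
print(first, cache.hits, cache.misses)
```

A production service would presumably persist results in a database or file store and would also need to include the model version in the key; both are omitted here for brevity.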



Keywords: Computational Epidemiology, Digital Libraries, Simulation and Experimentation Management, 5S Formalisms