Show simple item record

dc.contributor.authorKaw, Rushien_US
dc.date.accessioned2014-08-31T08:00:23Z
dc.date.available2014-08-31T08:00:23Z
dc.date.issued2014-08-30en_US
dc.identifier.othervt_gsexam:3630en_US
dc.identifier.urihttp://hdl.handle.net/10919/50434
dc.description.abstractScalability is an important problem in epidemiological applications that simulate complex intervention scenarios over large datasets. Indemics is one such interactive data intensive framework for High-performance computing(HPC) based large-scale epidemic simulations. In the Indemics framework, interventions are supplied from an external, standalone database which proved to be an effective way of implementing interventions. Although this setup performs well for simple interventions and small datasets, performance and scalability of complex interventions and large datasets remain an issue. In this thesis, we present IndemicsXC, a scalable and massively parallel high-performance data engine for Indemics in a supercomputing environment. IndemicsXC has the ability to implement complex interventions over large datasets. Our distributed database solution retains the simplicity of Indemics by using the same SQL query interface for expressing interventions. We show that our solution implements the most complex interventions by intelligently offloading them to the supercomputer nodes and processing them in parallel. We present an extensive performance evaluation of our database engine with the help of various intervention case studies over synthetic population datasets. The evaluation of our parallel and distributed database framework illustrates its scalability over standalone database. Our results show that the distributed data engine is efficient as it is parallel, scalable and cost-efficient means of implementing interventions. The proposed cost-model in this thesis could be used to approximate intervention query execution time with decent accuracy. The usefulness of our distributed database framework could be leveraged for fast, accurate and sensible decisions by the public health officials during an outbreak. Finally, we discuss the considerations for using distributed databases for driving large-scale simulations.en_US
dc.format.mediumETDen_US
dc.publisherVirginia Techen_US
dc.rightsThis Item is protected by copyright and/or related rights. Some uses of this Item may be deemed fair and permitted by law even without permission from the rights holder(s), or the rights holder(s) may have licensed the work for use under certain conditions. For other uses you need to obtain permission from the rights holder(s).en_US
dc.subjectepidemic simulationen_US
dc.subjectdistributed systemen_US
dc.subjectdatabase systemen_US
dc.titleModeling and Computation of Complex Interventions in Large-scale Epidemiological Simulations using SQL and Distributed Databaseen_US
dc.typeThesisen_US
dc.contributor.departmentComputer Scienceen_US
dc.description.degreeMaster of Scienceen_US
thesis.degree.nameMaster of Scienceen_US
thesis.degree.levelmastersen_US
thesis.degree.grantorVirginia Polytechnic Institute and State Universityen_US
thesis.degree.disciplineComputer Science and Applicationsen_US
dc.contributor.committeechairMarathe, Madhav Vishnuen_US
dc.contributor.committeememberGupta, Sandeepen_US
dc.contributor.committeememberPrakash, Bodicherla Adityaen_US


Files in this item

Thumbnail
Thumbnail

This item appears in the following Collection(s)

Show simple item record