WiSDM: a platform for crowd-sourced data acquisition, analytics, and synthetic data generation

dc.contributor.author: Choudhury, Ananya
dc.contributor.committeechair: Marathe, Madhav Vishnu
dc.contributor.committeemember: Bisset, Keith R.
dc.contributor.committeemember: Gupta, Sandeep
dc.contributor.department: Computer Science
dc.date.accessioned: 2016-08-16T08:00:13Z
dc.date.available: 2016-08-16T08:00:13Z
dc.date.issued: 2016-08-15
dc.description.abstract: Human behavior is a key factor influencing the spread of infectious diseases. Individuals adapt their daily routines and typical behavior during the course of an epidemic; the adaptation is based on their perception of the risk of contracting the disease and of its impact. It is therefore desirable to collect behavioral data before and during a disease outbreak. Such data can help in creating better computer models that can, in turn, be used by epidemiologists and policy makers to plan for and respond to infectious disease outbreaks. However, traditional data collection methods are not well suited to acquiring information about human behavior, especially as it pertains to epidemic planning and response. Internet-based methods are an attractive complementary mechanism for collecting behavioral information. Systems such as Amazon Mechanical Turk (MTurk) and online survey tools provide simple ways to collect such information. This thesis explores new methods for information acquisition, especially of behavioral information, that leverage this recent technology. We present the design and implementation of WiSDM, a crowd-sourced surveillance data acquisition system. WiSDM is a web-based application that can be used by anyone with access to the Internet and a browser. It is designed to leverage online survey tools and MTurk, and it can be embedded within MTurk in an iFrame. WiSDM has a number of novel features, including (i) support for a model-based abductive reasoning loop: a flexible and adaptive information acquisition scheme driven by causal models of epidemic processes; (ii) question routing: an important feature for increasing data acquisition efficacy and reducing survey fatigue; and (iii) integrated surveys: interactive surveys that provide additional information on the survey topic and improve user motivation. We evaluate the framework's performance using Apache JMeter and present our results.
We also discuss three other extensions of WiSDM: the API Adapter, the Synthetic Data Generator, and WiSDM Analytics. The API Adapter is an ETL extension of WiSDM that extracts data from disparate data sources and loads it into the WiSDM database. The Synthetic Data Generator allows epidemiologists to build synthetic survey data using NDSSL's Synthetic Population as agents. WiSDM Analytics lets users analyze the data by writing simple Python code using Versa APIs. We also propose a data model that is conducive to survey data analysis.
dc.description.degree: Master of Science
dc.format.medium: ETD
dc.identifier.other: vt_gsexam:8256
dc.identifier.uri: http://hdl.handle.net/10919/72256
dc.publisher: Virginia Tech
dc.rights: In Copyright
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Human Computation
dc.subject: Crowd Sourcing
dc.subject: Synthetic Dataset
dc.subject: Data Analytics
dc.subject: Data Visualization
dc.subject: Epidemiology
dc.subject: WiSDM
dc.title: WiSDM: a platform for crowd-sourced data acquisition, analytics, and synthetic data generation
dc.type: Thesis
thesis.degree.discipline: Computer Science and Applications
thesis.degree.grantor: Virginia Polytechnic Institute and State University
thesis.degree.level: masters
thesis.degree.name: Master of Science

Files

Original bundle
Name: Choudhury_A_T_2016.pdf
Size: 6.46 MB
Format: Adobe Portable Document Format