Solving Mysteries with Crowds: Supporting Crowdsourced Sensemaking with a Modularized Pipeline and Context Slices

Li, Tianyi

Solving Mysteries with Crowds: Supporting Crowdsourced Sensemaking with a Modularized Pipeline and Context Slices

dc.contributor.author	Li, Tianyi	en
dc.contributor.committeechair	North, Christopher L.	en
dc.contributor.committeechair	Luther, Kurt	en
dc.contributor.committeemember	Convertino, Gregorio	en
dc.contributor.committeemember	Kavanaugh, Andrea L.	en
dc.contributor.committeemember	Wang, Gang Alan	en
dc.contributor.department	Computer Science	en
dc.date.accessioned	2020-09-10T06:00:20Z	en
dc.date.available	2020-09-10T06:00:20Z	en
dc.date.issued	2020-07-28	en
dc.description.abstract	The increasing volume and complexity of text data are challenging the cognitive capabilities of expert analysts. Machine learning and crowdsourcing present new opportunities for large-scale sensemaking, but it remains a challenge to model the overall process so that many distributed agents can contribute to suitable components asynchronously and meaningfully. In this work, I explore how to crowdsource sensemaking for intelligence analysis. Specifically, I focus on the complex processes that include developing hypotheses and theories from a raw dataset and iteratively refining the analysis. I first developed Connect the Dots, a web application that implements the concept of "context slices" and supports novice crowds in building relationship networks for exploratory analysis. Then I developed CrowdIA, a software platform that implements the entire crowd sensemaking pipeline and the context slicing for each step, to enable unsupervised crowd sensemaking. Using the pipeline as a testbed, I probed the errors and bottlenecks in crowdsourced sensemaking,and suggested design recommendations for integrated crowdsourcing systems. Building on these insights and to support iterative crowd sensemaking, I developed the concept of "crowd auditing" in which an auditor examines a pipeline of crowd analyses and diagnoses the problems to steer future refinement. I explored the design space to support crowd auditing and developed CrowdTrace, a crowd auditing tool that enables novice auditors to effectively identify the important problems with the crowd analysis and create microtasks for crowd workers to fix the problems.The core contributions of this work include a pipeline that enables distributed crowd collaboration to holistic sensemaking processes, two novel concepts of "context slices" and "crowd auditing", web applications that support crowd sensemaking and auditing, as well as design implications for crowd sensemaking systems. The hope is that the crowd sensemaking pipeline can serve to accelerate research on sensemaking, and contribute to helping people conduct in-depth investigations of large collections of information.	en
dc.description.abstractgeneral	In today's world, we have access to large amounts of data that provide opportunities to solve problems at unprecedented depths and scales. While machine learning offers powerful capabilities to support data analysis, to extract meaning from raw data is cognitively demanding and requires significant person-power. Crowdsourcing aggregates human intelligence, yet it remains a challenge for many distributed agents to collaborate asynchronously and meaningfully. The contribution of this work is to explore how to use crowdsourcing to make sense of the copious and complex data. I first implemented the concept of ``context slices'', which split up complex sensemaking tasks by context, to support meaningful division of work. I developed a web application, Connect the Dots, which generates relationship networks from text documents with crowdsourcing and context slices. Then I developed a crowd sensemaking pipeline based on the expert sensemaking process. I implemented the pipeline as a web platform, CrowdIA, which guides crowds to solve mysteries without expert intervention. Using the pipeline as a testbed, I probed the errors and bottlenecks in crowd sensemaking and provided design recommendations for crowd intelligence systems. Finally, I introduced the concept of ``crowd auditing'', in which an auditor examines a pipeline of crowd analyses and diagnoses the problems to steer a top-down path of the pipeline and refine the crowd analysis. The hope is that the crowd sensemaking pipeline can serve to accelerate research on sensemaking, and contribute to helping people conduct in-depth investigations of large collections of data.	en
dc.description.degree	Doctor of Philosophy	en
dc.format.medium	ETD	en
dc.identifier.other	vt_gsexam:26959	en
dc.identifier.uri	http://hdl.handle.net/10919/99937	en
dc.publisher	Virginia Tech	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.subject	Sensemaking	en
dc.subject	Text Analytics	en
dc.subject	Intelligence Analysis	en
dc.subject	Mystery Solving	en
dc.subject	Investigation	en
dc.subject	Crowdsourcing	en
dc.title	Solving Mysteries with Crowds: Supporting Crowdsourced Sensemaking with a Modularized Pipeline and Context Slices	en
dc.type	Dissertation	en
thesis.degree.discipline	Computer Science and Applications	en
thesis.degree.grantor	Virginia Polytechnic Institute and State University	en
thesis.degree.level	doctoral	en
thesis.degree.name	Doctor of Philosophy	en

Files

Original bundle

Now showing 1 - 1 of 1

Name:: Li_T_D_2020.pdf
Size:: 12.8 MB
Format:: Adobe Portable Document Format

Download

Collections

Doctoral Dissertations