Solving Mysteries with Crowds: Supporting Crowdsourced Sensemaking with a Modularized Pipeline and Context Slices
The increasing volume and complexity of text data are challenging the cognitive capabilities of expert analysts. Machine learning and crowdsourcing present new opportunities for large-scale sensemaking, but it remains a challenge to model the overall process so that many distributed agents can contribute to suitable components asynchronously and meaningfully. In this work, I explore how to crowdsource sensemaking for intelligence analysis. Specifically, I focus on the complex processes that include developing hypotheses and theories from a raw dataset and iteratively refining the analysis. I first developed Connect the Dots, a web application that implements the concept of "context slices" and supports novice crowds in building relationship networks for exploratory analysis. Then I developed CrowdIA, a software platform that implements the entire crowd sensemaking pipeline, with context slicing for each step, to enable unsupervised crowd sensemaking. Using the pipeline as a testbed, I probed the errors and bottlenecks in crowdsourced sensemaking, and suggested design recommendations for integrated crowdsourcing systems. Building on these insights, and to support iterative crowd sensemaking, I developed the concept of "crowd auditing", in which an auditor examines a pipeline of crowd analyses and diagnoses the problems to steer future refinement. I explored the design space to support crowd auditing and developed CrowdTrace, a crowd auditing tool that enables novice auditors to effectively identify the important problems with the crowd analysis and create microtasks for crowd workers to fix the problems. The core contributions of this work include a pipeline that enables distributed crowd collaboration on holistic sensemaking processes, two novel concepts ("context slices" and "crowd auditing"), web applications that support crowd sensemaking and auditing, and design implications for crowd sensemaking systems.
The hope is that the crowd sensemaking pipeline can accelerate research on sensemaking and help people conduct in-depth investigations of large collections of information.