Read-Agree-Predict: A Crowdsourced Approach to Discovering Relevant Primary Sources for Historians
Historians spend significant time looking for relevant, high-quality primary sources in digitized archives and through web searches. One reason this task is time-consuming is that historians’ research interests are often highly abstract and specialized. These topics are unlikely to be manually indexed and are difficult to identify with automated text analysis techniques. In this article, we investigate the potential of a new crowdsourcing model in which the historian delegates to a novice crowd the task of labeling the relevance of primary sources with respect to her unique research interests. The model employs a novel crowd workflow, Read-Agree-Predict (RAP), that allows novice crowd workers to label relevance as well as expert historians. As a useful byproduct, RAP also reveals and prioritizes crowd confusions as targeted learning opportunities. We demonstrate the value of our model with two experiments with paid crowd workers (n=170), with the future goal of extending our work to classroom students and public history interventions. We also discuss broader implications for historical research and education.