GraphCrowd: Harnessing the Crowd to Lay Out Graphs with Applications to Cellular Signaling Pathways
Abstract
Automated analysis of networks of interactions between proteins has become pervasive in molecular biology. Each node in such a network represents a protein and each edge an interaction between two proteins. Nearly every publication that uses network analysis includes a visualization of a graph in which the nodes and edges are laid out in two dimensions. Several systems implement multiple types of graph layout algorithms and make them easily accessible to scientists. Despite the existence of these systems, interdisciplinary research teams in computational biology face several challenges in sharing computed networks and interpreting them.
This thesis presents two systems, GraphSpace and GraphCrowd, that together enhance network-based collaboration. GraphSpace users can automatically and rapidly share richly annotated networks, irrespective of the algorithms or software used to generate them. A user may search for networks that contain a specific node or edge, or a collection of nodes and edges. Users can manually modify a layout, save it, and share it with other users. Users can create private groups, invite other users to join groups, and share networks with group members. Upon publication, researchers may make networks public and provide a URL in the paper.
GraphCrowd addresses the challenge posed by automated layout algorithms, which incorporate almost no knowledge of the biological information underlying the networks. These algorithms compel researchers to use their knowledge and intuition to modify the node and edge positions manually to bring out salient features. GraphCrowd focuses on signaling networks, which connect proteins that represent a cell's response to external signals. Treating network layout as a design problem, GraphCrowd explores the feasibility of leveraging human computation via crowdsourcing to create simplified and meaningful visualizations. GraphCrowd provides a streamlined interface that enables crowd workers to easily manipulate networks to create layouts that follow a specific set of guidelines. GraphCrowd also implements an interface to allow a user (e.g., an expert or a crowd worker) to evaluate how well a layout conforms to the guidelines.
We use GraphCrowd to address two research questions: (i) Can we harness the power of crowdsourcing to create simplified, biologically meaningful visualizations of signaling networks? (ii) Can crowd workers rate layouts similarly to how an expert with domain knowledge would rate them? We design two systematic experiments that enable us to answer both questions in the affirmative. This thesis establishes crowdsourcing as a powerful methodology for laying out complex signaling networks. Moreover, by developing appropriate domain-specific guidelines for crowd workers, GraphCrowd can be generalized to a variety of applications.