GraphCrowd: Harnessing the Crowd to Lay Out Graphs with Applications to Cellular Signaling Pathways
Abstract
Automated analysis of networks of interactions between proteins has become pervasive in
molecular biology. Each node in such a network represents a protein and each edge an
interaction between two proteins. Nearly every publication that uses network analysis includes
a visualization of a graph in which the nodes and edges are laid out in two dimensions.
Several systems implement multiple types of graph layout algorithms and make them easily
accessible to scientists. Despite the existence of these systems, interdisciplinary research
teams in computational biology face several challenges in sharing computed networks and
interpreting them.
This thesis presents two systems, GraphSpace and GraphCrowd, that together enhance
network-based collaboration. GraphSpace users can automatically and rapidly share richly
annotated networks, irrespective of the algorithms or software used to generate them. A user
may search for networks that contain a specific node or edge, or a collection of nodes and
edges. Users can manually modify a layout, save it, and share it with other users. Users
can create private groups, invite other users to join groups, and share networks with group
members. Upon publication, researchers may make networks public and provide a URL in
the paper.
GraphCrowd addresses the challenge posed by automated layout algorithms, which
incorporate almost no knowledge of the biological information underlying the networks. These
algorithms compel researchers to use their knowledge and intuition to modify the node and
edge positions manually to bring out salient features. GraphCrowd focuses on signaling
networks, which connect proteins that represent a cell's response to external signals. Treating
network layout as a design problem, GraphCrowd explores the feasibility of leveraging
human computation via crowdsourcing to create simplified and meaningful visualizations.
GraphCrowd provides a streamlined interface that enables crowd workers to easily
manipulate networks to create layouts that follow a specific set of guidelines. GraphCrowd also
implements an interface to allow a user (e.g., an expert or a crowd worker) to evaluate how
well a layout conforms to the guidelines.
We use GraphCrowd to address two research questions: (i) Can we harness the power of
crowdsourcing to create simplified, biologically meaningful visualizations of signaling
networks? (ii) Can crowd workers rate layouts similarly to how an expert with domain knowledge
would rate them? We design two systematic experiments that enable us to answer both
questions in the affirmative. This thesis establishes crowdsourcing as a powerful methodology for
laying out complex signaling networks. Moreover, by developing appropriate domain-specific
guidelines for crowd workers, GraphCrowd can be generalized to a variety of applications.