Large Scale Network Visualization with Gephi

Date

2012-12-11

Abstract

Graphs, or networks, are pervasive because they can model many kinds of data sources. Social, biological, and other networks capture the underlying structural and relational properties of a system, and analyzing them reveals interesting information about the corresponding domain. Network analysts therefore apply different algorithms to a network and try to connect the resulting insights into a unified theme, pattern, or structure. For example, analysis of a person’s Facebook friend network can reveal groups of highly clustered people, the most influential person in terms of connections, and the people who connect different clusters. While analyzing networks and digesting the information in them, analysts gradually form internal mental models of the people, places, events, or other entities the networks represent. As the number of nodes grows, however, it becomes increasingly difficult for an investigator to track the connections in the data and make sense of it all. Many researchers believe that visual representations of data can help people better examine, analyze, and understand it. Norman [Norman94] has described how visual representations can help augment people’s thinking and analysis processes.

The objective of the project is to develop visual representations of the nodes, edges, and labels of a network in order to help analysts search, review, and understand the network better. We seek to create interactive visualizations that highlight the significance of nodes, cluster formation, and similar properties in networks whose entities may be, for example, people, places, webpages, biological entities, dates, and organizations. In short, we want to build visual representations that help analysts make sense of networks by applying different algorithms to them and observing how the results differ across nodes and edges in terms of color and size.

A very important aspect of the project is the integration of the visualization module with CINET [CINET2012], a cyberinfrastructure for network science. CINET includes a set of graph algorithms and various types of networks. Networks are analyzed by applying algorithms to them; results are returned as text files containing different measures of nodes or edges. CINET is intended to support complex workflows in which the output of one analysis can be used as input to further analysis. Visualization is a great aid when an analyst wants to focus on particular nodes or a portion of the graph and conduct subsequent analysis on that smaller part. Although some visualization tools exist, e.g., Jigsaw [Jigsaw08], Sentinel Visualizer, and NetLens, they focus more on information representation than on graph exploration or summarization. To the best of our knowledge, our project is the only one that supports network visualization as part of a complex workflow within network analysis in a high performance computing environment. In summary, this project develops a visualization component for a VT digital library (DL) containing large network graphs (e.g., social networks and transportation networks). The visualization service will get datasets from an existing DL, visualize the graphs using Gephi (a Java-based visualization library), and integrate the results within an NSF-supported cyberinfrastructure (CINET).
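The workflow described above (a text file of node measures, a filtered portion of the graph, and a mapping of measures to color and size) can be illustrated with a minimal Python sketch. It is not the project's implementation and does not call CINET or Gephi. The "node_id value" file layout, the function names, the sample scores, the size range, the blue-to-red color ramp, and the "Source,Target" CSV header are all assumptions made for illustration.

```python
import csv
import io


def parse_measures(text):
    """Parse lines of the assumed form 'node_id value' into a dict."""
    measures = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        node, value = line.split()
        measures[node] = float(value)
    return measures


def top_k_subgraph(edges, measures, k):
    """Keep the k highest-scoring nodes and the edges among them."""
    ranked = sorted(measures, key=measures.get, reverse=True)
    keep = set(ranked[:k])
    kept_edges = [(u, v) for u, v in edges if u in keep and v in keep]
    return keep, kept_edges


def visual_attributes(measures, nodes, min_size=5.0, max_size=50.0):
    """Map each node's measure linearly to a size and a blue-to-red color."""
    low = min(measures[n] for n in nodes)
    high = max(measures[n] for n in nodes)
    span = (high - low) or 1.0  # avoid division by zero when all scores match
    attrs = {}
    for n in nodes:
        t = (measures[n] - low) / span
        red = round(255 * t)
        attrs[n] = {
            "size": min_size + t * (max_size - min_size),
            "color": "#%02x00%02x" % (red, 255 - red),
        }
    return attrs


def edges_to_csv(edges):
    """Write an edge list as CSV text with an assumed Source,Target header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)
    return buffer.getvalue()


# Hypothetical analysis output and edge list, for illustration only.
MEASURE_FILE = "# node score\na 0.9\nb 0.7\nc 0.4\nd 0.2\ne 0.1\n"
EDGES = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "e")]

measures = parse_measures(MEASURE_FILE)
nodes, sub_edges = top_k_subgraph(EDGES, measures, k=3)
attrs = visual_attributes(measures, nodes)
edge_csv = edges_to_csv(sub_edges)
```

Here the filtered subgraph keeps nodes a, b, and c; the highest-scoring node gets the largest size and a red color, the lowest-scoring kept node the smallest size and a blue color. The resulting edge list could serve as input to a further analysis step or be loaded into a visualization tool.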

Description

ProjCINETViz was carried out by Maksudul Alam, S M Arifuzzaman, and Md. Hasanuzzaman Bhuiyan. The authors are grateful to Prof. Ed Fox for his generous suggestions and guidance in improving the project and materials.

Keywords

network, visualization, CINET, Network Science, Large graphs
