Twitter Metadata

Abstract

A number of projects and research efforts work with collections of tweets, and collections of tweets related to world events are of particular interest. Many organizations maintain their own tweet collections for specific events; however, there is currently no effective support for collaboration among them. Metadata standards foster collaboration by letting groups adhere to a unified format so that their collections can interoperate. In part one of the Twitter Metadata project, I define a tweet-level metadata standard that leverages the Twitter API format, as well as a collection-level metadata standard that combines Dublin Core and PROV-O. By combining these two existing standards into an RDF-based specification, the proposed standard captures both the descriptive metadata and the provenance of the collections. In part two of the project, I create a tool called TweetID to further foster collaboration around tweet collections. TweetID is a web application that allows its users to upload tweet collections; it extracts the underlying tweet-level and collection-level metadata and provides an interface to it. TweetID can also merge multiple collections, allowing researchers to compare their collections to others’ and potentially to augment their event collections for higher recall.
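The merge and compare operations described above can be illustrated with a minimal sketch. It is not TweetID's actual code: it assumes each tweet is a dictionary in the Twitter API format carrying the `id_str` field, and the function names `merge_collections` and `overlap` are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def merge_collections(*collections: Iterable[dict]) -> List[dict]:
    """Return the union of several tweet collections, de-duplicated by tweet ID.

    Assumption: every tweet dictionary has an `id_str` key, as in the
    Twitter API tweet format. The first copy of a duplicated tweet is kept.
    """
    merged: Dict[str, dict] = {}
    for collection in collections:
        for tweet in collection:
            merged.setdefault(tweet["id_str"], tweet)
    return list(merged.values())


def overlap(a: Iterable[dict], b: Iterable[dict]) -> Set[str]:
    """Return the tweet IDs that appear in both collections."""
    return {t["id_str"] for t in a} & {t["id_str"] for t in b}


# Toy data for illustration only; real tweets carry many more fields.
collection_a = [{"id_str": "1", "text": "first"}, {"id_str": "2", "text": "second"}]
collection_b = [{"id_str": "2", "text": "second"}, {"id_str": "3", "text": "third"}]

merged = merge_collections(collection_a, collection_b)
shared = overlap(collection_a, collection_b)
```

Here `merged` holds each distinct tweet once, which is how merging can raise recall for an event, while `shared` shows how much two collections have in common.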

Description
The Twitter Metadata project was carried out in collaboration with the project client, Mohamed Magdy (mmagdy@vt.edu), and makes use of work done under NSF IIS-1319578: Integrated Digital Event Archiving and Library (IDEAL). Instructor involvement also relates to support through Qatar National Research Fund Project No. NPRP 4-029-1-007 and collaboration with QCRI. The file Twitter Metadata - TweetID Demo.mp4 is a demo video that shows the functionality of TweetID. The file Twitter Metadata - TweetID Code.zip contains all of the code that was written for TweetID.
Keywords
Twitter, Metadata Standard, Event Collections
Citation