A Discovery Portal for Twitter Collections
Abstract
This report documents the continuation of a project begun by previous students three years ago, in 2021. The Digital Library Research Laboratory (DLRL) at Virginia Tech has collected about six billion Tweets in three formats: Social Feed Manager (SFM), yourTwapperKeeper (YTK), and Digital Methods Initiative Twitter Capture and Analysis Toolset (DMI-TCAT). The overall goal of this project is to organize these Tweets into event collections and to consolidate the collection information, which is currently stored in three different schemas and databases, into one web app, making the data more accessible. In Fall 2021, the Library6BTweet team designed an individual Tweet schema and a collection-level Tweet schema. They also worked on converting Tweet data. In Spring 2022, the Twitter Collections team optimized the conversion scripts, converted Tweet data, and looked into implementing a machine learning model to categorize Tweets. In Spring 2024, the Twitter Database Discovery Portal team consolidated the collected data into a local MongoDB database and built a web app with minimal features that displays the collected data and allows the user to search and filter the collections. That team did not finish extracting the data from the SFM database. Our team’s goal is to build upon the past team’s contributions by finishing the extraction of data from the SFM database and adding new features to the web app.