Computational Linguistic Analysis of Earthquake Collections
CS4984 is a newly offered class at Virginia Tech with a unit-based, project/problem-based learning curriculum. This class style is based on NSF-funded work on curriculum for the field of digital libraries and related topics, ...
English Wikipedia on Hadoop Cluster
Developing and testing big data software requires a big dataset. The full English Wikipedia dataset would serve well for testing and benchmarking purposes. Loading this dataset onto a system, such as an ...
NRV Tweets and RSS feeds
The goal of this project was to associate existing data in the Virtual Town Square database from the New River Valley area with topical metadata. We took a database of approximately 360,000 tweets and 15,000 RSS news stories ...
Topic Analysis project in CS5604, Spring 2016: Extracting Topics from Tweets and Webpages for IDEAL
The IDEAL (Integrated Digital Event Archiving and Library) project aims to ingest tweets and web-based content from social media and the web and index them for retrieval. One of the required milestones for a graduate-level ...
Exploring the Blacksburg Community Events Collection
With the advent of new technology, especially the combination of smart phones and widespread Internet access, people are increasingly becoming absorbed in digital worlds – worlds that are not bounded by geography. As such, ...
The purpose of this project is to assist VTTI in converting a large citation file into a CSV file for ease of access. It required us to develop an application that can parse a text file of citations, and ...