Clustering and Topic Analysis in CS 5604 Information Retrieval Fall 2016
(Virginia Tech, 2016-12-08)
The IDEAL (Integrated Digital Event Archiving and Library) and Global Event and Trend Archive Research (GETAR) projects aim to build a robust Information Retrieval (IR) system by retrieving tweets and webpages from social ...
English Wikipedia on Hadoop Cluster
Developing and testing big data software requires a big dataset. The full English Wikipedia dataset would serve well for testing and benchmarking purposes. Loading this dataset onto a system, such as an ...
Topic Analysis project in CS5604, Spring 2016: Extracting Topics from Tweets and Webpages for IDEAL
The IDEAL (Integrated Digital Event Archiving and Library) project aims to ingest tweets and web-based content from social media and the web and index it for retrieval. One of the required milestones for a graduate-level ...