English Wikipedia on Hadoop Cluster
Developing and testing big data software requires a large dataset. The full English Wikipedia dataset would serve well for testing and benchmarking purposes. Loading this dataset onto a system, such as an ...
Computational Linguistic Analysis of Earthquake Collections
CS4984 is a newly offered class at Virginia Tech with a unit-based, project/problem-based learning curriculum. This class style is based on NSF-funded work on curriculum for the field of digital libraries and related topics, ...