Text Clustering Using LucidWorks and Apache Mahout

Chen, Liangzhe; Lin, Xiao; Wood, Andrew

dc.contributor	Virginia Tech. Digital Library Research Laboratory	en
dc.contributor	Virginia Tech. Department of Computer Science	en
dc.contributor.author	Chen, Liangzhe	en
dc.contributor.author	Lin, Xiao	en
dc.contributor.author	Wood, Andrew	en
dc.contributor.department	Digital Library Research Laboratory	en
dc.contributor.department	Computer Science	en
dc.contributor.editor	Fox, Edward A.	en
dc.contributor.editor	Chitturi, Kiran	en
dc.contributor.editor	Kanan, Tarek	en
dc.date.accessioned	2015-05-22T14:18:55Z	en
dc.date.available	2015-05-22T14:18:55Z	en
dc.date.issued	2012-11-17	en
dc.description.abstract	This module introduces algorithms and evaluation metrics for flat clustering. We focus on using LucidWorks big data analysis software and Apache Mahout, an open source machine learning library, to cluster document collections with the k-means algorithm.	en
dc.description.notes	CS 5604: Information Storage and Retrieval	en
dc.format.extent	12 pages	en
dc.format.mimetype	application/pdf	en
dc.identifier.uri	http://hdl.handle.net/10919/52539	en
dc.identifier.url	http://curric.dlib.vt.edu/modDev/lucidworks_modules/CS5604F2012Module-LucidWorks-Clustering.pdf	en
dc.language.iso	en_US	en
dc.relation.ispartofseries	Digital Library Curriculum Project	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.subject	Computer science	en
dc.subject	Digital libraries	en
dc.subject	Text clustering	en
dc.subject	Lucidworks	en
dc.subject	Apache Mahout	en
dc.title	Text Clustering Using LucidWorks and Apache Mahout	en
dc.type	Learning object	en
dc.type.dcmitype	Text	en

Files

Original bundle

Name: CS5604F2012Module-LucidWorks-Clustering.pdf
Size: 1.26 MB
Format: Adobe Portable Document Format

Collections

Learning Objects, Digital Library Research Laboratory