Text Clustering Using LucidWorks and Apache Mahout

Date

2012-11-17

Abstract

This module introduces algorithms and evaluation metrics for flat clustering. We focus on using LucidWorks big data analysis software and Apache Mahout, an open-source machine learning library, to cluster document collections with the k-means algorithm.

Keywords

Computer science, Digital libraries, Text clustering, LucidWorks, Apache Mahout
