Recent Developments in Document Clustering

Files
TR Number
TR-07-35
Date
2007-10-01
Journal Title
Journal ISSN
Volume Title
Publisher
Department of Computer Science, Virginia Polytechnic Institute & State University
Abstract

This report aims to give a brief overview of the current state of document clustering research and present recent developments in a well-organized manner. Clustering algorithms are considered with two hypothetical scenarios in mind: online query clustering with tight efficiency constraints, and offline clustering with an emphasis on accuracy. A comparative analysis of the algorithms is performed along with a table summarizing important properties, and open problems as well as directions for future research are discussed.

Description
Keywords
Algorithms, Data structures
Citation