Recent Developments in Document Clustering

TR Number

TR-07-35

Date

2007-10-01

Publisher

Department of Computer Science, Virginia Polytechnic Institute & State University

Abstract

This report gives a brief overview of the current state of document clustering research and presents recent developments in an organized manner. Clustering algorithms are considered with two hypothetical scenarios in mind: online query clustering with tight efficiency constraints, and offline clustering with an emphasis on accuracy. The algorithms are compared, with a table summarizing their important properties, and open problems and directions for future research are discussed.

Keywords

Algorithms, Data structures
