Indexing Large Collections of Small Text Records for Ranked Retrieval

France, Robert K.; Fox, Edward A.

Files

1993_Indexing_large_collections.pdf (569.37 KB)

Date

1993

Authors

France, Robert K.

Fox, Edward A.

Abstract

The MARIAN online public access catalog system at Virginia Tech has been developed to apply advanced information retrieval methods and object-oriented technology to the needs of library patrons. We describe our data model, design, processing, data representations, and retrieval operation. By identifying objects of interest during the indexing process, storing them according to our "information graph" model, and applying weighting schemes that seem appropriate for this large collection of small text records, we hope to better serve user needs. Since every text word is important in this domain, we employ opportunistic matching algorithms and a mix of data structures to support searching; these will give good performance for a large campus community, even though MARIAN runs on a distributed collection of small workstations. An initial small experiment indicates that our new ad hoc weighting scheme is more effective than a more standard approach.

Keywords

Indexing, Collection management, Ranked retrieval, Small text records

Citation

France, Robert K. and Edward A. Fox. "Indexing Large Collections of Small Text Records for Ranked Retrieval." Internal Report, Virginia Tech, 1993.

Persistent link

http://hdl.handle.net/10919/52849

Collections

Reports, Digital Library Research Laboratory
