The Academy: A Community of Information Retrieval Agents

Date

1994-09-06

Abstract

We commonly picture text as a sequence of words, or alternatively as a sequence of paragraphs, each of which is composed of a sequence of sentences, each of which is itself a sequence of words. It is also worth noting that text is not so much a sequence of words as a sequence of terms, including most commonly words, but also including names, numbers, code sequences, and a variety of other $#*&)&@^ tokens. Just as we commonly simplify text into a sequence of words, so too it is common in information retrieval to regard documents as single texts. Nothing is less common, though, than a document with only a single part, and that part unstructured text. Search and retrieval in a universe of structured, multi-part documents involves new questions: Where does a document begin and end? How can we decide how much to show to a user? When does a query need to be matched by a single node in a hypertext, and when may partial matches in several nodes count?

Keywords

Information retrieval

Citation

France, Robert. "The Academy: A Community of Information Retrieval Agents." Draft internal report (0.1), 1994.