An Artificial Intelligence Environment for Information Retrieval Research

Files

TR-88-10.pdf (9.29 MB)

TR Number

TR-88-10

Date

1988

Publisher

Department of Computer Science, Virginia Polytechnic Institute & State University

Abstract

The CODER (COmposite Document Expert/Extended/Effective Retrieval) project is a multi-year effort to investigate how best to apply artificial intelligence methods to increase the effectiveness of information retrieval systems. Particular attention is being given to analysis and representation of heterogeneous documents, such as electronic mail digests or messages, which vary widely in style, length, topic, and structure. In order to ensure system adaptability and to allow reconfiguration for controlled experimentation, the project has been designed as a moderated expert system. This thesis covers the design problems involved in providing a unified architecture and knowledge representation scheme for such a system, and the solutions chosen for CODER. An overall object-oriented environment is constructed using a set of message-passing primitives based on a modified Prolog call paradigm. Within this environment is embedded the skeleton of a flexible expert system, where task decomposition is performed in a knowledge-oriented fashion and where subtask managers are implemented as members of a community of experts. A three-level knowledge representation formalism of elementary data types, frames, and relations is provided, and can be used to construct knowledge structures such as terms, meaning structures, and document interpretations. The use of individually tailored specialist experts, coupled with standardized blackboard modules for communication and control and with external knowledge bases for maintenance of factual world knowledge, allows for quick prototyping, incremental development, and flexibility under change. The system as a whole is structured as a set of communicating modules, defined functionally and implemented under UNIX™ using sockets and the TCP/IP protocol for communication. Inferential modules are being coded in MU-Prolog; non-inferential modules are being prototyped in MU-Prolog and will be re-implemented as needed in C++.
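
As a rough illustration of the architecture the abstract describes, the following Python sketch models the three knowledge levels (elementary data types, frames, relations) and a blackboard shared by specialist experts. It is not CODER's code: the report states that the modules are written in MU-Prolog and communicate over TCP/IP sockets, whereas this sketch runs in a single process. Every name in it (`Frame`, `Relation`, `Blackboard`, `header_expert`, `topic_expert`, `run`) and the sample message data are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple, Union

# Level 1: elementary data types (assumed here to be plain strings and numbers).
Elementary = Union[str, int, float]


@dataclass
class Frame:
    """Level 2: a typed frame whose slots hold elementary values or other frames."""
    kind: str
    slots: Dict[str, Union[Elementary, "Frame"]] = field(default_factory=dict)


@dataclass(frozen=True)
class Relation:
    """Level 3: a named relation linking two objects by identifier."""
    name: str
    subject: str
    obj: str


class Blackboard:
    """Shared posting area: experts read earlier postings and add new ones."""

    def __init__(self) -> None:
        self.entries: List[Tuple[str, object]] = []

    def post(self, author: str, item: object) -> None:
        self.entries.append((author, item))

    def items_of(self, cls: type) -> List[object]:
        return [item for _, item in self.entries if isinstance(item, cls)]


Expert = Callable[[Blackboard], None]


def header_expert(board: Blackboard) -> None:
    """Hypothetical specialist: posts a frame describing one message."""
    board.post("header_expert",
               Frame("message", {"id": "msg-1", "subject": "retrieval"}))


def topic_expert(board: Blackboard) -> None:
    """Hypothetical specialist: relates each message frame to its subject term."""
    for frame in board.items_of(Frame):
        if frame.kind == "message":
            board.post("topic_expert",
                       Relation("about",
                                str(frame.slots["id"]),
                                str(frame.slots["subject"])))


def run(experts: List[Expert]) -> Blackboard:
    """Trivial control strategy (an assumption): run each expert once, in order."""
    board = Blackboard()
    for expert in experts:
        expert(board)
    return board


board = run([header_expert, topic_expert])
relations = board.items_of(Relation)
```

Because the experts interact only through the blackboard, one specialist can be swapped or reordered without changing the others, which is the kind of reconfiguration for controlled experimentation that the abstract emphasizes.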
