Building the CODER Lexicon: The Collins English Dictionary and Its Adverb Definitions

The CODER (COmposite Document Expert/extended/effective Retrieval) project is an investigation of the applicability of artificial intelligence techniques to the information retrieval task of analyzing, storing, and retrieving heterogeneous collections of "composite documents." In order to support some of the processing desired, and to allow experimentation in information retrieval and natural language processing, a lexicon was constructed from the machine-readable Collins Dictionary of the English Language. After background, motivation, and a survey of related work are given, the Collins lexicon is discussed. A description follows of the conversion process, the format of the resulting Prolog database, and characteristics of the dictionary and relations. To illustrate what is present and to explain how it relates to the files produced from Webster's Seventh New Collegiate Dictionary, a number of comparative charts are given. Finally, a grammar for adverb definitions is presented, together with a description of defining formulae that usually indicate the type of the adverb. Ultimately it is hoped that definitions for adverbs and other words will be parsed so that the relational lexicon being constructed will include many additional relationships and other knowledge about words and their usage.



CODER, Lexicons, Information retrieval, Data structures, Natural language interfaces, Relation systems, Language parsing and understanding, Text analysis


Fox, Edward A., Robert C. Wohlwend, Phyllis R. Sheldon, Qi-Fan Chen and Robert K. France. "Building the CODER Lexicon: The Collins English Dictionary and Its Adverb Definitions." Technical Report TR-86-23. Blacksburg, VA: Virginia Tech Department of Computer Science, October 1986.