Efficient data structures for information retrieval

dc.contributor.author: Daoud, Amjad M. [en]
dc.contributor.committeechair: Fox, Edward A. [en]
dc.contributor.committeemember: Heath, Lenwood S. [en]
dc.contributor.committeemember: Kafura, Dennis G. [en]
dc.contributor.committeemember: Shaffer, Clifford A. [en]
dc.contributor.committeemember: Brown, Ezra A. [en]
dc.contributor.department: Computer Science and Applications [en]
dc.date.accessioned: 2014-03-14T21:21:48Z [en]
dc.date.adate: 2005-10-20 [en]
dc.date.available: 2014-03-14T21:21:48Z [en]
dc.date.issued: 1993-08-05 [en]
dc.date.rdate: 2005-10-20 [en]
dc.date.sdate: 2005-10-20 [en]
dc.description.abstract: This dissertation deals with the application of efficient data structures and hashing algorithms to the problems of textual information storage and retrieval. We have developed static and dynamic techniques for handling large dictionaries, inverted lists, and optimizations applied to ranking algorithms. We have carried out an experiment called REVTOLC that demonstrated the efficiency and applicability of our algorithms and data structures. Also, the REVTOLC experiment revealed the effectiveness and ease of use of advanced information retrieval methods, namely extended Boolean (p-norm), vector, and vector with probabilistic feedback methods. We have developed efficient static and dynamic data structures and linear algorithms to find a class of minimal perfect hash functions for the efficient implementation of dictionaries, inverted lists, and stop lists. Further, we have developed a linear algorithm that produces order preserving minimal perfect hash functions. These data structures and algorithms enable much faster indexing of textual data and faster retrieval of best match documents using advanced information retrieval methods. Finally, we summarize our research findings and some open problems that are worth further investigation. [en]
dc.description.degree: Ph. D. [en]
dc.format.extent: xiv, 183 leaves [en]
dc.format.medium: BTD [en]
dc.format.mimetype: application/pdf [en]
dc.identifier.other: etd-10202005-102821 [en]
dc.identifier.sourceurl: http://scholar.lib.vt.edu/theses/available/etd-10202005-102821/ [en]
dc.identifier.uri: http://hdl.handle.net/10919/40031 [en]
dc.language.iso: en [en]
dc.publisher: Virginia Tech [en]
dc.relation.haspart: LD5655.V856_1993.D368.pdf [en]
dc.relation.isformatof: OCLC# 29179633 [en]
dc.rights: In Copyright [en]
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ [en]
dc.subject.lcc: LD5655.V856 1993.D368 [en]
dc.subject.lcsh: Data structures (Computer science) [en]
dc.subject.lcsh: Information storage and retrieval systems [en]
dc.title: Efficient data structures for information retrieval [en]
dc.type: Dissertation [en]
dc.type.dcmitype: Text [en]
thesis.degree.discipline: Computer Science and Applications [en]
thesis.degree.grantor: Virginia Polytechnic Institute and State University [en]
thesis.degree.level: doctoral [en]
thesis.degree.name: Ph. D. [en]

Files

Original bundle
Name: LD5655.V856_1993.D368.pdf
Size: 25.43 MB
Format: Adobe Portable Document Format