Practical Minimal Perfect Hashing Functions for Large Databases

Fox, Edward A.; Heath, Lenwood S.; Chen, Qi-Fan; Daoud, Amjad M.


dc.contributor.author	Fox, Edward A.	en
dc.contributor.author	Heath, Lenwood S.	en
dc.contributor.author	Chen, Qi-Fan	en
dc.contributor.author	Daoud, Amjad M.	en
dc.contributor.department	Computer Science	en
dc.date.accessioned	2013-06-19T14:36:46Z	en
dc.date.available	2013-06-19T14:36:46Z	en
dc.date.issued	1990	en
dc.description.abstract	We describe the first practical algorithms for finding minimal perfect hash functions that have been used to access very large databases (i.e., having over 1 million keys). This method extends earlier work wherein an O(n^3) algorithm was devised, building upon prior work by Sager that described an O(n^4) algorithm. Our first linear expected time algorithm makes use of three key insights: applying randomness wherever possible, ordering our search for hash functions based on the degree of the vertices in a graph that represents word dependencies, and viewing hash value assignment in terms of adding circular patterns of related words to a partially filled disk. Our second algorithm builds functions that are slightly more complex, but does not build a word dependency graph and so approaches the theoretical lower bound on function specification size. While ultimately applicable to a wide variety of data and file access needs, these algorithms have already proven useful in aiding our work in improving the performance of CD-ROM systems and our construction of a Large External Network Database (LEND) for semantic networks and hypertext/hypermedia collections. Virginia Disc One includes a demonstration of a minimal perfect hash function running on a PC to access a 130,198 word list on that CD-ROM. Several other microcomputer, minicomputer, and parallel processor versions and applications of our algorithm have also been developed. Tests, including those with a French word list of 420,878 entries and a library catalog key set with over 3.8 million keys, have shown that our methods work with very large databases.	en
dc.format.mimetype	application/pdf	en
dc.identifier	http://eprints.cs.vt.edu/archive/00000223/	en
dc.identifier.sourceurl	http://eprints.cs.vt.edu/archive/00000223/01/TR-90-41.pdf	en
dc.identifier.trnumber	TR-90-41	en
dc.identifier.uri	http://hdl.handle.net/10919/19628	en
dc.language.iso	en	en
dc.publisher	Department of Computer Science, Virginia Polytechnic Institute & State University	en
dc.relation.ispartof	Historical Collection(Till Dec 2001)	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.title	Practical Minimal Perfect Hashing Functions for Large Databases	en
dc.type	Technical report	en
dc.type.dcmitype	Text	en
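
For readers unfamiliar with the term used in the abstract, a minimal perfect hash function maps n distinct keys onto the slots 0..n-1 with no collisions and no unused slots. The sketch below is not the algorithm from this report. It uses a simpler, generic "hash and displace" construction, and the names `build_mph`, `lookup`, and `_h` are hypothetical. It does borrow one idea mentioned in the abstract in spirit: the most constrained groups of keys (the largest buckets) are placed first, while the table is still mostly empty.

```python
import hashlib


def _h(seed: int, key: str, n: int) -> int:
    """Seeded hash of key into range(n). Illustrative choice; any seeded hash would do."""
    digest = hashlib.blake2b(
        key.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little") % n


def build_mph(keys):
    """Return a per-bucket seed table defining a minimal perfect hash for keys.

    Assumes keys is a non-empty collection of distinct strings.
    """
    keys = list(keys)
    n = len(keys)
    if n == 0 or len(set(keys)) != n:
        raise ValueError("keys must be non-empty and distinct")

    # First-level hash (seed 0) groups keys into n buckets.
    buckets = [[] for _ in range(n)]
    for k in keys:
        buckets[_h(0, k, n)].append(k)

    seeds = [0] * n
    taken = [False] * n

    # Place the largest buckets first: they are the hardest to fit.
    for b in sorted(range(n), key=lambda i: len(buckets[i]), reverse=True):
        bucket = buckets[b]
        if not bucket:
            break  # all remaining buckets are empty
        seed = 1
        while True:
            slots = [_h(seed, k, n) for k in bucket]
            # Accept the seed only if the slots are mutually distinct and all free.
            if len(set(slots)) == len(slots) and not any(taken[s] for s in slots):
                break
            seed += 1
        seeds[b] = seed
        for s in slots:
            taken[s] = True
    return seeds


def lookup(seeds, key: str) -> int:
    """Slot in range(len(seeds)) for a key that was in the original key set."""
    n = len(seeds)
    return _h(seeds[_h(0, key, n)], key, n)
```

A call such as `seeds = build_mph(words)` followed by `lookup(seeds, w)` gives each word in `words` its own slot. Keys outside the original set still hash to some slot, so a real application would store the key at that slot and compare. The trial-and-error seed search here makes no claim to the linear expected construction time reported in the abstract.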

Files

Original bundle


Name: TR-90-41.pdf
Size: 1.53 MB
Format: Adobe Portable Document Format


Collections

Computer Science Technical Reports