The analysis of double hashing

Leo J. Guibas, Endre Szemeredi

Research output: Contribution to journal › Article

42 Citations (Scopus)

Abstract

In this paper we analyze the performance of double hashing, a well-known hashing algorithm in which we probe the hash table along arithmetic progressions where the initial element and the increment of the progression are chosen randomly and independently depending only on the key K of the search. We prove that double hashing is asymptotically equivalent to uniform probing for load factors α not exceeding a certain constant α0 = 0.31.... Uniform hashing refers to a technique which exhibits no clustering and is known to be optimal in a certain sense. Our proof method has a different flavor from those previously used in algorithmic analysis. We begin by showing that the tail of the hypergeometric distribution a fixed percentage away from the mean is exponentially small. We use this result to prove that random subsets of the finite ring of integers modulo m of cardinality αm always have nearly the expected number of arithmetic progressions of length k, except with exponentially small probability. We then use this theorem to start up a process (called the extension process) of looking at snapshots of the table as it fills up with double hashing. Between steps of the extension process we can show that the effect of clustering is negligible, and that we therefore never depart too far from the truly random situation.
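
The probing scheme described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the prime table size m = 1009, the load factor 0.3, and the use of a pseudorandom generator in place of key-derived hash values are all assumptions made for illustration. The sketch fills a table by double hashing and compares the measured cost of an unsuccessful search with 1/(1 - α), the standard large-table estimate for uniform probing.

```python
import random


def probe_sequence(h1, h2, m):
    """Yield the slots h1, h1 + h2, h1 + 2*h2, ... (mod m), an arithmetic progression."""
    for i in range(m):
        yield (h1 + i * h2) % m


def probes_until_empty(table, h1, h2):
    """Count the probes needed to reach the first empty slot; return (probes, slot)."""
    m = len(table)
    for probes, slot in enumerate(probe_sequence(h1, h2, m), start=1):
        if not table[slot]:
            return probes, slot
    raise RuntimeError("table is full")


def average_unsuccessful_probes(m, alpha, trials, seed=0):
    """Fill a table of prime size m to load factor alpha, then average search cost.

    Assumption: h1 and h2 are drawn from a seeded generator, standing in for
    two independent hash functions of the key.
    """
    rng = random.Random(seed)
    table = [False] * m
    for _ in range(int(alpha * m)):
        # h2 is nonzero, so with m prime the progression visits every slot.
        _, slot = probes_until_empty(table, rng.randrange(m), rng.randrange(1, m))
        table[slot] = True
    total = 0
    for _ in range(trials):
        probes, _ = probes_until_empty(table, rng.randrange(m), rng.randrange(1, m))
        total += probes
    return total / trials


if __name__ == "__main__":
    alpha = 0.3
    measured = average_unsuccessful_probes(m=1009, alpha=alpha, trials=20000)
    print("measured:", measured, "uniform probing estimate:", 1 / (1 - alpha))
```

A load factor of 0.3 lies below the constant α0 quoted in the abstract, so the measured average should sit close to the uniform probing estimate. A simulation of this kind only illustrates the statement and is no substitute for the proof.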

Original language: English
Pages (from-to): 226-274
Number of pages: 49
Journal: Journal of Computer and System Sciences
Volume: 16
Issue number: 2
Publication status: Published - Apr 1978

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Networks and Communications
  • Computational Theory and Mathematics
  • Applied Mathematics
