### Abstract

In this paper we analyze the performance of double hashing, a well-known hashing algorithm in which we probe the hash table along arithmetic progressions where the initial element and the increment of the progression are chosen randomly and independently depending only on the key K of the search. We prove that double hashing is asymptotically equivalent to uniform probing for load factors α not exceeding a certain constant α_{0} = 0.31.... Uniform hashing refers to a technique which exhibits no clustering and is known to be optimal in a certain sense. Our proof method has a different flavor from those previously used in algorithmic analysis. We begin by showing that the tail of the hypergeometric distribution a fixed percentage away from the mean is exponentially small. We use this result to prove that random subsets of cardinality αm of the finite ring of integers modulo m always have nearly the expected number of arithmetic progressions of length k, except with exponentially small probability. We then use this theorem to start up a process (called the extension process) of looking at snapshots of the table as it fills up with double hashing. Between steps of the extension process we can show that the effect of clustering is negligible, and that we therefore never depart too far from the truly random situation.
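The probing scheme described in the abstract can be illustrated with a small simulation. This is a minimal sketch, not the paper's method: the function names (`probe_sequence`, `probes_to_empty`, `simulate`) are hypothetical, the two key-dependent hash values are stood in for by independent random draws, and the table size `m` is assumed prime so that every nonzero increment visits all slots. The sketch fills a table to load factor α and estimates the average number of probes needed to reach an empty slot, which can be compared with the standard uniform-probing estimate of roughly 1/(1 - α) for an unsuccessful search.

```python
import random


def probe_sequence(h1, h2, m):
    """Yield the arithmetic progression h1, h1 + h2, h1 + 2*h2, ... modulo m."""
    for i in range(m):
        yield (h1 + i * h2) % m


def probes_to_empty(table, h1, h2):
    """Follow the progression until an empty slot is found.

    Returns (number of probes used, index of the empty slot).
    """
    for probes, slot in enumerate(probe_sequence(h1, h2, len(table)), start=1):
        if not table[slot]:
            return probes, slot
    raise RuntimeError("table is full")


def simulate(m=1009, alpha=0.3, samples=20000, seed=0):
    """Estimate the average probe count for an unsuccessful search.

    Assumptions (not from the source): m is prime, and the start h1 and
    increment h2 are drawn uniformly at random in place of real hash functions.
    """
    rng = random.Random(seed)
    table = [False] * m

    # Fill the table to the requested load factor using double hashing.
    for _ in range(int(alpha * m)):
        _, slot = probes_to_empty(table, rng.randrange(m), rng.randrange(1, m))
        table[slot] = True

    # Sample fresh (h1, h2) pairs without inserting, counting probes each time.
    total = 0
    for _ in range(samples):
        probes, _ = probes_to_empty(table, rng.randrange(m), rng.randrange(1, m))
        total += probes
    return total / samples


if __name__ == "__main__":
    alpha = 0.3
    print("simulated average probes:", simulate(alpha=alpha))
    print("uniform-probing estimate 1/(1 - alpha):", 1 / (1 - alpha))
```

The load factor 0.3 is chosen here only because it lies below the constant α_{0} = 0.31... mentioned in the abstract. A single run at one table size is an illustration, not evidence for the asymptotic result.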

Original language | English |
---|---|
Pages (from-to) | 226-274 |
Number of pages | 49 |
Journal | Journal of Computer and System Sciences |
Volume | 16 |
Issue number | 2 |
DOIs | https://doi.org/10.1016/0022-0000(78)90046-6 |
Publication status | Published - 1978 |

### ASJC Scopus subject areas

- Computational Theory and Mathematics

### Cite this

Guibas, L. J., & Szemerédi, E. (1978). The analysis of double hashing. *Journal of Computer and System Sciences*, *16*(2), 226-274. https://doi.org/10.1016/0022-0000(78)90046-6

Research output: Contribution to journal › Article

TY - JOUR

T1 - The analysis of double hashing

AU - Guibas, Leo J.

AU - Szemerédi, E.

PY - 1978

Y1 - 1978

UR - http://www.scopus.com/inward/record.url?scp=0003640232&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0003640232&partnerID=8YFLogxK

U2 - 10.1016/0022-0000(78)90046-6

DO - 10.1016/0022-0000(78)90046-6

M3 - Article

AN - SCOPUS:0003640232

VL - 16

SP - 226

EP - 274

JO - Journal of Computer and System Sciences

JF - Journal of Computer and System Sciences

SN - 0022-0000

IS - 2

ER -