Open Addressing Collision Handling technique in Hashing
Last Updated :
12 May, 2025
Open Addressing is a method for handling collisions. In Open Addressing, all elements are stored in the hash table itself. So at any point, the size of the table must be greater than or equal to the total number of keys (Note that we can increase table size by copying old data if needed). This approach is also known as closed hashing. This entire procedure is based upon probing. We will understand the types of probing ahead:
- Insert(k): Keep probing until an empty slot is found. Once an empty slot is found, insert k.
- Search(k): Keep probing until the slot's key doesn't become equal to k or an empty slot is reached.
- Delete(k): Delete operation is interesting. If we simply delete a key, then the search may fail. So slots of deleted keys are marked specially as "deleted".
The insert can insert an item in a deleted slot, but the search doesn't stop at a deleted slot.
Different ways of Open Addressing:
In linear probing, the hash table is searched sequentially that starts from the original location of the hash. If in case the location that we get is already occupied, then we check for the next location.
The function used for rehashing is as follows: rehash(key) = (n+1)%table-size.
For example, The typical gap between two probes is 1 as seen in the example below:
Let hash(x) be the slot index computed using a hash function and S be the table size
If slot hash(x) % S is full, then we try (hash(x) + 1) % S
If (hash(x) + 1) % S is also full, then we try (hash(x) + 2) % S
If (hash(x) + 2) % S is also full, then we try (hash(x) + 3) % S
Example: Let us consider a simple hash function as “key mod 5” and a sequence of keys that are to be inserted are 50, 70, 76, 85, 93.
Implementation : Please refer Program to implement Hash Table using Open Addressing
2. Quadratic Probing
If you observe carefully, then you will understand that the interval between probes will increase proportionally to the hash value. Quadratic probing is a method with the help of which we can solve the problem of clustering that was discussed above. This method is also known as the mid-square method. In this method, we look for the i2'th slot in the ith iteration. We always start from the original hash location. If only the location is occupied then we check the other slots.
let hash(x) be the slot index computed using hash function.
If slot hash(x) % S is full, then we try (hash(x) + 1*1) % S
If (hash(x) + 1*1) % S is also full, then we try (hash(x) + 2*2) % S
If (hash(x) + 2*2) % S is also full, then we try (hash(x) + 3*3) % S
Example: Let us consider table Size = 7, hash function as Hash(x) = x % 7 and collision resolution strategy to be f(i) = i2 . Insert = 22, 30, and 50.
Implementation : Please refer Program for Quadratic Probing in Hashing
The intervals that lie between probes are computed by another hash function. Double hashing is a technique that reduces clustering in an optimized way. In this technique, the increments for the probing sequence are computed by using another hash function. We use another hash function hash2(x) and look for the i*hash2(x) slot in the ith rotation.
let hash(x) be the slot index computed using hash function.
If slot hash(x) % S is full, then we try (hash(x) + 1*hash2(x)) % S
If (hash(x) + 1*hash2(x)) % S is also full, then we try (hash(x) + 2*hash2(x)) % S
If (hash(x) + 2*hash2(x)) % S is also full, then we try (hash(x) + 3*hash2(x)) % S
Example: Insert the keys 27, 43, 692, 72 into the Hash Table of size 7. where first hash-function is h1(k) = k mod 7 and second hash-function is h2(k) = 1 + (k mod 5)
Comparison of the above three:
Open addressing is a collision handling technique used in hashing where, when a collision occurs (i.e., when two or more keys map to the same slot), the algorithm looks for another empty slot in the hash table to store the collided key.
- In linear probing, the algorithm simply looks for the next available slot in the hash table and places the collided key there. If that slot is also occupied, the algorithm continues searching for the next available slot until an empty slot is found. This process is repeated until all collided keys have been stored. Linear probing has the best cache performance but suffers from clustering. One more advantage of Linear probing is easy to compute.
- In quadratic probing, the algorithm searches for slots in a more spaced-out manner. When a collision occurs, the algorithm looks for the next slot using an equation that involves the original hash value and a quadratic function. If that slot is also occupied, the algorithm increments the value of the quadratic function and tries again. This process is repeated until an empty slot is found. Quadratic probing lies between the two in terms of cache performance and clustering.
- In double hashing, the algorithm uses a second hash function to determine the next slot to check when a collision occurs. The algorithm calculates a hash value using the original hash function, then uses the second hash function to calculate an offset. The algorithm then checks the slot that is the sum of the original hash value and the offset. If that slot is occupied, the algorithm increments the offset and tries again. This process is repeated until an empty slot is found. Double hashing has poor cache performance but no clustering. Double hashing requires more computation time as two hash functions need to be computed.
The choice of collision handling technique can have a significant impact on the performance of a hash table. Linear probing is simple and fast, but it can lead to clustering (i.e., a situation where keys are stored in long contiguous runs) and can degrade performance. Quadratic probing is more spaced out, but it can also lead to clustering and can result in a situation where some slots are never checked. Double hashing is more complex, but it can lead to more even distribution of keys and can provide better performance in some cases.
S.No. | Separate Chaining | Open Addressing |
---|
1. | Chaining is Simpler to implement. | Open Addressing requires more computation. |
2. | In chaining, Hash table never fills up, we can always add more elements to chain. | In open addressing, table may become full. |
3. | Chaining is Less sensitive to the hash function or load factors. | Open addressing requires extra care to avoid clustering and load factor. |
4. | Chaining is mostly used when it is unknown how many and how frequently keys may be inserted or deleted. | Open addressing is used when the frequency and number of keys is known. |
5. | Cache performance of chaining is not good as keys are stored using linked list. | Open addressing provides better cache performance as everything is stored in the same table. |
6. | Wastage of Space (Some Parts of hash table in chaining are never used). | In Open addressing, a slot can be used even if an input doesn't map to it. |
7. | Chaining uses extra space for links. | No links in Open addressing |
Note: Cache performance of chaining is not good because when we traverse a Linked List, we are basically jumping from one node to another, all across the computer's memory. For this reason, the CPU cannot cache the nodes which aren't visited yet, this doesn't help us. But with Open Addressing, data isn't spread, so if the CPU detects that a segment of memory is constantly being accessed, it gets cached for quick access.
Like Chaining, the performance of hashing can be evaluated under the assumption that each key is equally likely to be hashed to any slot of the table (simple uniform hashing)
m = Number of slots in the hash table
n = Number of keys to be inserted in the hash table
Load factor α = n/m ( < 1 )
Expected time to search/insert/delete < 1/(1 - α)
So Search, Insert and Delete take (1/(1 - α)) time
Related Articles:
Hashing | Set 1 (Introduction)
Hashing | Set 2 (Separate Chaining)
Similar Reads
Hashing in Data Structure Hashing is a technique used in data structures that efficiently stores and retrieves data in a way that allows for quick access. Hashing involves mapping data to a specific index in a hash table (an array of items) using a hash function. It enables fast retrieval of information based on its key. The
3 min read
Introduction to Hashing Hashing refers to the process of generating a small sized output (that can be used as index in a table) from an input of typically large and variable size. Hashing uses mathematical formulas known as hash functions to do the transformation. This technique determines an index or location for the stor
7 min read
What is Hashing? Hashing refers to the process of generating a fixed-size output from an input of variable size using the mathematical formulas known as hash functions. This technique determines an index or location for the storage of an item in a data structure. Need for Hash data structureThe amount of data on the
3 min read
Index Mapping (or Trivial Hashing) with negatives allowed Index Mapping (also known as Trivial Hashing) is a simple form of hashing where the data is directly mapped to an index in a hash table. The hash function used in this method is typically the identity function, which maps the input data to itself. In this case, the key of the data is used as the ind
7 min read
Separate Chaining Collision Handling Technique in Hashing Separate Chaining is a collision handling technique. Separate chaining is one of the most popular and commonly used techniques in order to handle collisions. In this article, we will discuss about what is Separate Chain collision handling technique, its advantages, disadvantages, etc.There are mainl
3 min read
Open Addressing Collision Handling technique in Hashing Open Addressing is a method for handling collisions. In Open Addressing, all elements are stored in the hash table itself. So at any point, the size of the table must be greater than or equal to the total number of keys (Note that we can increase table size by copying old data if needed). This appro
7 min read
Double Hashing Double hashing is a collision resolution technique used in hash tables. It works by using two hash functions to compute two different hash values for a given key. The first hash function is used to compute the initial hash value, and the second hash function is used to compute the step size for the
15+ min read
Load Factor and Rehashing Prerequisites: Hashing Introduction and Collision handling by separate chaining How hashing works: For insertion of a key(K) - value(V) pair into a hash map, 2 steps are required: K is converted into a small integer (called its hash code) using a hash function.The hash code is used to find an index
15+ min read
Easy problems on Hashing
Check if an array is subset of another arrayGiven two arrays a[] and b[] of size m and n respectively, the task is to determine whether b[] is a subset of a[]. Both arrays are not sorted, and elements are distinct.Examples: Input: a[] = [11, 1, 13, 21, 3, 7], b[] = [11, 3, 7, 1] Output: trueInput: a[]= [1, 2, 3, 4, 5, 6], b = [1, 2, 4] Output
13 min read
Union and Intersection of two Linked List using HashingGiven two singly Linked Lists, create union and intersection lists that contain the union and intersection of the elements present in the given lists. Each of the two linked lists contains distinct node values.Note: The order of elements in output lists doesnât matter. Examples:Input:head1 : 10 -
10 min read
Two Sum - Pair with given SumGiven an array arr[] of n integers and a target value, the task is to find whether there is a pair of elements in the array whose sum is equal to target. This problem is a variation of 2Sum problem.Examples: Input: arr[] = [0, -1, 2, -3, 1], target = -2Output: trueExplanation: There is a pair (1, -3
15+ min read
Max Distance Between Two OccurrencesGiven an array arr[], the task is to find the maximum distance between two occurrences of any element. If no element occurs twice, return 0.Examples: Input: arr = [1, 1, 2, 2, 2, 1]Output: 5Explanation: distance for 1 is: 5-0 = 5, distance for 2 is: 4-2 = 2, So max distance is 5.Input : arr[] = [3,
8 min read
Most frequent element in an arrayGiven an array, the task is to find the most frequent element in it. If there are multiple elements that appear a maximum number of times, return the maximum element.Examples: Input : arr[] = [1, 3, 2, 1, 4, 1]Output : 1Explanation: 1 appears three times in array which is maximum frequency.Input : a
10 min read
Only Repeating From 1 To n-1Given an array arr[] of size n filled with numbers from 1 to n-1 in random order. The array has only one repetitive element. The task is to find the repetitive element.Examples:Input: arr[] = [1, 3, 2, 3, 4]Output: 3Explanation: The number 3 is the only repeating element.Input: arr[] = [1, 5, 1, 2,
15+ min read
Check for Disjoint Arrays or SetsGiven two arrays a and b, check if they are disjoint, i.e., there is no element common between both the arrays.Examples: Input: a[] = {12, 34, 11, 9, 3}, b[] = {2, 1, 3, 5} Output: FalseExplanation: 3 is common in both the arrays.Input: a[] = {12, 34, 11, 9, 3}, b[] = {7, 2, 1, 5} Output: True Expla
11 min read
Non-overlapping sum of two setsGiven two arrays A[] and B[] of size n. It is given that both array individually contains distinct elements. We need to find the sum of all elements that are not common.Examples: Input : A[] = {1, 5, 3, 8} B[] = {5, 4, 6, 7}Output : 291 + 3 + 4 + 6 + 7 + 8 = 29Input : A[] = {1, 5, 3, 8} B[] = {5, 1,
9 min read
Check if two arrays are equal or notGiven two arrays, a and b of equal length. The task is to determine if the given arrays are equal or not. Two arrays are considered equal if:Both arrays contain the same set of elements.The arrangements (or permutations) of elements may be different.If there are repeated elements, the counts of each
6 min read
Find missing elements of a rangeGiven an array, arr[0..n-1] of distinct elements and a range [low, high], find all numbers that are in a range, but not the array. The missing elements should be printed in sorted order.Examples: Input: arr[] = {10, 12, 11, 15}, low = 10, high = 15Output: 13, 14Input: arr[] = {1, 14, 11, 51, 15}, lo
15+ min read
Minimum Subsets with Distinct ElementsYou are given an array of n-element. You have to make subsets from the array such that no subset contain duplicate elements. Find out minimum number of subset possible.Examples : Input : arr[] = {1, 2, 3, 4}Output :1Explanation : A single subset can contains all values and all values are distinct.In
9 min read
Remove minimum elements such that no common elements exist in two arraysGiven two arrays arr1[] and arr2[] consisting of n and m elements respectively. The task is to find the minimum number of elements to remove from each array such that intersection of both arrays becomes empty and both arrays become mutually exclusive.Examples: Input: arr[] = { 1, 2, 3, 4}, arr2[] =
8 min read
2 Sum - Count pairs with given sumGiven an array arr[] of n integers and a target value, the task is to find the number of pairs of integers in the array whose sum is equal to target.Examples: Input: arr[] = {1, 5, 7, -1, 5}, target = 6Output: 3Explanation: Pairs with sum 6 are (1, 5), (7, -1) & (1, 5). Input: arr[] = {1, 1, 1,
9 min read
Count quadruples from four sorted arrays whose sum is equal to a given value xGiven four sorted arrays each of size n of distinct elements. Given a value x. The problem is to count all quadruples(group of four numbers) from all the four arrays whose sum is equal to x.Note: The quadruple has an element from each of the four arrays. Examples: Input : arr1 = {1, 4, 5, 6}, arr2 =
15+ min read
Sort elements by frequency | Set 4 (Efficient approach using hash)Print the elements of an array in the decreasing frequency if 2 numbers have the same frequency then print the one which came first. Examples: Input : arr[] = {2, 5, 2, 8, 5, 6, 8, 8} Output : arr[] = {8, 8, 8, 2, 2, 5, 5, 6} Input : arr[] = {2, 5, 2, 6, -1, 9999999, 5, 8, 8, 8} Output : arr[] = {8,
12 min read
Find all pairs (a, b) in an array such that a % b = kGiven an array with distinct elements, the task is to find the pairs in the array such that a % b = k, where k is a given integer. You may assume that a and b are in small range Examples : Input : arr[] = {2, 3, 5, 4, 7} k = 3Output : (7, 4), (3, 4), (3, 5), (3, 7)7 % 4 = 33 % 4 = 33 % 5 = 33 % 7 =
15 min read
Group words with same set of charactersGiven a list of words with lower cases. Implement a function to find all Words that have the same unique character set. Example: Input: words[] = { "may", "student", "students", "dog", "studentssess", "god", "cat", "act", "tab", "bat", "flow", "wolf", "lambs", "amy", "yam", "balms", "looped", "poodl
8 min read
k-th distinct (or non-repeating) element among unique elements in an array.Given an integer array arr[], print kth distinct element in this array. The given array may contain duplicates and the output should print the k-th element among all unique elements. If k is more than the number of distinct elements, print -1.Examples:Input: arr[] = {1, 2, 1, 3, 4, 2}, k = 2Output:
7 min read
Intermediate problems on Hashing
Find Itinerary from a given list of ticketsGiven a list of tickets, find the itinerary in order using the given list.Note: It may be assumed that the input list of tickets is not cyclic and there is one ticket from every city except the final destination.Examples:Input: "Chennai" -> "Bangalore" "Bombay" -> "Delhi" "Goa" -> "Chennai"
11 min read
Find number of Employees Under every ManagerGiven a 2d matrix of strings arr[][] of order n * 2, where each array arr[i] contains two strings, where the first string arr[i][0] is the employee and arr[i][1] is his manager. The task is to find the count of the number of employees under each manager in the hierarchy and not just their direct rep
9 min read
Longest Subarray With Sum Divisible By KGiven an arr[] containing n integers and a positive integer k, he problem is to find the longest subarray's length with the sum of the elements divisible by k.Examples:Input: arr[] = [2, 7, 6, 1, 4, 5], k = 3Output: 4Explanation: The subarray [7, 6, 1, 4] has sum = 18, which is divisible by 3.Input:
10 min read
Longest Subarray with 0 Sum Given an array arr[] of size n, the task is to find the length of the longest subarray with sum equal to 0.Examples:Input: arr[] = {15, -2, 2, -8, 1, 7, 10, 23}Output: 5Explanation: The longest subarray with sum equals to 0 is {-2, 2, -8, 1, 7}Input: arr[] = {1, 2, 3}Output: 0Explanation: There is n
10 min read
Longest Increasing consecutive subsequenceGiven N elements, write a program that prints the length of the longest increasing consecutive subsequence. Examples: Input : a[] = {3, 10, 3, 11, 4, 5, 6, 7, 8, 12} Output : 6 Explanation: 3, 4, 5, 6, 7, 8 is the longest increasing subsequence whose adjacent element differs by one. Input : a[] = {6
10 min read
Count Distinct Elements In Every Window of Size KGiven an array arr[] of size n and an integer k, return the count of distinct numbers in all windows of size k. Examples: Input: arr[] = [1, 2, 1, 3, 4, 2, 3], k = 4Output: [3, 4, 4, 3]Explanation: First window is [1, 2, 1, 3], count of distinct numbers is 3. Second window is [2, 1, 3, 4] count of d
10 min read
Design a data structure that supports insert, delete, search and getRandom in constant timeDesign a data structure that supports the following operations in O(1) time.insert(x): Inserts an item x to the data structure if not already present.remove(x): Removes item x from the data structure if present. search(x): Searches an item x in the data structure.getRandom(): Returns a random elemen
5 min read
Subarray with Given Sum - Handles Negative NumbersGiven an unsorted array of integers, find a subarray that adds to a given number. If there is more than one subarray with the sum of the given number, print any of them.Examples: Input: arr[] = {1, 4, 20, 3, 10, 5}, sum = 33Output: Sum found between indexes 2 and 4Explanation: Sum of elements betwee
13 min read
Implementing our Own Hash Table with Separate Chaining in JavaAll data structure has their own special characteristics, for example, a BST is used when quick searching of an element (in log(n)) is required. A heap or a priority queue is used when the minimum or maximum element needs to be fetched in constant time. Similarly, a hash table is used to fetch, add
10 min read
Implementing own Hash Table with Open Addressing Linear ProbingIn Open Addressing, all elements are stored in the hash table itself. So at any point, size of table must be greater than or equal to total number of keys (Note that we can increase table size by copying old data if needed).Insert(k) - Keep probing until an empty slot is found. Once an empty slot is
13 min read
Maximum possible difference of two subsets of an arrayGiven an array of n-integers. The array may contain repetitive elements but the highest frequency of any element must not exceed two. You have to make two subsets such that the difference of the sum of their elements is maximum and both of them jointly contain all elements of the given array along w
15+ min read
Sorting using trivial hash functionWe have read about various sorting algorithms such as heap sort, bubble sort, merge sort and others. Here we will see how can we sort N elements using a hash array. But this algorithm has a limitation. We can sort only those N elements, where the value of elements is not large (typically not above 1
15+ min read
Smallest subarray with k distinct numbersWe are given an array consisting of n integers and an integer k. We need to find the smallest subarray [l, r] (both l and r are inclusive) such that there are exactly k different numbers. If no such subarray exists, print -1 and If multiple subarrays meet the criteria, return the one with the smalle
14 min read