Collision Resolution Techniques
In hash tables, collisions occur when two different keys produce the same hash value. Handling collisions is an important aspect of hash table implementation to ensure efficient data storage and retrieval.
There are several collision resolution techniques that can be used to address this issue:
Chaining: Chaining is a simple and widely used technique where each bucket in the hash table contains a linked list of key-value pairs. When a collision occurs, the new key-value pair is appended to the linked list in the corresponding bucket. This allows multiple values with the same hash value to be stored in the same bucket.
TEXT/X-JAVA1// Java code example of chaining 2class HashTable { 3 private ListNode[] buckets; 4 private int size; 5 6 public HashTable(int capacity) { 7 buckets = new ListNode[capacity]; 8 size = capacity; 9 } 10 11 public void put(int key, int value) { 12 int index = key % size; 13 if (buckets[index] == null) { 14 buckets[index] = new ListNode(key, value); 15 } else { 16 ListNode curr = buckets[index]; 17 while (curr.next != null) { 18 curr = curr.next; 19 } 20 curr.next = new ListNode(key, value); 21 } 22 } 23 24 public int get(int key) { 25 int index = key % size; 26 ListNode curr = buckets[index]; 27 while (curr != null) { 28 if (curr.key == key) { 29 return curr.value; 30 } 31 curr = curr.next; 32 } 33 return -1; // Key not found 34 } 35 36 private static class ListNode { 37 int key; 38 int value; 39 ListNode next; 40 41 public ListNode(int key, int value) { 42 this.key = key; 43 this.value = value; 44 this.next = null; 45 } 46 } 47}
Chaining allows for efficient handling of collisions as each bucket can store multiple key-value pairs. However, it requires additional memory to store the linked lists and may result in increased lookup time if the lists become too long.
Open Addressing: Open addressing is an alternative collision resolution technique where all key-value pairs are stored in the hash table itself. When a collision occurs, the algorithm searches for an alternative empty slot in the hash table and stores the key-value pair there, following a predefined probing sequence.
TEXT/X-JAVA1// Java code example of open addressing - linear probing 2class HashTable { 3 private int[] keys; 4 private int[] values; 5 private int size; 6 7 public HashTable(int capacity) { 8 keys = new int[capacity]; 9 values = new int[capacity]; 10 size = capacity; 11 } 12 13 public void put(int key, int value) { 14 int index = key % size; 15 while (keys[index] != 0) { 16 index = (index + 1) % size; // Linear probing 17 } 18 keys[index] = key; 19 values[index] = value; 20 } 21 22 public int get(int key) { 23 int index = key % size; 24 while (keys[index] != key) { 25 index = (index + 1) % size; // Linear probing 26 if (keys[index] == 0) { 27 return -1; // Key not found 28 } 29 } 30 return values[index]; 31 } 32}
Open addressing eliminates the need for linked lists and reduces memory overhead. However, it can lead to clustering, where consecutive slots become occupied, resulting in a higher probability of subsequent collisions.
Robin Hood Hashing: Robin Hood hashing is a variant of open addressing that aims to minimize the distance between key-value pairs in the hash table. When a collision occurs, the new key-value pair replaces the existing pair only if the displacement (distance from the original slot) of the new pair is shorter than the existing pair. This helps maintain a more balanced distribution of key-value pairs.
TEXT/X-JAVA1// Java code example of Robin Hood hashing 2class HashTable { 3 private int[] keys; 4 private int[] values; 5 private int[] displacements; 6 private int size; 7 8 public HashTable(int capacity) { 9 keys = new int[capacity]; 10 values = new int[capacity]; 11 displacements = new int[capacity]; 12 size = capacity; 13 } 14 15 public void put(int key, int value) { 16 int index = key % size; 17 int displacement = 0; 18 while (keys[index] != 0 && displacements[index] >= displacement) { 19 int tempKey = keys[index]; 20 int tempValue = values[index]; 21 int tempDisplacement = displacements[index]; 22 keys[index] = key; 23 values[index] = value; 24 displacements[index] = displacement; 25 key = tempKey; 26 value = tempValue; 27 displacement = tempDisplacement + 1; 28 index = (index + 1) % size; // Linear probing 29 } 30 keys[index] = key; 31 values[index] = value; 32 displacements[index] = displacement; 33 } 34 35 public int get(int key) { 36 int index = key % size; 37 int displacement = 0; 38 while (keys[index] != key && displacements[index] >= displacement) { 39 displacement = displacements[index] + 1; 40 index = (index + 1) % size; // Linear probing 41 } 42 if (keys[index] == key) { 43 return values[index]; 44 } 45 return -1; // Key not found 46 } 47}
Robin Hood hashing provides a more even distribution of key-value pairs and reduces the average search time. However, it requires additional displacement information for each slot.
Each collision resolution technique has its own advantages and trade-offs. The choice of technique depends on the specific requirements and constraints of the application.
It's important to note that the efficiency of collision resolution techniques can be impacted by factors such as the choice of hash function, load factor, and table size.