AlgoDaily - Hash Tables

Home > Advanced data structures and algorithms > Advanced data structures and algorithms > Hash Tables

Collision Resolution Techniques

In hash tables, collisions occur when two different keys produce the same hash value. Handling collisions is an important aspect of hash table implementation to ensure efficient data storage and retrieval.

There are several collision resolution techniques that can be used to address this issue:

Chaining: Chaining is a simple and widely used technique where each bucket in the hash table contains a linked list of key-value pairs. When a collision occurs, the new key-value pair is appended to the linked list in the corresponding bucket. This allows multiple values with the same hash value to be stored in the same bucket.

TEXT/X-JAVA

1// Java code example of chaining
2class HashTable {
3  private ListNode[] buckets;
4  private int size;
5
6  public HashTable(int capacity) {
7    buckets = new ListNode[capacity];
8    size = capacity;
9  }
10
11  public void put(int key, int value) {
12    int index = key % size;
13    if (buckets[index] == null) {
14      buckets[index] = new ListNode(key, value);
15    } else {
16      ListNode curr = buckets[index];
17      while (curr.next != null) {
18        curr = curr.next;
19      }
20      curr.next = new ListNode(key, value);
21    }
22  }
23
24  public int get(int key) {
25    int index = key % size;
26    ListNode curr = buckets[index];
27    while (curr != null) {
28      if (curr.key == key) {
29        return curr.value;
30      }
31      curr = curr.next;
32    }
33    return -1; // Key not found
34  }
35
36  private static class ListNode {
37    int key;
38    int value;
39    ListNode next;
40
41    public ListNode(int key, int value) {
42      this.key = key;
43      this.value = value;
44      this.next = null;
45    }
46  }
47}

Chaining allows for efficient handling of collisions as each bucket can store multiple key-value pairs. However, it requires additional memory to store the linked lists and may result in increased lookup time if the lists become too long.

Open Addressing: Open addressing is an alternative collision resolution technique where all key-value pairs are stored in the hash table itself. When a collision occurs, the algorithm searches for an alternative empty slot in the hash table and stores the key-value pair there, following a predefined probing sequence.

TEXT/X-JAVA

1// Java code example of open addressing - linear probing
2class HashTable {
3  private int[] keys;
4  private int[] values;
5  private int size;
6
7  public HashTable(int capacity) {
8    keys = new int[capacity];
9    values = new int[capacity];
10    size = capacity;
11  }
12
13  public void put(int key, int value) {
14    int index = key % size;
15    while (keys[index] != 0) {
16      index = (index + 1) % size; // Linear probing
17    }
18    keys[index] = key;
19    values[index] = value;
20  }
21
22  public int get(int key) {
23    int index = key % size;
24    while (keys[index] != key) {
25      index = (index + 1) % size; // Linear probing
26      if (keys[index] == 0) {
27        return -1; // Key not found
28      }
29    }
30    return values[index];
31  }
32}

Open addressing eliminates the need for linked lists and reduces memory overhead. However, it can lead to clustering, where consecutive slots become occupied, resulting in a higher probability of subsequent collisions.

Robin Hood Hashing: Robin Hood hashing is a variant of open addressing that aims to minimize the distance between key-value pairs in the hash table. When a collision occurs, the new key-value pair replaces the existing pair only if the displacement (distance from the original slot) of the new pair is shorter than the existing pair. This helps maintain a more balanced distribution of key-value pairs.

TEXT/X-JAVA

1// Java code example of Robin Hood hashing
2class HashTable {
3  private int[] keys;
4  private int[] values;
5  private int[] displacements;
6  private int size;
7
8  public HashTable(int capacity) {
9    keys = new int[capacity];
10    values = new int[capacity];
11    displacements = new int[capacity];
12    size = capacity;
13  }
14
15  public void put(int key, int value) {
16    int index = key % size;
17    int displacement = 0;
18    while (keys[index] != 0 && displacements[index] >= displacement) {
19      int tempKey = keys[index];
20      int tempValue = values[index];
21      int tempDisplacement = displacements[index];
22      keys[index] = key;
23      values[index] = value;
24      displacements[index] = displacement;
25      key = tempKey;
26      value = tempValue;
27      displacement = tempDisplacement + 1;
28      index = (index + 1) % size; // Linear probing
29    }
30    keys[index] = key;
31    values[index] = value;
32    displacements[index] = displacement;
33  }
34
35  public int get(int key) {
36    int index = key % size;
37    int displacement = 0;
38    while (keys[index] != key && displacements[index] >= displacement) {
39      displacement = displacements[index] + 1;
40      index = (index + 1) % size; // Linear probing
41    }
42    if (keys[index] == key) {
43      return values[index];
44    }
45    return -1; // Key not found
46  }
47}

Robin Hood hashing provides a more even distribution of key-value pairs and reduces the average search time. However, it requires additional displacement information for each slot.

Each collision resolution technique has its own advantages and trade-offs. The choice of technique depends on the specific requirements and constraints of the application.

It's important to note that the efficiency of collision resolution techniques can be impacted by factors such as the choice of hash function, load factor, and table size.

Collision Resolution Techniques

Programming Categories

Popular Lessons