Day 23: Hash Tables

Hash Tables #

Welcome to Day 23 of our 60 Days of Coding Algorithm Challenge! Today, we’ll dive into Hash Tables, a fundamental data structure that provides efficient insertion, deletion, and lookup operations. Hash tables are widely used in various applications due to their average-case constant time complexity for these operations.

What is a Hash Table? #

A hash table is a data structure that implements an associative array abstract data type, a structure that can map keys to values. It uses a hash function to compute an index into an array of buckets or slots, from which the desired value can be found.

How Hash Tables Work #

Hash Function: Converts keys into array indices.
Collision Resolution: Handles cases where two keys hash to the same index.

Hash Functions #

A good hash function should:

Be deterministic: same input always produces the same output.
Distribute keys uniformly across the array.
Be efficient to compute.

Example of a simple hash function:

def simple_hash(key, table_size):
    # Sum the character codes of the key's string form, then wrap into the table.
    return sum(ord(char) for char in str(key)) % table_size
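
To see these properties in practice, here is a small sketch that reuses `simple_hash`. It suggests that the function is deterministic but does not distribute keys well: because it only sums character codes, any two anagrams land on the same index. The sample words and the table size of 10 are illustrative choices, not part of the original example.

```python
def simple_hash(key, table_size):
    # Same function as above: sum of character codes, reduced modulo the table size.
    return sum(ord(char) for char in str(key)) % table_size

TABLE_SIZE = 10  # illustrative table size

# Deterministic: hashing the same key twice gives the same index.
first = simple_hash("listen", TABLE_SIZE)
second = simple_hash("listen", TABLE_SIZE)
print(first == second)  # True

# Weak distribution: anagrams share a character sum (655 here), so they collide.
print(simple_hash("listen", TABLE_SIZE))  # 5
print(simple_hash("silent", TABLE_SIZE))  # 5
```

Collisions like this one are why a hash table needs a collision resolution strategy.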

Collision Resolution Techniques #

1. Chaining #

In chaining, each bucket is independent and holds its own list of the entries whose keys hash to that index. The time for a hash table operation is the time to find the bucket (constant) plus the time for the list operation, which grows with the length of that bucket's list.

2. Open Addressing #

In open addressing, all entry records are stored in the bucket array itself. When a new entry has to be inserted, the buckets are examined, starting with the hashed-to slot and proceeding in some probe sequence, until an unoccupied slot is found.
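
The probe sequence described above can be sketched with linear probing, where the next slot examined is simply the following index, wrapping around at the end of the array. This is a minimal illustration under stated assumptions: the helper names (`_index`, `probe_insert`, `probe_get`) and the character-sum hash are hypothetical choices made so the collision is easy to follow. Deletion is left out on purpose, because simply emptying a slot would cut the probe sequence for keys stored after it.

```python
def _index(key, capacity):
    # Illustrative hash: sum of character codes modulo the capacity.
    return sum(ord(ch) for ch in str(key)) % capacity

def probe_insert(slots, key, value):
    """Store (key, value) in the first usable slot along the linear probe sequence."""
    capacity = len(slots)
    start = _index(key, capacity)
    for step in range(capacity):
        i = (start + step) % capacity  # wrap around at the end of the array
        if slots[i] is None or slots[i][0] == key:
            slots[i] = (key, value)
            return i
    raise RuntimeError("table is full")

def probe_get(slots, key):
    """Follow the same probe sequence until the key or an empty slot is found."""
    capacity = len(slots)
    start = _index(key, capacity)
    for step in range(capacity):
        i = (start + step) % capacity
        if slots[i] is None:
            break  # an empty slot means the key was never inserted
        if slots[i][0] == key:
            return slots[i][1]
    raise KeyError(key)

slots = [None] * 8
print(probe_insert(slots, "ab", 1))  # 3
print(probe_insert(slots, "ba", 2))  # 4 ("ba" also hashes to 3, so it moves on)
print(probe_get(slots, "ba"))        # 2
```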

Implementing a Hash Table with Chaining #

Let’s implement a basic hash table using chaining for collision resolution:

class HashTable:
    def __init__(self, size=10):
        self.size = size
        # One independent list (chain) per bucket.
        self.table = [[] for _ in range(self.size)]

    def _hash(self, key):
        # Map Python's built-in hash to a valid bucket index.
        return hash(key) % self.size

    def insert(self, key, value):
        index = self._hash(key)
        # If the key is already in the chain, overwrite its value.
        for item in self.table[index]:
            if item[0] == key:
                item[1] = value
                return
        # Otherwise append a new [key, value] entry to the chain.
        self.table[index].append([key, value])

    def get(self, key):
        index = self._hash(key)
        for item in self.table[index]:
            if item[0] == key:
                return item[1]
        raise KeyError(key)

    def remove(self, key):
        index = self._hash(key)
        for i, item in enumerate(self.table[index]):
            if item[0] == key:
                del self.table[index][i]
                return
        raise KeyError(key)

    def __str__(self):
        return str(self.table)

# Example usage
ht = HashTable()
ht.insert("apple", 5)
ht.insert("banana", 7)
ht.insert("orange", 3)

# The bucket layout can differ between runs, because Python
# randomizes string hashes per process by default.
print(ht)
print(ht.get("banana"))  # Output: 7

ht.remove("banana")
print(ht)

try:
    ht.get("banana")
except KeyError:
    print("KeyError: 'banana' not found")

Time Complexity #

For a good hash function and a reasonable load factor:

Insert: O(1) average case, O(n) worst case
Delete: O(1) average case, O(n) worst case
Search: O(1) average case, O(n) worst case

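Here n is the number of key-value pairs in the hash table. The worst case arises when many keys fall into the same bucket, so a single operation has to scan a long chain.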
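
The gap between the average and the worst case can be made concrete by comparing chain lengths under two hash functions. The sketch below is illustrative only: the helper names (`build_buckets`, `longest_chain`), the 1000 integer keys, and the 100 buckets are assumptions chosen for the demonstration. A lookup in a chained table scans one bucket, so the longest chain bounds the cost of a single operation.

```python
def build_buckets(keys, size, hash_fn):
    """Place each key in the chained bucket chosen by hash_fn."""
    buckets = [[] for _ in range(size)]
    for key in keys:
        buckets[hash_fn(key) % size].append(key)
    return buckets

def longest_chain(buckets):
    """Length of the longest chain, i.e. the most entries one lookup may scan."""
    return max(len(bucket) for bucket in buckets)

keys = list(range(1000))

# Spreading hash: for consecutive integers the identity distributes keys evenly.
spread = build_buckets(keys, 100, lambda k: k)

# Degenerate hash: every key maps to bucket 0, so the table acts like one list.
degenerate = build_buckets(keys, 100, lambda k: 0)

print(longest_chain(spread))      # 10
print(longest_chain(degenerate))  # 1000
```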

Load Factor and Resizing #

The load factor of a hash table is the ratio of the number of stored elements to the size of the hash table. As the load factor increases, the probability of collisions increases.

To maintain performance, the hash table should be resized when the load factor exceeds a certain threshold (typically 0.7 or 0.75). Resizing involves creating a new, larger array and rehashing all existing elements.
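
As a rough sketch of this idea, the functions below compute the load factor of a chained bucket array and rebuild it at a larger size. The function names (`load_factor`, `rehash`) and the choice to double the size are assumptions made for illustration; doubling is a common policy but not the only one. Wiring this into a class such as `HashTable` is left to the exercise.

```python
def load_factor(buckets):
    """Number of stored entries divided by the number of buckets."""
    count = sum(len(bucket) for bucket in buckets)
    return count / len(buckets)

def rehash(buckets, new_size):
    """Return a new bucket array of new_size with every entry re-inserted."""
    new_buckets = [[] for _ in range(new_size)]
    for bucket in buckets:
        for key, value in bucket:
            # Indices must be recomputed because they depend on the table size.
            new_buckets[hash(key) % new_size].append((key, value))
    return new_buckets

# Four entries in four buckets: the table is at load factor 1.0.
buckets = [[] for _ in range(4)]
for key in range(4):
    buckets[hash(key) % 4].append((key, key * key))

print(load_factor(buckets))  # 1.0

# Resize once the threshold (0.75 here) is exceeded.
if load_factor(buckets) > 0.75:
    buckets = rehash(buckets, 2 * len(buckets))

print(load_factor(buckets))  # 0.5
```

Every entry has to be rehashed rather than copied, since an index computed for the old size is generally wrong for the new one.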

Applications of Hash Tables #

Database Indexing: For quick data retrieval.
Caches: To store and retrieve data quickly.
Symbol Tables: In compilers and interpreters.
Associative Arrays: Implementation in many programming languages.
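Cryptography: Password storage relies on cryptographic hash functions, which are related to, but distinct from, the hash functions used inside hash tables.
Blockchain: Blocks are linked by cryptographic hashes, again a related use of hashing rather than a hash table as such.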

Advantages and Disadvantages #

Advantages: #

Fast lookups: Average time complexity of O(1) for search, insert, and delete.
Flexible keys: Any object can serve as a key, as long as it can be hashed and compared for equality consistently.

Disadvantages: #

Unordered: Unlike arrays or linked lists, a basic hash table does not keep its entries in a meaningful order; iteration order depends on the hash function and the table size.
Collisions: Need to handle collisions, which can degrade performance.
Space overhead: Keeping the load factor low leaves some buckets empty, so a hash table may require more memory than an array holding the same entries.

Exercise #

Implement a hash table using open addressing with linear probing for collision resolution.
Create a function to find the first non-repeating character in a string using a hash table.
Implement a method to resize the hash table when the load factor exceeds 0.75.

Summary #

Today, we explored Hash Tables, a powerful data structure that provides efficient key-value pair storage and retrieval. We discussed the core concepts of hash functions and collision resolution techniques, and implemented a basic hash table using chaining.

Understanding hash tables is crucial for solving a wide range of problems efficiently, especially those involving quick lookups or avoiding duplicate values. As we progress through this challenge, you’ll find hash tables being used in various algorithms and applications.

Tomorrow, we’ll dive into string algorithms, exploring techniques for pattern matching and string manipulation. Stay tuned!
