Aleph-w 3.0
A C++ Library for Data Structures and Algorithms
hash_tables_example.C File Reference

Hash table implementations in Aleph-w: A comprehensive guide. More...

#include <iostream>
#include <string>
#include <chrono>
#include <random>
#include <iomanip>
#include <tpl_dynSetHash.H>
#include <tpl_odhash.H>

Go to the source code of this file.

Functions

static void print_usage (const char *prog)
 
void demo_dynset_lhash ()
 
void demo_dynset_linhash ()
 
void demo_odhash ()
 
void demo_performance ()
 
int main (int argc, char *argv[])
 

Detailed Description

Hash table implementations in Aleph-w: A comprehensive guide.

This example demonstrates the various hash table implementations available in Aleph-w, each optimized for different use cases. Hash tables provide O(1) average-case operations for insert, search, and delete, making them essential for high-performance applications.

What is a Hash Table?

A hash table is a data structure that maps keys to values using a hash function. It provides:

  • Fast lookup: O(1) average case
  • Fast insertion: O(1) average case
  • Fast deletion: O(1) average case

Trade-off: No ordering guarantee, and the worst case can degrade to O(n)
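
For a concrete feel of these operations, here is a minimal sketch using the standard library's std::unordered_set (the Aleph-w containers that this example actually demonstrates are described below):

#include <iostream>
#include <unordered_set>

int main()
{
  std::unordered_set<int> table;

  table.insert(42);          // O(1) average insertion
  table.insert(7);

  if (table.count(42) > 0)   // O(1) average lookup
    std::cout << "42 is present\n";

  table.erase(7);            // O(1) average deletion

  // Iteration order is unspecified: hash tables give no ordering guarantee
  for (int k : table)
    std::cout << k << '\n';

  return 0;
}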

Hash Table Implementations

Separate Chaining (DynSetLhash)

Strategy: Each bucket contains a linked list of elements

How it works:

  • Hash function maps key to bucket index
  • Collisions handled by chaining (linked list in bucket)
  • Elements with same hash stored in same list

Characteristics:

  • Pros:
    • Graceful degradation (performance degrades gradually)
    • Simple deletion (just remove from list)
    • No clustering issues
  • Cons:
    • Pointer overhead (one pointer per element)
    • Cache misses (non-contiguous memory)
  • Best for: General purpose, frequent insertions/deletions
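
The chaining mechanism itself can be sketched in a few lines. The ChainedSet type below is a hypothetical illustration of the idea, not the actual DynSetLhash implementation:

#include <cstddef>
#include <forward_list>
#include <functional>
#include <vector>

// Minimal separate-chaining set: each bucket holds a singly linked list.
struct ChainedSet
{
  std::vector<std::forward_list<int>> buckets;

  explicit ChainedSet(std::size_t m) : buckets(m) {}

  std::size_t bucket_of(int key) const
  {
    return std::hash<int>{}(key) % buckets.size();  // hash maps key to bucket index
  }

  void insert(int key)
  {
    auto & chain = buckets[bucket_of(key)];
    for (int k : chain)
      if (k == key)
        return;                 // already present
    chain.push_front(key);      // a collision simply extends the chain
  }

  bool contains(int key) const
  {
    for (int k : buckets[bucket_of(key)])
      if (k == key)
        return true;
    return false;
  }

  void remove(int key)
  {
    buckets[bucket_of(key)].remove(key);  // deletion is just a list removal
  }
};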

Linear Hashing (DynSetLinHash)

Strategy: Incremental hash table growth (linear hashing)

How it works:

  • Table grows incrementally (not all-at-once)
  • Uses multiple hash functions as table grows
  • Predictable performance characteristics

Characteristics:

  • Pros:
    • Predictable performance (no sudden slowdowns)
    • Incremental growth (no large rehash operations)
    • Good for real-time systems
  • Cons:
    • More complex implementation
    • Slightly higher overhead
  • Best for: Real-time systems, growing datasets, predictable performance
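
The addressing rule behind linear hashing (Litwin's scheme) can be sketched as follows. LinearHashAddress is a hypothetical illustration of the textbook algorithm and does not claim to reflect DynSetLinHash's exact internals:

#include <cstddef>
#include <functional>

// Bucket addressing in linear hashing: the table grows one bucket at a time,
// and a split pointer p records how far the current doubling round has gone.
struct LinearHashAddress
{
  std::size_t n0    = 8;   // number of buckets at the start of round 0
  std::size_t level = 0;   // completed doubling rounds
  std::size_t p     = 0;   // next bucket to split in the current round

  std::size_t bucket_of(int key) const
  {
    const std::size_t h = std::hash<int>{}(key);
    std::size_t addr = h % (n0 << level);       // address with the "old" hash function
    if (addr < p)                                // bucket already split this round?
      addr = h % (n0 << (level + 1));            // then re-address with the "new" one
    return addr;
  }

  // Called after bucket p is split: when every bucket of the round has been
  // split, the table has doubled and the next round begins.
  void advance_split()
  {
    if (++p == (n0 << level))
      {
        p = 0;
        ++level;
      }
  }
};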

Open Addressing - Double Hashing (ODhashTable)

Strategy: Store elements directly in array, use double hashing for collision resolution

How it works:

  • Elements stored directly in bucket array
  • On collision, use second hash function: h2(key)
  • Probe sequence: (h1(key) + i × h2(key)) mod m

Characteristics:

  • Pros:
    • Cache-friendly (contiguous memory)
    • No pointer overhead
    • Better memory efficiency
  • Cons:
    • Clustering can occur
    • Deletion is complex (need tombstone markers)
    • Fixed size (or expensive resizing)
  • Best for: Fixed size, high-performance requirements, memory efficiency
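
The probe sequence translates directly into code. The probe function below is a hypothetical sketch of the double-hashing rule; h2 is derived from the same hash value but forced into [1, m-1] so it is never zero:

#include <cstddef>
#include <functional>

// i-th slot probed for a key in a table of m buckets (m >= 2, ideally prime):
// slot_i = (h1(key) + i * h2(key)) mod m
std::size_t probe(int key, std::size_t i, std::size_t m)
{
  const std::size_t h  = std::hash<int>{}(key);
  const std::size_t h1 = h % m;
  const std::size_t h2 = 1 + (h / m) % (m - 1);  // second hash, never zero
  return (h1 + i * h2) % m;
}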

Open Addressing - Linear Probing (OLhashTable)

Strategy: Store elements in array, use linear probing for collisions

How it works:

  • On collision, check next bucket: (h(key) + i) mod m
  • Simple probing sequence

Characteristics:

  • Pros: Simple, cache-friendly
  • Cons: Primary clustering (performance degradation)
  • Best for: Simple use cases where a good hash function is available
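
A minimal sketch of linear probing is shown below; LinearProbeSet is a hypothetical illustration of the strategy, not the OLhashTable implementation. Note how a search stops at the first empty slot:

#include <cstddef>
#include <functional>
#include <optional>
#include <vector>

// Minimal linear-probing set: on a collision, try the next slot, wrapping around.
struct LinearProbeSet
{
  std::vector<std::optional<int>> slots;

  explicit LinearProbeSet(std::size_t m) : slots(m) {}

  bool insert(int key)
  {
    const std::size_t h = std::hash<int>{}(key) % slots.size();
    for (std::size_t i = 0; i < slots.size(); ++i)
      {
        const std::size_t s = (h + i) % slots.size();  // probe: (h(key) + i) mod m
        if (!slots[s])
          {
            slots[s] = key;
            return true;
          }
        if (*slots[s] == key)
          return false;                                // already present
      }
    return false;                                      // table is full
  }

  bool contains(int key) const
  {
    const std::size_t h = std::hash<int>{}(key) % slots.size();
    for (std::size_t i = 0; i < slots.size(); ++i)
      {
        const std::size_t s = (h + i) % slots.size();
        if (!slots[s])
          return false;          // an empty slot ends the probe chain
        if (*slots[s] == key)
          return true;
      }
    return false;
  }
};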

Complexity Analysis

Operation | Average Case | Worst Case | Notes
Insert    | O(1)         | O(n)       | Depends on load factor
Search    | O(1)         | O(n)       | Hash collisions
Delete    | O(1)         | O(n)       | Varies by implementation

Load factor (α = n/m): Ratio of elements to buckets

  • Optimal: α ≈ 0.7-0.8
  • Too high: Performance degrades
  • Too low: Memory waste
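
In code, a growth policy is just a comparison against the load factor; the 0.75 threshold below is an illustrative value in that range, not a constant taken from Aleph-w:

#include <cstddef>

// alpha = n / m; grow (rehash or, in linear hashing, split) once it gets too high
bool should_grow(std::size_t n_elements, std::size_t n_buckets)
{
  const double alpha = static_cast<double>(n_elements) / n_buckets;
  return alpha > 0.75;
}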

Collision Resolution Strategies

Strategy          | Method               | Pros                | Cons
Separate Chaining | Linked lists         | Simple deletion     | Pointer overhead
Linear Probing    | Check next slot      | Cache-friendly      | Clustering
Double Hashing    | Second hash function | Less clustering     | More computation
Quadratic Probing | Quadratic sequence   | Moderate clustering | Complex

When to Use Hash Tables

Good for:

  • Fast lookups needed
  • Order doesn't matter
  • Keys have good hash functions
  • Large datasets

Not good for:

  • Ordered iteration needed (use tree)
  • Range queries needed (use tree)
  • Worst-case guarantees needed (use tree)
  • Keys don't hash well

Comparison with Trees

Feature           | Hash Table | Balanced Tree
Average lookup    | O(1)       | O(log n)
Worst-case lookup | O(n)       | O(log n)
Ordered iteration | No         | Yes
Range queries     | No         | Yes
Memory overhead   | Lower      | Higher

Hash Function Quality

A good hash function should:

  • Distribute uniformly: Minimize collisions
  • Be fast: Hash computation should be O(1)
  • Be deterministic: Same key → same hash

Poor hash functions lead to:

  • Many collisions
  • Degraded performance
  • Worst-case O(n) behavior
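
As an illustration, FNV-1a is a short, fast, deterministic string hash with good distribution, while hashing only the first character is a classic bad choice. Both functions are generic sketches, unrelated to the hash functions Aleph-w itself uses:

#include <cstdint>
#include <string>

// FNV-1a: deterministic, O(length), and spreads similar keys across buckets.
std::uint64_t fnv1a(const std::string & s)
{
  std::uint64_t h = 14695981039346656037ull;  // FNV offset basis
  for (unsigned char c : s)
    {
      h ^= c;
      h *= 1099511628211ull;                  // FNV prime
    }
  return h;
}

// Poor hash: every string sharing a first letter collides, so one bucket's
// chain (or probe run) degenerates into an O(n) linear search.
std::uint64_t bad_hash(const std::string & s)
{
  return s.empty() ? 0 : static_cast<unsigned char>(s.front());
}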

Usage Examples

# Run all hash table demonstrations
./hash_tables_example
# Compare specific implementations
./hash_tables_example -s chaining
./hash_tables_example -s linear
./hash_tables_example -s open
./hash_tables_example -s performance
# Show help
./hash_tables_example --help
See also
tpl_dynSetLhash.H - Separate chaining hash set
tpl_dynSetLinHash.H - Linear hashing implementation
tpl_dynMapOhash.H - Open addressing hash map
dynmap_example.C - Tree-based maps (alternative)
Author
Leandro Rabindranath León

Definition in file hash_tables_example.C.

Function Documentation

◆ demo_dynset_lhash()

◆ demo_dynset_linhash()

◆ demo_odhash()

◆ demo_performance()

◆ main()

int main(int argc, char *argv[])

◆ print_usage()

static void print_usage(const char *prog)

Definition at line 219 of file hash_tables_example.C.

References Aleph::maps().

Referenced by main().