Comprehensive benchmark comparing all set data structures in Aleph-w. More...

#include <iostream>
#include <iomanip>
#include <chrono>
#include <vector>
#include <algorithm>
#include <random>
#include <string>
#include <functional>
#include <cmath>
#include <set>
#include <unordered_set>
#include <tpl_dynSetTree.H>
#include <tpl_dynSkipList.H>
#include <tpl_dynSetHash.H>
#include <tpl_odhash.H>
#include <tpl_olhash.H>

Include dependency graph for set_structures_benchmark.cc:

Classes
struct	BenchmarkResult

Typedefs
using	AvlSet = DynSetTree< int, Avl_Tree >

using	RbSet = DynSetTree< int, Rb_Tree >

using	SplaySet = DynSetTree< int, Splay_Tree >

using	TreapSet = DynSetTree< int, Treap >

using	RandSet = DynSetTree< int, Rand_Tree >

using	AvlRkSet = DynSetTree< int, Avl_Tree_Rk >

using	TreapRkSet = DynSetTree< int, Treap_Rk >

using	SkipSet = DynSkipList< int >

using	LhashSet = DynSetLhash< int >

using	LinHashSet = DynSetLinHash< int >

using	ODHashSet = SetODhash< int >

using	OLHashSet = SetOLhash< int >

Functions
template<typename Func >
double	measure_ms (Func &&f)

template<typename Set >
BenchmarkResult	benchmark_tree_set (const string &name, const string &category, const vector< int > &data)

template<typename Set >
BenchmarkResult	benchmark_hash_set (const string &name, const string &category, const vector< int > &data)

BenchmarkResult	benchmark_skip_list (const string &name, const string &category, const vector< int > &data)

BenchmarkResult	benchmark_std_set (const string &name, const string &category, const vector< int > &data)

BenchmarkResult	benchmark_std_unordered_set (const string &name, const string &category, const vector< int > &data)

vector< int >	generate_random_data (size_t n, unsigned seed)

vector< int >	generate_sequential_data (size_t n)

void	print_results_table (const vector< BenchmarkResult > &results, size_t n)

void	print_operations_per_second (const vector< BenchmarkResult > &results, size_t n)

void	run_full_benchmark (size_t n, unsigned seed, bool include_ranked)

void	run_sequential_test (size_t n)

int	main (int argc, char *argv[])

Detailed Description

Comprehensive benchmark comparing all set data structures in Aleph-w.

This benchmark compares performance of different set implementations available in Aleph-w. Understanding the performance characteristics of each structure helps choose the right one for your use case.

Why Benchmark?

Different set structures have different performance characteristics:

Time complexity: Varies by operation and structure
Space complexity: Memory usage differs
Cache performance: Some structures are more cache-friendly
Real-world performance: Theoretical complexity doesn't tell the whole story

Tree-Based Sets (O(log n) operations)

AVL Tree

Balance: Strictly balanced (height difference ≤ 1)
Operations: O(log n) worst case
Best for: Read-heavy workloads, predictable performance
Trade-off: More rotations than Red-Black

Red-Black Tree

Balance: Relaxed balance (height ≤ 2 log(n+1))
Operations: O(log n) worst case
Best for: General-purpose, good all-around choice
Trade-off: Fewer rotations than AVL

Splay Tree

Balance: Self-adjusting (no explicit balance)
Operations: O(log n) amortized
Best for: Temporal locality, caching patterns
Trade-off: Worst case O(n), but adapts to access patterns

Treap

Balance: Randomized BST using heap priorities
Operations: O(log n) expected
Best for: Simple implementation, good average performance
Trade-off: Randomized, less predictable

Rand Tree

Balance: Another randomized approach
Operations: O(log n) expected
Best for: Alternative randomized structure

Skip Lists (Expected O(log n) operations)

DynSkipList

Structure: Probabilistic linked structure with multiple levels
Operations: O(log n) expected
Best for: Simplicity, concurrent extensions possible
Trade-off: Probabilistic, not deterministic

Hash Tables (Expected O(1) operations)

DynSetLhash (Separate Chaining)

Collision resolution: Linked lists per bucket
Operations: O(1) average, O(n) worst case
Best for: High load factors, many insertions/deletions
Trade-off: Pointer overhead, cache misses

DynSetLinHash (Linear Hashing)

Collision resolution: Linear probing with incremental growth
Operations: O(1) average
Best for: Dynamic workloads with varying sizes
Trade-off: More complex, predictable performance

SetODhash (Open Addressing - Double Hashing)

Collision resolution: Double hashing
Operations: O(1) average
Best for: Memory efficiency, cache-friendly access
Trade-off: Fixed size or expensive resizing

SetOLhash (Open Addressing - Linear Probing)

Collision resolution: Linear probing
Operations: O(1) average
Best for: Simple, cache-friendly
Trade-off: Clustering can degrade performance

Performance Comparison

Structure	Insert	Search	Delete	Ordered?	Best For
AVL Tree	O(log n)	O(log n)	O(log n)	Yes	Read-heavy
Red-Black	O(log n)	O(log n)	O(log n)	Yes	General
Splay	O(log n)*	O(log n)*	O(log n)*	Yes	Temporal locality
Treap	O(log n)**	O(log n)**	O(log n)**	Yes	Simple
Skip List	O(log n)**	O(log n)**	O(log n)**	Yes	Simplicity
Hash Tables	O(1)**	O(1)**	O(1)**	No	Maximum speed

When to Use What?

Structure	Best For	Avoid When
AVL/RB Tree	Ordered traversal, range queries, predictable O(log n)	Order not needed
Splay Tree	Temporal locality, caching patterns	Worst-case matters
Skip List	Simplicity, concurrent extensions	Deterministic needed
Hash Tables	Maximum speed, order doesn't matter	Ordered iteration needed
DynSetLhash	High load factors, many insertions/deletions	Memory critical
ODhash/OLhash	Memory efficiency, cache-friendly	Dynamic resizing needed
DynSetLinHash	Dynamic workloads, varying sizes	Fixed size preferred

Benchmark Metrics

This benchmark measures:

Insertion time: Time to insert elements
Search time: Time to search for elements
Deletion time: Time to delete elements
Memory usage: Space overhead
Cache performance: Cache misses/hits

Usage

# Run benchmark
./set_structures_benchmark
 
# Configure benchmark size and seed
./set_structures_benchmark --count 200000 --seed 123
 
# Include ranked tree variants
./set_structures_benchmark --ranked
 
# Also run sequential insertion test (limited internally)
./set_structures_benchmark --sequential
 
# Show help
./set_structures_benchmark --help