Aleph-w 3.0
A C++ Library for Data Structures and Algorithms
Loading...
Searching...
No Matches
set_structures_benchmark.cc File Reference

Comprehensive benchmark comparing all set data structures in Aleph-w. More...

#include <iostream>
#include <iomanip>
#include <chrono>
#include <vector>
#include <algorithm>
#include <random>
#include <string>
#include <functional>
#include <cmath>
#include <set>
#include <unordered_set>
#include <tpl_dynSetTree.H>
#include <tpl_dynSkipList.H>
#include <tpl_dynSetHash.H>
#include <tpl_odhash.H>
#include <tpl_olhash.H>
Include dependency graph for set_structures_benchmark.cc:

Go to the source code of this file.

Classes

struct  BenchmarkResult
 

Typedefs

using AvlSet = DynSetTree< int, Avl_Tree >
 
using RbSet = DynSetTree< int, Rb_Tree >
 
using SplaySet = DynSetTree< int, Splay_Tree >
 
using TreapSet = DynSetTree< int, Treap >
 
using RandSet = DynSetTree< int, Rand_Tree >
 
using AvlRkSet = DynSetTree< int, Avl_Tree_Rk >
 
using TreapRkSet = DynSetTree< int, Treap_Rk >
 
using SkipSet = DynSkipList< int >
 
using LhashSet = DynSetLhash< int >
 
using LinHashSet = DynSetLinHash< int >
 
using ODHashSet = SetODhash< int >
 
using OLHashSet = SetOLhash< int >
 

Functions

template<typename Func >
double measure_ms (Func &&f)
 
template<typename Set >
BenchmarkResult benchmark_tree_set (const string &name, const string &category, const vector< int > &data)
 
template<typename Set >
BenchmarkResult benchmark_hash_set (const string &name, const string &category, const vector< int > &data)
 
BenchmarkResult benchmark_skip_list (const string &name, const string &category, const vector< int > &data)
 
BenchmarkResult benchmark_std_set (const string &name, const string &category, const vector< int > &data)
 
BenchmarkResult benchmark_std_unordered_set (const string &name, const string &category, const vector< int > &data)
 
vector< int > generate_random_data (size_t n, unsigned seed)
 
vector< int > generate_sequential_data (size_t n)
 
void print_results_table (const vector< BenchmarkResult > &results, size_t n)
 
void print_operations_per_second (const vector< BenchmarkResult > &results, size_t n)
 
void run_full_benchmark (size_t n, unsigned seed, bool include_ranked)
 
void run_sequential_test (size_t n)
 
int main (int argc, char *argv[])
 

Detailed Description

Comprehensive benchmark comparing all set data structures in Aleph-w.

This benchmark compares performance of different set implementations available in Aleph-w. Understanding the performance characteristics of each structure helps choose the right one for your use case.

Why Benchmark?

Different set structures have different performance characteristics:

  • Time complexity: Varies by operation and structure
  • Space complexity: Memory usage differs
  • Cache performance: Some structures are more cache-friendly
  • Real-world performance: Theoretical complexity doesn't tell the whole story

Tree-Based Sets (O(log n) operations)

AVL Tree

  • Balance: Strictly balanced (height difference ≤ 1)
  • Operations: O(log n) worst case
  • Best for: Read-heavy workloads, predictable performance
  • Trade-off: More rotations than Red-Black

Red-Black Tree

  • Balance: Relaxed balance (height ≤ 2 log(n+1))
  • Operations: O(log n) worst case
  • Best for: General-purpose, good all-around choice
  • Trade-off: Fewer rotations than AVL

Splay Tree

  • Balance: Self-adjusting (no explicit balance)
  • Operations: O(log n) amortized
  • Best for: Temporal locality, caching patterns
  • Trade-off: Worst case O(n), but adapts to access patterns

Treap

  • Balance: Randomized BST using heap priorities
  • Operations: O(log n) expected
  • Best for: Simple implementation, good average performance
  • Trade-off: Randomized, less predictable

Rand Tree

  • Balance: Another randomized approach
  • Operations: O(log n) expected
  • Best for: Alternative randomized structure

Skip Lists (Expected O(log n) operations)

DynSkipList

  • Structure: Probabilistic linked structure with multiple levels
  • Operations: O(log n) expected
  • Best for: Simplicity, concurrent extensions possible
  • Trade-off: Probabilistic, not deterministic

Hash Tables (Expected O(1) operations)

DynSetLhash (Separate Chaining)

  • Collision resolution: Linked lists per bucket
  • Operations: O(1) average, O(n) worst case
  • Best for: High load factors, many insertions/deletions
  • Trade-off: Pointer overhead, cache misses

DynSetLinHash (Linear Hashing)

  • Collision resolution: Linear probing with incremental growth
  • Operations: O(1) average
  • Best for: Dynamic workloads with varying sizes
  • Trade-off: More complex, predictable performance

SetODhash (Open Addressing - Double Hashing)

  • Collision resolution: Double hashing
  • Operations: O(1) average
  • Best for: Memory efficiency, cache-friendly access
  • Trade-off: Fixed size or expensive resizing

SetOLhash (Open Addressing - Linear Probing)

  • Collision resolution: Linear probing
  • Operations: O(1) average
  • Best for: Simple, cache-friendly
  • Trade-off: Clustering can degrade performance

Performance Comparison

Structure Insert Search Delete Ordered? Best For
AVL Tree O(log n) O(log n) O(log n) Yes Read-heavy
Red-Black O(log n) O(log n) O(log n) Yes General
Splay O(log n)* O(log n)* O(log n)* Yes Temporal locality
Treap O(log n)** O(log n)** O(log n)** Yes Simple
Skip List O(log n)** O(log n)** O(log n)** Yes Simplicity
Hash Tables O(1)** O(1)** O(1)** No Maximum speed

When to Use What?

Structure Best For Avoid When
AVL/RB Tree Ordered traversal, range queries, predictable O(log n) Order not needed
Splay Tree Temporal locality, caching patterns Worst-case matters
Skip List Simplicity, concurrent extensions Deterministic needed
Hash Tables Maximum speed, order doesn't matter Ordered iteration needed
DynSetLhash High load factors, many insertions/deletions Memory critical
ODhash/OLhash Memory efficiency, cache-friendly Dynamic resizing needed
DynSetLinHash Dynamic workloads, varying sizes Fixed size preferred

Benchmark Metrics

This benchmark measures:

  • Insertion time: Time to insert elements
  • Search time: Time to search for elements
  • Deletion time: Time to delete elements
  • Memory usage: Space overhead
  • Cache performance: Cache misses/hits

Usage

# Run benchmark
./set_structures_benchmark
# Configure benchmark size and seed
./set_structures_benchmark --count 200000 --seed 123
# Include ranked tree variants
./set_structures_benchmark --ranked
# Also run sequential insertion test (limited internally)
./set_structures_benchmark --sequential
# Show help
./set_structures_benchmark --help
See also
hash_tables_example.C Hash table implementations
dynmap_example.C Tree-based maps (similar to sets)
skiplist_example.C Skip list details
Author
Leandro Rabindranath León

Definition in file set_structures_benchmark.cc.

Typedef Documentation

◆ AvlRkSet

Definition at line 210 of file set_structures_benchmark.cc.

◆ AvlSet

using AvlSet = DynSetTree<int, Avl_Tree>

Definition at line 203 of file set_structures_benchmark.cc.

◆ LhashSet

using LhashSet = DynSetLhash<int>

Definition at line 217 of file set_structures_benchmark.cc.

◆ LinHashSet

using LinHashSet = DynSetLinHash<int>

Definition at line 218 of file set_structures_benchmark.cc.

◆ ODHashSet

using ODHashSet = SetODhash<int>

Definition at line 219 of file set_structures_benchmark.cc.

◆ OLHashSet

using OLHashSet = SetOLhash<int>

Definition at line 220 of file set_structures_benchmark.cc.

◆ RandSet

using RandSet = DynSetTree<int, Rand_Tree>

Definition at line 207 of file set_structures_benchmark.cc.

◆ RbSet

using RbSet = DynSetTree<int, Rb_Tree>

Definition at line 204 of file set_structures_benchmark.cc.

◆ SkipSet

using SkipSet = DynSkipList<int>

Definition at line 214 of file set_structures_benchmark.cc.

◆ SplaySet

Definition at line 205 of file set_structures_benchmark.cc.

◆ TreapRkSet

Definition at line 211 of file set_structures_benchmark.cc.

◆ TreapSet

using TreapSet = DynSetTree<int, Treap>

Definition at line 206 of file set_structures_benchmark.cc.

Function Documentation

◆ benchmark_hash_set()

template<typename Set >
BenchmarkResult benchmark_hash_set ( const string &  name,
const string &  category,
const vector< int > &  data 
)

◆ benchmark_skip_list()

◆ benchmark_std_set()

BenchmarkResult benchmark_std_set ( const string &  name,
const string &  category,
const vector< int > &  data 
)

◆ benchmark_std_unordered_set()

BenchmarkResult benchmark_std_unordered_set ( const string &  name,
const string &  category,
const vector< int > &  data 
)

◆ benchmark_tree_set()

template<typename Set >
BenchmarkResult benchmark_tree_set ( const string &  name,
const string &  category,
const vector< int > &  data 
)

◆ generate_random_data()

vector< int > generate_random_data ( size_t  n,
unsigned  seed 
)

◆ generate_sequential_data()

vector< int > generate_sequential_data ( size_t  n)

Definition at line 463 of file set_structures_benchmark.cc.

◆ main()

int main ( int  argc,
char *  argv[] 
)

Definition at line 678 of file set_structures_benchmark.cc.

◆ measure_ms()

template<typename Func >
double measure_ms ( Func &&  f)

◆ print_operations_per_second()

void print_operations_per_second ( const vector< BenchmarkResult > &  results,
size_t  n 
)

Definition at line 520 of file set_structures_benchmark.cc.

References Aleph::maps().

Referenced by run_full_benchmark().

◆ print_results_table()

void print_results_table ( const vector< BenchmarkResult > &  results,
size_t  n 
)

Definition at line 476 of file set_structures_benchmark.cc.

References Aleph::DynList< T >::empty(), and Aleph::maps().

Referenced by run_full_benchmark().

◆ run_full_benchmark()

void run_full_benchmark ( size_t  n,
unsigned  seed,
bool  include_ranked 
)

◆ run_sequential_test()

void run_sequential_test ( size_t  n)

Definition at line 641 of file set_structures_benchmark.cc.