Aleph-w 3.0
A C++ Library for Data Structures and Algorithms
Loading...
Searching...
No Matches
cuckoo-filter.H File Reference

Probabilistic set membership with Cuckoo filters. More...

#include <cassert>
#include <cstddef>
#include <cmath>
#include <optional>
#include <random>
#include <algorithm>
#include <hash-fct.H>
#include <ah-errors.H>
#include <ahUtils.H>
#include <tpl_array.H>
Include dependency graph for cuckoo-filter.H:
This graph shows which files directly or indirectly include this file:

Go to the source code of this file.

Classes

class  Aleph::Cuckoo_Filter< T, FingerprintBits, EntriesPerBucket >
 Industrial-grade Cuckoo Filter implementation. More...
 
struct  Aleph::Cuckoo_Filter< T, FingerprintBits, EntriesPerBucket >::Bucket
 
struct  Aleph::Cuckoo_Filter< T, FingerprintBits, EntriesPerBucket >::Victim
 Victim entry: a fingerprint displaced during a failed kick chain. More...
 

Namespaces

namespace  Aleph
 Main namespace for Aleph-w library functions.
 

Detailed Description

Probabilistic set membership with Cuckoo filters.

A Cuckoo filter is a space-efficient probabilistic data structure that supports adding, deleting, and checking for the presence of items. Compared to Bloom filters, Cuckoo filters support deletion and provide better space efficiency for low false positive rates.

Key Properties

  • Supports Deletion: Unlike standard Bloom filters, items can be removed.
  • High Space Efficiency: Better than Bloom filters for FP rates < 3%.
  • Lookup Performance: Constant time O(1) worst-case (2 bucket reads).
  • Insertion Performance: Amortized O(1).

How It Works

The filter consists of an array of buckets, where each bucket can hold multiple fingerprints (partial hashes).

  1. An item x has two candidate buckets: h1(x) and h2(x).
  2. h2(x) = h1(x) ⊕ hash(fingerprint(x)). This property allows finding the other candidate bucket without knowing the original item.
  3. On insertion, if both buckets are full, an existing fingerprint is "kicked out" to its alternative bucket, possibly triggering a chain of displacements (like a cuckoo bird).
Template Parameters
TItem type.
FingerprintBitsNumber of bits per fingerprint (e.g., 8, 12, 16).
EntriesPerBucketAssociativity (number of slots per bucket).
False Positive Rate
The theoretical FP rate is approximately 2 * EntriesPerBucket / 2^FingerprintBits. With the defaults (8-bit fingerprints, 4 entries per bucket) this is ~3.1 %.
Note
Inserting the same item twice stores two fingerprints. After a single remove() the filter will still report the item as present. Callers that need set semantics should guard with contains() before inserting.
Author
Leandro Rabindranath León

Definition in file cuckoo-filter.H.