Aleph-w 3.0
A C++ Library for Data Structures and Algorithms
Loading...
Searching...
No Matches
writeSplit.C File Reference

Demonstrates BST split by position operation. More...

#include <iostream>
#include <fstream>
#include <tclap/CmdLine.h>
#include <aleph.H>
#include <tpl_dynArray.H>
#include <tpl_binNodeXt.H>
#include <tpl_binNodeUtils.H>
#include <tpl_binTree.H>
Include dependency graph for writeSplit.C:

Go to the source code of this file.

Typedefs

typedef BinNodeXt< int > Node
 

Functions

void print_key (Node *p, int, int)
 
void print_key1 (Node *p, int, int)
 
void print_key2 (Node *p, int, int)
 
int main (int argc, char *argv[])
 

Variables

ofstream output
 
ofstream output_1
 
ofstream output_2
 

Detailed Description

Demonstrates BST split by position operation.

This program demonstrates the split operation for binary search trees with rank information. It divides a BST at a given position, producing two separate BSTs and generating visualization files. Split is a fundamental operation that enables efficient tree manipulation.

What is Tree Split?

Definition

The split operation divides a BST at position k, creating:

  • Left tree: Contains elements at positions 0 to k-1 (k smallest elements)
  • Right tree: Contains elements at positions k to n-1 (remaining elements)

Key properties:

  • Both resulting trees maintain the BST property
  • All keys in left tree < all keys in right tree
  • Original tree is divided at position k

Visual Example

Original tree (n=7): Split at position 3:
4 Left (0-2): Right (3-6):
/ \ 2 4
2 6 / \ / \
/ \ / \ 1 3 5 6
1 3 5 7 / \
7

Rank Information

Why Rank Information is Needed

Standard BSTs don't support efficient position-based operations:

  • Without rank: Finding k-th element requires O(n) time
  • With rank: Finding k-th element requires O(log n) time

BinNodeXt (Extended Binary Node)

This example uses BinNodeXt which maintains:

  • Subtree size: Number of nodes in subtree rooted at this node
  • Order-statistics helpers: select(root, k) and inorder_position(...)/find_position(...)
  • Maintained: Updated during insert, delete, rotations

Rank Operations

  • **select(root, k)**: Find k-th smallest element in O(h)
  • **inorder_position(root, key, ...)**: Find inorder position of a key in O(h)
  • **find_position(root, key, ...)**: Find exact/insertion position of a key in O(h)
  • Enables: Efficient position-based splitting

Algorithm

Split by Position

Split by position works by:

split_by_position(tree, k):
1. Find node at position k using select(root, k) // O(h)
2. Disconnect tree at that node
3. Rebuild left subtree (positions 0 to k-1)
4. Rebuild right subtree (positions k+1 to n-1)
5. Return (left_tree, node_at_k, right_tree)
Node * select(Node *root, const int &i)
Definition btreepic.C:902
__gmp_expr< T, __gmp_binary_expr< __gmp_expr< T, U >, unsigned long int, __gmp_root_function > > root(const __gmp_expr< T, U > &expr, unsigned long int l)
Definition gmpfrxx.h:4060

Detailed Steps

  1. Find split node: Use rank information to find node at position k
    • Time: O(h) where h is the tree height
  2. Disconnect: Remove node from tree structure
    • Time: O(h) for this ranked BST representation
  3. Rebuild subtrees: Construct left and right trees
    • Time: O(h)

Complexity

Tree Type Complexity Notes
Balanced with rank O(log n) Height is O(log n)
Balanced without rank O(n) Must traverse to find position
Unbalanced O(n) Worst case (height can be linear)

Applications

Order Statistics

  • K-th element: Find k-th smallest/largest element
  • Median: Find median element efficiently
  • Percentiles: Find elements at specific percentiles

Range Queries

  • Extract range: Split tree to get elements in range [a, b]
  • Range operations: Process elements in specific range
  • Partitioning: Divide data into ranges

Tree Operations

  • Building block: Used in many tree algorithms
  • Tree manipulation: Enable complex tree operations
  • Functional programming: Immutable tree operations

Database Indexing

  • Partitioning: Partition index structures
  • Range queries: Efficient range query processing
  • Bulk operations: Process subsets of data

Complementary Operations

Split and Join

  • Split: Divides tree into parts (this operation)
  • Join: Combines trees (see writeJoin.C)
  • Together: Enable powerful tree manipulation

Example: Range Extraction

// Extract range [a, b] from tree
(left, middle, right) = split_3_ways(tree, a, b)
// left: elements < a
// middle: elements in [a, b]
// right: elements > b

Split Variants

Split by Position

  • This example: Split at position k
  • Uses rank: Requires rank information
  • Efficient: O(log n) with rank

Split by Value

  • Alternative: Split at value x
  • No rank needed: Can work without rank
  • Efficient: O(log n) for balanced trees

Three-Way Split

  • Advanced: Split into three parts
  • Range extraction: Extract range [a, b]
  • Useful: For range queries

Output Files

  • **split-before-aux.Tree**: Original tree with SPLIT marker (preorder)
    • Shows tree before split
    • Marks split position
  • **split-1-aux.Tree**: Left subtree - positions 0 to pos-1 (preorder)
    • Contains k smallest elements
    • Maintains BST property
  • **split-2-aux.Tree**: Right subtree - positions pos to n-1 (preorder)
    • Contains remaining elements
    • Maintains BST property

All files can be visualized with btreepic to see the split operation.

Usage

# Split tree with 30 nodes at position 10
writeSplit -n 30 -p 10
# Use specific seed for reproducibility
writeSplit -n 50 -s 42 -p 25
# Split at different positions
writeSplit -n 100 -p 50 # Split at middle
See also
tpl_binNodeXt.H Extended binary node with rank support
tpl_binNodeUtils.H BST utility functions including split
writeJoin.C Complementary join operation
writeRankTree.C Rank operations demonstration
Author
Leandro Rabindranath León

Definition in file writeSplit.C.

Typedef Documentation

◆ Node

typedef BinNodeXt<int> Node

Definition at line 228 of file writeSplit.C.

Function Documentation

◆ main()

◆ print_key()

void print_key ( Node p,
int  ,
int   
)

Definition at line 235 of file writeSplit.C.

References Aleph::BinNodeXt< Key >::get_key(), and output.

Referenced by main().

◆ print_key1()

void print_key1 ( Node p,
int  ,
int   
)

Definition at line 240 of file writeSplit.C.

References output_1.

Referenced by main().

◆ print_key2()

void print_key2 ( Node p,
int  ,
int   
)

Definition at line 245 of file writeSplit.C.

References output_2.

Referenced by main().

Variable Documentation

◆ output

ofstream output

Definition at line 231 of file writeSplit.C.

Referenced by main(), and print_key().

◆ output_1

ofstream output_1

Definition at line 232 of file writeSplit.C.

Referenced by main(), and print_key1().

◆ output_2

ofstream output_2

Definition at line 233 of file writeSplit.C.

Referenced by main(), and print_key2().