Published on

B-Trees - Multi-way Search Trees for External Storage

Authors
  • avatar
    Name
    Balaram Shiwakoti
    Twitter

B-trees were initially the most confusing tree structure for me during DSA preparation. I kept thinking, "Why do we need trees with so many children when binary trees work fine?" But once I understood that B-trees are designed for external storage where disk I/O is expensive, and how they minimize the number of disk accesses, everything clicked. Let me share what I've learned about these crucial data structures.

Introduction to B-Trees

When I first encountered B-trees, I was puzzled by their complexity compared to binary search trees. But then I learned they were specifically designed for systems where data doesn't fit in main memory - like databases and file systems. The "B" in B-tree stands for "balanced," and they're optimized for storage systems that read and write large blocks of data.

B-tree is a self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time. Unlike binary trees, B-trees can have many children per node.

B-Tree Properties

Fundamental Properties

For a B-tree of order m (also called degree):

  1. Every node has at most m children
  2. Every non-leaf node (except root) has at least ⌈m/2⌉ children
  3. Root has at least 2 children if it's not a leaf
  4. All leaves are at the same level
  5. A non-leaf node with k children contains k-1 keys

Key Constraints

For a node with k keys:

  • Keys are stored in sorted order: key₁ ≤ key₂ ≤ ... ≤ keyₖ
  • Number of keys: ⌈m/2⌉ - 1 ≤ k ≤ m - 1 (except root)
  • Root can have 1 to m-1 keys

Example B-tree of order 5:

Minimum keys per node:5/2- 1 = 2
Maximum keys per node: 5 - 1 = 4
Minimum children per non-leaf:5/2= 3
Maximum children per node: 5

I remember drawing countless B-tree diagrams to understand these constraints!

B-Tree Structure

Node Structure

class BTreeNode:
    def __init__(self, is_leaf=False):
        self.keys = []          # List of keys
        self.children = []      # List of child pointers
        self.is_leaf = is_leaf  # True if leaf node
        self.num_keys = 0       # Current number of keys

    def is_full(self, max_keys):
        return self.num_keys == max_keys

class BTree:
    def __init__(self, degree):
        self.root = BTreeNode(is_leaf=True)
        self.degree = degree    # Maximum degree (order)
        self.max_keys = degree - 1
        self.min_keys = (degree + 1) // 2 - 1

Example B-Tree Structure

B-tree of order 5 with keys [1,3,7,10,16,20,30]:

Level 0:           [10]
                  /    \
Level 1:      [3,7]    [16,20]
             /  |  \   /  |   \
Level 2:   [1] [3] [7] [16] [20] [30]

Properties verified:

  • All leaves at same level ✓
  • Internal nodes have 2-4 keys ✓
  • Root has 1 key (allowed) ✓
  • All keys in sorted order ✓

B-Tree Operations

Search Operation

Algorithm:

  1. Start at root
  2. Find the child pointer to follow using key comparisons
  3. Recursively search in appropriate child
  4. Return key if found in current node

Implementation:

def search(self, node, key):
    """Search for key in B-tree"""
    i = 0

    # Find the first key greater than or equal to key
    while i < node.num_keys and key > node.keys[i]:
        i += 1

    # If key found
    if i < node.num_keys and key == node.keys[i]:
        return node, i

    # If leaf node, key not found
    if node.is_leaf:
        return None

    # Recursively search in appropriate child
    return self.search(node.children[i], key)

def search_wrapper(self, key):
    return self.search(self.root, key)

Time Complexity: O(log n) where n is number of keys

Insertion Operation

Insertion is more complex due to the need to maintain B-tree properties:

Algorithm:

  1. Find the leaf node where key should be inserted
  2. If leaf has space, insert key in sorted order
  3. If leaf is full, split the node and promote middle key
  4. Recursively handle splits up the tree

Implementation:

def insert(self, key):
    """Insert key into B-tree"""
    root = self.root

    # If root is full, create new root
    if root.is_full(self.max_keys):
        new_root = BTreeNode()
        self.root = new_root
        new_root.children.append(root)
        self.split_child(new_root, 0)
        root = new_root

    self.insert_non_full(root, key)

def insert_non_full(self, node, key):
    """Insert key into non-full node"""
    i = node.num_keys - 1

    if node.is_leaf:
        # Insert key into leaf node
        node.keys.append(0)  # Make space

        # Shift keys to make room for new key
        while i >= 0 and key < node.keys[i]:
            node.keys[i + 1] = node.keys[i]
            i -= 1

        node.keys[i + 1] = key
        node.num_keys += 1
    else:
        # Find child to insert into
        while i >= 0 and key < node.keys[i]:
            i -= 1
        i += 1

        # If child is full, split it
        if node.children[i].is_full(self.max_keys):
            self.split_child(node, i)

            # After split, decide which child to insert into
            if key > node.keys[i]:
                i += 1

        self.insert_non_full(node.children[i], key)

def split_child(self, parent, index):
    """Split full child at given index"""
    full_child = parent.children[index]
    new_child = BTreeNode(is_leaf=full_child.is_leaf)

    mid_index = self.max_keys // 2

    # Move half the keys to new child
    new_child.keys = full_child.keys[mid_index + 1:]
    new_child.num_keys = len(new_child.keys)

    # Move half the children to new child (if not leaf)
    if not full_child.is_leaf:
        new_child.children = full_child.children[mid_index + 1:]

    # Reduce keys in original child
    full_child.keys = full_child.keys[:mid_index]
    full_child.num_keys = len(full_child.keys)

    if not full_child.is_leaf:
        full_child.children = full_child.children[:mid_index + 1]

    # Insert new child into parent
    parent.children.insert(index + 1, new_child)

    # Move middle key up to parent
    parent.keys.insert(index, full_child.keys[mid_index])
    parent.num_keys += 1

Insertion Example Walkthrough

Let's trace inserting keys [10, 20, 5, 6, 12, 30, 7, 17] into a B-tree of order 3:

Insert 10, 20:

[10, 20]  (leaf node, has space)

Insert 5:

[5, 10, 20]  (leaf node full - max 2 keys for order 3)

Insert 6 (causes split):

Before split: [5, 10, 20] - full!
After split:     [10]
               /      \
            [5]        [20]

Insert 6:        [10]
               /      \
            [5, 6]     [20]

Insert 12:

        [10]
      /      \
   [5, 6]    [12, 20]

Insert 30:

        [10]
      /      \
   [5, 6]    [12, 20, 30]  (full!)

Insert 7 (causes split in left child):

Before: [5, 6] becomes [5, 6, 7] (full!)
After split:
        [6, 10]
      /   |    \
    [5]  [7]   [12, 20, 30]

This step-by-step visualization really helped me understand the splitting process!

Deletion Operation

Deletion is the most complex B-tree operation:

Cases to handle:

  1. Key in leaf node - simple removal
  2. Key in internal node - replace with predecessor/successor
  3. Node becomes too small - borrow from sibling or merge

Deletion algorithm steps:

  1. Find the key position in the current node
  2. If key found in current node:
    • If it's a leaf node: simply remove the key
    • If it's an internal node: replace with predecessor or successor
  3. If key not in current node:
    • If it's a leaf: key doesn't exist
    • Otherwise: ensure child has enough keys before descending
  4. Handle underflow by borrowing from siblings or merging nodes

Three main cases for internal node deletion:

  • Case 1: Left child has extra keys - use predecessor
  • Case 2: Right child has extra keys - use successor
  • Case 3: Both children have minimum keys - merge and recurse

Time and Space Complexity

Time Complexity

For a B-tree with n keys and degree m:

  • Height: O(log_m n)
  • Search: O(log_m n)
  • Insertion: O(log_m n)
  • Deletion: O(log_m n)

Why logarithmic base m?

  • Each level can have up to m children
  • Height is minimized when tree is full
  • Maximum height = log_m(n+1)

Space Complexity

  • Storage: O(n) for n keys
  • Node overhead: Each node stores up to m-1 keys and m pointers

Disk I/O Analysis

Key advantage: Number of disk accesses = O(log_m n)

For large m (e.g., m = 1000):

  • Tree with 1 billion keys has height ≤ 3
  • Maximum 3 disk accesses for any operation!

This was the "aha!" moment for me - understanding why large degree matters for disk-based storage.

B-Tree Variants

B+ Trees

Differences from B-trees:

  • All data stored in leaf nodes
  • Internal nodes only store keys for navigation
  • Leaf nodes linked for sequential access
  • Better for range queries

Structure:

Internal:    [10, 20]
            /    |    \
Leaves:   [5,7] [10,15] [20,25,30]
           |      |       |
          next -> next -> next -> null

B* Trees

Improvements:

  • Nodes kept 2/3 full instead of 1/2 full
  • Better space utilization
  • More complex insertion/deletion

Applications of B-Trees

Database Systems

Primary use case: Database indexes

-- B-tree index on employee_id
CREATE INDEX idx_employee_id ON employees(employee_id);

-- Query uses B-tree for fast lookup
SELECT * FROM employees WHERE employee_id = 12345;

Benefits:

  • Fast equality searches: O(log n)
  • Range queries: O(log n + k) where k = result size
  • Maintains sorted order

File Systems

Examples:

  • NTFS: Uses B+ trees for file allocation
  • ext4: Directory indexing with B-trees
  • ZFS: Metadata organization

Advantages:

  • Efficient directory lookups
  • Fast file allocation
  • Balanced tree structure

Operating Systems

Virtual memory management:

  • Page table organization
  • Memory allocation tracking
  • Process scheduling queues

Implementation Considerations

Node Size Optimization

Calculating optimal degree based on page size:

For efficient disk I/O, node size should match disk page size (typically 4KB):

Formula: degree × key_size + (degree + 1) × pointer_size ≤ page_size

Example calculation:

  • Page size: 4096 bytes
  • Key size: 8 bytes (64-bit integers)
  • Pointer size: 8 bytes (64-bit pointers)
  • Optimal degree: (4096 - 8) ÷ (8 + 8) = 255

This ensures each node fits exactly in one disk page, minimizing I/O operations.

Disk I/O Optimization

Key strategies for efficient disk operations:

  1. Node size matching: Align node size with disk page size (4KB)
  2. Sequential reads: Read entire nodes in single disk operation
  3. Write buffering: Batch writes and flush periodically
  4. Serialization: Efficient conversion between memory and disk format

Disk operations:

  • Read node: Seek to offset, read full page, deserialize data
  • Write node: Serialize data, seek to offset, write page, flush to disk
  • Caching: Keep frequently accessed nodes in memory

Performance Comparison

B-Tree vs Binary Search Tree

AspectB-Tree (degree 100)Binary Search Tree
Height (1M keys)~3~20
Disk accesses320
Cache performanceBetterWorse
Memory usageHigher per nodeLower per node

B-Tree vs Hash Table

OperationB-TreeHash Table
SearchO(log n)O(1) average
Range queryO(log n + k)O(n)
Ordered traversalO(n)O(n log n)
Worst caseO(log n)O(n)

Common Implementation Pitfalls

1. Incorrect Split Logic

Common mistake: Using degree // 2 for mid-index calculation Correct approach: Use (degree - 1) // 2 to ensure proper key distribution Why: Handles both even and odd degree values correctly

2. Forgetting to Update Pointers

Common mistake: Not updating parent pointers after node splits Correct approach: Always maintain parent-child relationships Impact: Broken pointers can cause traversal errors and memory leaks

3. Boundary Condition Errors

Common mistake: Not checking for empty tree before operations Correct approach: Always validate tree state before proceeding Examples: Check if root exists, verify node has keys before accessing

Advanced Topics

Concurrent B-Trees

  • Lock-free algorithms
  • Copy-on-write techniques
  • Optimistic concurrency control

Persistent B-Trees

  • Immutable versions
  • Version control systems
  • Functional programming applications

Distributed B-Trees

  • Sharding strategies
  • Consistency protocols
  • Replication techniques

Conclusion

B-trees are fundamental data structures optimized for external storage systems. Key takeaways:

Core Concepts:

  • Multi-way trees with balanced height
  • Optimized for disk I/O operations
  • Maintain sorted order efficiently

Performance:

  • O(log_m n) operations where m is degree
  • Minimizes expensive disk accesses
  • Better cache locality than binary trees

Applications:

  • Database indexing systems
  • File system organization
  • Operating system components

For loksewa preparation, focus on:

  • Understanding B-tree properties and constraints
  • Implementing basic insertion with splitting
  • Analyzing time complexity in terms of degree
  • Comparing with other tree structures
  • Recognizing when B-trees are appropriate

Remember: B-trees are designed for the reality of computer systems where memory hierarchy matters. The large degree isn't just theoretical - it's a practical optimization for disk-based storage where minimizing I/O operations is crucial for performance.

The key insight is that B-trees trade some complexity for dramatic improvements in real-world performance when dealing with large datasets that don't fit in main memory!