B-Trees - Multi-way Search Trees for External Storage
Author: Balaram Shiwakoti
B-trees were initially the most confusing tree structure for me during DSA preparation. I kept thinking, "Why do we need trees with so many children when binary trees work fine?" But once I understood that B-trees are designed for external storage where disk I/O is expensive, and how they minimize the number of disk accesses, everything clicked. Let me share what I've learned about these crucial data structures.
Introduction to B-Trees
When I first encountered B-trees, I was puzzled by their complexity compared to binary search trees. But then I learned they were specifically designed for systems where data doesn't fit in main memory - like databases and file systems. The "B" is commonly said to stand for "balanced," and B-trees are optimized for storage systems that read and write large blocks of data.
A B-tree is a self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time. Unlike binary trees, a B-tree node can have many children.
B-Tree Properties
Fundamental Properties
For a B-tree of order m (also called degree):
- Every node has at most m children
- Every non-leaf node (except root) has at least ⌈m/2⌉ children
- Root has at least 2 children if it's not a leaf
- All leaves are at the same level
- A non-leaf node with k children contains k-1 keys
Key Constraints
For a node with k keys:
- Keys are stored in sorted order: key₁ ≤ key₂ ≤ ... ≤ keyₖ
- Number of keys: ⌈m/2⌉ - 1 ≤ k ≤ m - 1 (except root)
- Root can have 1 to m-1 keys
Example B-tree of order 5:
Minimum keys per node: ⌈5/2⌉ - 1 = 2
Maximum keys per node: 5 - 1 = 4
Minimum children per non-leaf: ⌈5/2⌉ = 3
Maximum children per node: 5
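These bounds follow directly from the order. Here's a tiny helper to compute them (a sketch; the value 5 just mirrors the example above):
from math import ceil

def order_constraints(m):
    """Return (min_keys, max_keys, min_children, max_children) for a B-tree of order m."""
    min_children = ceil(m / 2)            # applies to non-leaf, non-root nodes
    return min_children - 1, m - 1, min_children, m

print(order_constraints(5))               # (2, 4, 3, 5)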
I remember drawing countless B-tree diagrams to understand these constraints!
B-Tree Structure
Node Structure
class BTreeNode:
    def __init__(self, is_leaf=False):
        self.keys = []             # List of keys
        self.children = []         # List of child pointers
        self.is_leaf = is_leaf     # True if leaf node
        self.num_keys = 0          # Current number of keys

    def is_full(self, max_keys):
        return self.num_keys == max_keys

class BTree:
    def __init__(self, degree):
        self.root = BTreeNode(is_leaf=True)
        self.degree = degree                     # Maximum degree (order)
        self.max_keys = degree - 1               # Most keys a node can hold
        self.min_keys = (degree + 1) // 2 - 1    # Fewest keys for non-root nodes
Example B-Tree Structure
B-tree of order 5 with keys [1,3,7,10,16,20,30]:
Level 0:           [10]
                  /    \
Level 1:   [1, 3, 7]    [16, 20, 30]
Properties verified:
- All leaves at the same level ✓
- Each leaf has 3 keys (within the 2-4 range for order 5) ✓
- Root has 1 key (allowed) ✓
- All keys in sorted order ✓
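These rules can also be checked in code. Below is a small sketch of a structural validator built on the BTreeNode fields defined above (keys, children, is_leaf); it checks node sizes and leaf depth, but not the ordering of keys across subtrees:
def is_valid_btree(node, order, is_root=True):
    """Check the order-m structural constraints for the subtree rooted at node.

    Returns (ok, leaf_depth). Does not verify key ordering across subtrees.
    """
    min_keys = 1 if is_root else (order + 1) // 2 - 1
    if not (min_keys <= len(node.keys) <= order - 1):
        return False, 0
    if node.keys != sorted(node.keys):
        return False, 0
    if node.is_leaf:
        return True, 0
    if len(node.children) != len(node.keys) + 1:
        return False, 0
    depths = set()
    for child in node.children:
        ok, depth = is_valid_btree(child, order, is_root=False)
        if not ok:
            return False, 0
        depths.add(depth)
    if len(depths) != 1:                  # all leaves must sit at the same level
        return False, 0
    return True, depths.pop() + 1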
B-Tree Operations
Search Operation
Algorithm:
- Start at the root
- Compare the target key against the keys in the current node; if it matches, return it
- If the current node is a leaf, the key is not in the tree
- Otherwise, follow the child pointer between the surrounding keys and search recursively
Implementation:
def search(self, node, key):
    """Search for key in B-tree"""
    i = 0
    # Find the first key greater than or equal to key
    while i < node.num_keys and key > node.keys[i]:
        i += 1
    # If key found
    if i < node.num_keys and key == node.keys[i]:
        return node, i
    # If leaf node, key not found
    if node.is_leaf:
        return None
    # Recursively search in appropriate child
    return self.search(node.children[i], key)

def search_wrapper(self, key):
    return self.search(self.root, key)
Time Complexity: O(log n) where n is number of keys
Insertion Operation
Insertion is more complex due to the need to maintain B-tree properties:
Algorithm (top-down, matching the implementation below):
- Walk from the root toward the leaf where the key belongs
- Split any full node met on the way down, promoting its middle key into the parent
- Because a parent is never full when its child splits, splits never cascade back up the tree
- Insert the key into the now non-full leaf in sorted order
Implementation:
def insert(self, key):
    """Insert key into B-tree"""
    root = self.root
    # If root is full, create new root
    if root.is_full(self.max_keys):
        new_root = BTreeNode()
        self.root = new_root
        new_root.children.append(root)
        self.split_child(new_root, 0)
        root = new_root
    self.insert_non_full(root, key)

def insert_non_full(self, node, key):
    """Insert key into non-full node"""
    i = node.num_keys - 1
    if node.is_leaf:
        # Insert key into leaf node
        node.keys.append(0)  # Make space
        # Shift keys to make room for new key
        while i >= 0 and key < node.keys[i]:
            node.keys[i + 1] = node.keys[i]
            i -= 1
        node.keys[i + 1] = key
        node.num_keys += 1
    else:
        # Find child to insert into
        while i >= 0 and key < node.keys[i]:
            i -= 1
        i += 1
        # If child is full, split it
        if node.children[i].is_full(self.max_keys):
            self.split_child(node, i)
            # After split, decide which child to insert into
            if key > node.keys[i]:
                i += 1
        self.insert_non_full(node.children[i], key)

def split_child(self, parent, index):
    """Split full child at given index"""
    full_child = parent.children[index]
    new_child = BTreeNode(is_leaf=full_child.is_leaf)
    mid_index = self.max_keys // 2
    # Save the middle key before the child's key list is truncated
    mid_key = full_child.keys[mid_index]
    # Move the upper half of the keys to the new child
    new_child.keys = full_child.keys[mid_index + 1:]
    new_child.num_keys = len(new_child.keys)
    # Move the upper half of the children to the new child (if not a leaf)
    if not full_child.is_leaf:
        new_child.children = full_child.children[mid_index + 1:]
        full_child.children = full_child.children[:mid_index + 1]
    # Keep the lower half of the keys in the original child
    full_child.keys = full_child.keys[:mid_index]
    full_child.num_keys = len(full_child.keys)
    # Insert new child into parent
    parent.children.insert(index + 1, new_child)
    # Move middle key up to parent
    parent.keys.insert(index, mid_key)
    parent.num_keys += 1
    # Note: this top-down split fills both halves evenly when max_keys is odd
    # (i.e., when the order is even), which is the convention used in the
    # walkthrough below.
Insertion Example Walkthrough
Let's trace inserting keys [10, 20, 5, 6, 12, 30, 7, 17] into a B-tree of order 4 (at most 3 keys per node), using the top-down splitting strategy from the code above:
Insert 10, 20:
[10, 20] (leaf node, has space)
Insert 5:
[5, 10, 20] (leaf node now full - max 3 keys for order 4)
Insert 6 (the root is full, so it is split before descending):
After split:    [10]
               /    \
             [5]    [20]
Insert 6:       [10]
               /    \
          [5, 6]    [20]
Insert 12:
        [10]
       /    \
  [5, 6]    [12, 20]
Insert 30:
        [10]
       /    \
  [5, 6]    [12, 20, 30] (full!)
Insert 7:
        [10]
       /    \
[5, 6, 7]    [12, 20, 30] (both leaves now full)
Insert 17 (the full right child is split on the way down; 20 moves up):
         [10, 20]
        /    |    \
[5, 6, 7] [12, 17] [30]
This step-by-step visualization really helped me understand the splitting process!
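Putting the pieces together, here's a minimal usage sketch that reproduces the walkthrough above (assuming the insert, insert_non_full, split_child, search, and search_wrapper functions shown earlier are defined as methods of the BTree class):
tree = BTree(degree=4)                    # order 4: at most 3 keys per node
for key in [10, 20, 5, 6, 12, 30, 7, 17]:
    tree.insert(key)

print(tree.root.keys)                                   # [10, 20]
print([child.keys for child in tree.root.children])     # [[5, 6, 7], [12, 17], [30]]
print(tree.search_wrapper(12) is not None)              # True: 12 is in the tree
print(tree.search_wrapper(99))                          # None: 99 is not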
Deletion Operation
Deletion is the most complex B-tree operation:
Cases to handle:
- Key in leaf node - simple removal
- Key in internal node - replace with predecessor/successor
- Node becomes too small - borrow from sibling or merge
Deletion algorithm steps:
- Find the key position in the current node
- If key found in current node:
- If it's a leaf node: simply remove the key
- If it's an internal node: replace with predecessor or successor
- If key not in current node:
- If it's a leaf: key doesn't exist
- Otherwise: ensure child has enough keys before descending
- Handle underflow by borrowing from siblings or merging nodes
Three main cases for internal node deletion:
- Case 1: Left child has extra keys - use predecessor
- Case 2: Right child has extra keys - use successor
- Case 3: Both children have minimum keys - merge and recurse
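Cases 1 and 2 come down to finding the in-order predecessor or successor of the key being deleted. Here's a minimal sketch of those two helpers, reusing the BTreeNode structure from earlier (the full delete routine, with borrowing and merging, is omitted):
def get_predecessor(node, index):
    """Largest key in the subtree left of node.keys[index]."""
    current = node.children[index]
    while not current.is_leaf:
        current = current.children[-1]    # keep descending to the right
    return current.keys[-1]

def get_successor(node, index):
    """Smallest key in the subtree right of node.keys[index]."""
    current = node.children[index + 1]
    while not current.is_leaf:
        current = current.children[0]     # keep descending to the left
    return current.keys[0]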
Time and Space Complexity
Time Complexity
For a B-tree with n keys and degree m:
- Height: O(log_m n)
- Search: O(log_m n)
- Insertion: O(log_m n)
- Deletion: O(log_m n)
Why logarithmic base m?
- Each node can have up to m children, so the number of reachable keys grows by roughly a factor of m per level
- Height is smallest when nodes are full: roughly log_m(n+1)
- Even when nodes are only half full, the height stays O(log_⌈m/2⌉ n), which is still logarithmic
Space Complexity
- Storage: O(n) for n keys
- Node overhead: Each node stores up to m-1 keys and m pointers
Disk I/O Analysis
Key advantage: Number of disk accesses = O(log_m n)
For large m (e.g., m = 1000):
- A tree with 1 billion keys is only about 3 levels deep
- So any lookup touches roughly 3 nodes - about 3 disk reads!
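The arithmetic behind that claim, as a quick check:
import math

n = 1_000_000_000      # number of keys
m = 1000               # order: children per node
# With (nearly) full nodes, the number of levels is about log base m of n
print(math.log(n, m))  # ~3.0 -> roughly 3 node reads per lookup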
This was the "aha!" moment for me - understanding why large degree matters for disk-based storage.
B-Tree Variants
B+ Trees
Differences from B-trees:
- All data stored in leaf nodes
- Internal nodes only store keys for navigation
- Leaf nodes linked for sequential access
- Better for range queries
Structure:
Internal: [10, 20]
/ | \
Leaves: [5,7] [10,15] [20,25,30]
| | |
next -> next -> next -> null
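The linked leaves are what make range queries cheap: once the starting leaf is found, the scan just follows sibling pointers. Here's a small sketch with a made-up LeafNode class and range_query helper (a real B+ tree would first descend from the root to the leaf containing the lower bound; this sketch starts from the leftmost leaf for brevity):
class LeafNode:
    """Hypothetical B+ tree leaf: sorted keys plus a pointer to the next leaf."""
    def __init__(self, keys, next_leaf=None):
        self.keys = keys
        self.next = next_leaf

def range_query(leaf, low, high):
    """Collect all keys in [low, high] by walking the linked leaves."""
    result = []
    while leaf is not None:
        for key in leaf.keys:
            if key > high:
                return result             # keys are sorted, so we can stop early
            if key >= low:
                result.append(key)
        leaf = leaf.next                  # follow the sibling pointer
    return result

# The leaves from the diagram above, linked left to right
l3 = LeafNode([20, 25, 30])
l2 = LeafNode([10, 15], l3)
l1 = LeafNode([5, 7], l2)
print(range_query(l1, 7, 25))             # [7, 10, 15, 20, 25]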
B* Trees
Improvements:
- Non-root nodes kept at least 2/3 full instead of 1/2 full
- Better space utilization
- More complex insertion/deletion
Applications of B-Trees
Database Systems
Primary use case: Database indexes
-- B-tree index on employee_id
CREATE INDEX idx_employee_id ON employees(employee_id);
-- Query uses B-tree for fast lookup
SELECT * FROM employees WHERE employee_id = 12345;
Benefits:
- Fast equality searches: O(log n)
- Range queries: O(log n + k) where k = result size
- Maintains sorted order
File Systems
Examples:
- NTFS: B-trees for directory and file-name indexing
- ext4: directory indexing with HTree, a B-tree-like structure
- Btrfs: copy-on-write B-trees for metadata organization
Advantages:
- Efficient directory lookups
- Fast file allocation
- Balanced tree structure
Operating Systems
Memory and resource management:
- Tracking virtual memory regions (Linux's maple tree is a B-tree-derived structure)
- Free-space and allocation tracking
- Indexing large in-kernel key-value mappings
Implementation Considerations
Node Size Optimization
Calculating optimal degree based on page size:
For efficient disk I/O, node size should match disk page size (typically 4KB):
Formula: degree × key_size + (degree + 1) × pointer_size ≤ page_size
Example calculation:
- Page size: 4096 bytes
- Key size: 8 bytes (64-bit integers)
- Pointer size: 8 bytes (64-bit pointers)
- Optimal degree: ⌊(4096 - 8) ÷ (8 + 8)⌋ = 255
This keeps each node within a single disk page, minimizing I/O operations.
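The same calculation in code, as a quick sketch (the sizes are just the example values above):
def max_order(page_size, key_size, pointer_size):
    """Largest order m satisfying m*key_size + (m+1)*pointer_size <= page_size."""
    return (page_size - pointer_size) // (key_size + pointer_size)

print(max_order(4096, 8, 8))   # 255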
Disk I/O Optimization
Key strategies for efficient disk operations:
- Node size matching: Align node size with disk page size (4KB)
- Sequential reads: Read entire nodes in single disk operation
- Write buffering: Batch writes and flush periodically
- Serialization: Efficient conversion between memory and disk format
Disk operations:
- Read node: Seek to offset, read full page, deserialize data
- Write node: Serialize data, seek to offset, write page, flush to disk
- Caching: Keep frequently accessed nodes in memory
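To make the serialization point concrete, here's a rough sketch that packs a leaf node's keys into one 4 KB page with Python's struct module. The on-disk layout (a key count followed by 64-bit keys, zero-padded to a full page) is made up purely for illustration:
import struct

PAGE_SIZE = 4096

def serialize_leaf(keys):
    """Pack the key count and 64-bit keys, then pad to a full page."""
    data = struct.pack(f"<I{len(keys)}q", len(keys), *keys)
    return data + b"\x00" * (PAGE_SIZE - len(data))

def deserialize_leaf(page):
    """Read the key count, then unpack that many 64-bit keys."""
    (count,) = struct.unpack_from("<I", page, 0)
    return list(struct.unpack_from(f"<{count}q", page, 4))

page = serialize_leaf([5, 7, 12])
print(len(page))                # 4096 -> exactly one disk page
print(deserialize_leaf(page))   # [5, 7, 12]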
Performance Comparison
B-Tree vs Binary Search Tree
Aspect | B-Tree (degree 100) | Binary Search Tree |
---|---|---|
Height (1M keys) | ~3 | ~20 |
Disk accesses | 3 | 20 |
Cache performance | Better | Worse |
Memory usage | Higher per node | Lower per node |
B-Tree vs Hash Table
Operation | B-Tree | Hash Table |
---|---|---|
Search | O(log n) | O(1) average |
Range query | O(log n + k) | O(n) |
Ordered traversal | O(n) | O(n log n) |
Worst case | O(log n) | O(n) |
Common Implementation Pitfalls
1. Incorrect Split Logic
- Common mistake: using degree // 2 for the mid-index calculation
- Correct approach: use (degree - 1) // 2 (the same as the max_keys // 2 used in split_child) to ensure proper key distribution
- Why: it keeps the promoted middle key consistent for both even and odd degree values
2. Forgetting to Update Pointers
- Common mistake: not updating parent-child pointers after node splits
- Correct approach: always maintain parent-child relationships
- Impact: broken pointers can cause traversal errors and memory leaks
3. Boundary Condition Errors
- Common mistake: not checking for an empty tree before operations
- Correct approach: always validate tree state before proceeding
- Examples: check that the root exists, verify a node has keys before accessing them
Advanced Topics
Concurrent B-Trees
- Lock-free algorithms
- Copy-on-write techniques
- Optimistic concurrency control
Persistent B-Trees
- Immutable versions
- Version control systems
- Functional programming applications
Distributed B-Trees
- Sharding strategies
- Consistency protocols
- Replication techniques
Conclusion
B-trees are fundamental data structures optimized for external storage systems. Key takeaways:
Core Concepts:
- Multi-way trees with balanced height
- Optimized for disk I/O operations
- Maintain sorted order efficiently
Performance:
- O(log_m n) operations where m is degree
- Minimizes expensive disk accesses
- Better cache locality than binary trees
Applications:
- Database indexing systems
- File system organization
- Operating system components
For loksewa preparation, focus on:
- Understanding B-tree properties and constraints
- Implementing basic insertion with splitting
- Analyzing time complexity in terms of degree
- Comparing with other tree structures
- Recognizing when B-trees are appropriate
Remember: B-trees are designed for the reality of computer systems where memory hierarchy matters. The large degree isn't just theoretical - it's a practical optimization for disk-based storage where minimizing I/O operations is crucial for performance.
The key insight is that B-trees trade some complexity for dramatic improvements in real-world performance when dealing with large datasets that don't fit in main memory!