B* tree and red-black tree

Comprehensive study notes, diagrams, and exam preparation for B* tree and red-black tree.

B* Tree and Red-Black Tree

Definition

A B* tree is a balanced multiway search tree in which each node can store multiple keys and multiple children, and nodes are kept at a high occupancy level by redistributing or splitting keys with neighboring nodes. It is a variant of the B-tree that improves space utilization by ensuring nodes are more than half full, usually about two-thirds full.

A red-black tree is a self-balancing binary search tree in which each node has an extra color property, either red or black, and the tree maintains specific color-based rules that guarantee the path from root to leaf stays balanced enough for search, insertion, and deletion in O(log n) time.


Main Content

1. B* Tree Structure and Properties

Multiway nature and node capacity

A B* tree is not a binary tree. Each node can contain several keys and several child pointers. This makes the tree shorter in height compared to binary trees, which is very useful when data is stored on disk because fewer levels mean fewer disk reads. For example, if a node can hold many keys, a single disk block can store a large amount of sorted data.

High occupancy and better space utilization

Unlike a regular B-tree, a B* tree tries to keep nodes at least about two-thirds full. When a node is full during insertion, it first attempts to share keys with a neighboring sibling before splitting. This reduces wasted space and improves storage efficiency. For example, if one node is full and its sibling has room, the keys may be redistributed between them instead of immediately creating a new node.

ASCII illustration of a B* tree node idea

          [ 30 | 60 | 90 ]
         /      |      |      \
   [10|20]   [40|50] [70|80] [100|110]

This shows that one node can store multiple keys, and each child covers a key range.


2. Red-Black Tree Structure and Properties

Binary search tree with coloring rules

A red-black tree is a binary search tree, so every node has at most two children: left and right. The tree adds a color attribute to each node, red or black, to enforce balance. The color rules prevent the tree from becoming too skewed, which would otherwise make operations slow.

Balance through limited height difference

The color properties ensure that no path from root to leaf becomes excessively long compared to another path. Because of these rules, the longest path is at most about twice the shortest path, keeping the tree reasonably balanced. This guarantees logarithmic time complexity for search, insertion, and deletion.

Common red-black tree rules

  1. Every node is either red or black.
  2. The root is always black.
  3. All leaves (NIL or null sentinel nodes) are black.
  4. Red nodes cannot have red children.
  5. Every path from a node to its descendant NIL leaves has the same number of black nodes.

ASCII illustration of a red-black tree idea

          20(B)
         /     \
      10(R)    30(R)
      /  \      /   \
   5(B) 15(B) 25(B) 35(B)

B = black, R = red


3. Operations, Balancing, and Differences

Insertion and rebalancing in B* tree vs red-black tree

In a B tree, insertion begins by finding the correct leaf node. If the node has space, the key is inserted in sorted order. If the node is full, the tree first tries redistribution with a sibling. Only if redistribution is not possible does it split nodes, sometimes splitting two full nodes into three nodes in a B tree. In a red-black tree, insertion follows binary search tree rules first, then fixes color violations using recoloring and rotations.

Rotations and structural adjustments

Rotations are essential in red-black trees. A left rotation or right rotation changes the local shape of the tree while preserving the binary search tree order. These rotations are used when a red-red conflict occurs after insertion or when deletion causes imbalance. B* trees generally do not use rotations in the red-black sense; instead, they use redistribution, splitting, and merging of multi-key nodes.

Practical comparison of use cases

B* trees are ideal for database indexing, file systems, and large block-based storage because they reduce height and increase node occupancy. Red-black trees are ideal for in-memory data structures such as ordered maps, sets, scheduling systems, and compiler symbol tables because they provide fast dynamic operations with simpler binary structure.

Example of red-black insertion fix

Suppose inserting 15 into:

      10(B)
         \
         20(R)
        /
     15(R)

This creates a red-red violation. The tree is fixed by rotation and/or recoloring so that no red node has a red child.


Working / Process

1. B* Tree insertion process

  • Start from the root and move downward to the correct leaf by comparing keys.
  • Insert the key into the leaf in sorted order if there is space.
  • If the leaf is full, first try to redistribute keys with a sibling node.
  • If redistribution fails, split nodes and promote a separator key to the parent.
  • If the parent becomes full, the process may continue upward, and if needed, the root may split, increasing tree height by one.

2. Red-Black Tree insertion process

  • Insert the new key as in a normal binary search tree.
  • Color the new node red initially.
  • Check for violations of red-black rules, especially red parent-red child conflicts.
  • Fix violations using recoloring and rotations.
  • Ensure the root becomes black at the end.

3. Red-Black Tree deletion and B* Tree balancing

  • In a red-black tree, deletion is more complex because removing a black node can disturb the black-height property. The tree may need rotations and recoloring to restore balance.
  • In a B* tree, deletion may cause underflow if a node has too few keys. The tree handles this using redistribution with siblings or merging nodes.
  • Both structures restore balance after changes, but B* trees focus on node occupancy while red-black trees focus on color constraints and height control.

Advantages / Applications

Fast and efficient searching

Both trees provide logarithmic-time search, insertion, and deletion in the worst case, making them very useful for large datasets.

Balanced performance under updates

Unlike ordinary binary search trees, these structures do not degrade into a linked list when data is inserted in sorted order. They keep operations stable even under frequent updates.

Real-world usage in systems and software

B* trees are widely used in databases, indexing systems, and file systems because they reduce disk I/O and maximize storage block usage. Red-black trees are used in language libraries, memory-based indexes, operating systems, and applications requiring ordered data with frequent insertions and deletions.


Summary

  • B* tree is a highly space-efficient multiway balanced tree.
  • Red-black tree is a binary search tree balanced using color rules and rotations.
  • Both maintain logarithmic time operations.
  • Important terms to remember: B* tree, red-black tree, node occupancy, rotation, recoloring, redistribution, split, merge, black height.