A B-Tree is a self-balancing search tree that maintains sorted data and allows searches, insertions, deletions, and sequential access in logarithmic time. B-Trees are widely used in databases and file systems because of their ability to handle large amounts of data efficiently.
Key Features of B-Trees
- Balanced Tree:
- Every path from the root to a leaf has the same length, ensuring balanced access times.
- Multiple Keys per Node:
- Each node can store multiple keys, reducing the height of the tree.
- Dynamic Growth:
- Automatically adjusts the number of keys and nodes during insertions and deletions.
- Efficient Disk Access:
- B-Trees minimize disk I/O by grouping multiple keys into each node.
Terminology
- Order (
m):- The maximum number of children a node can have. A B-Tree of order
mcan have up tom-1keys in each node.
- The maximum number of children a node can have. A B-Tree of order
- Root Node:
- The topmost node of the tree.
- Internal Nodes:
- Nodes that have children.
- Leaf Nodes:
- Nodes with no children.
Properties of a B-Tree
- Each node can hold at most
m-1keys and must hold at leastceil(m/2) - 1keys (except the root). - The keys within a node are sorted in ascending order.
- Internal nodes act as guides, dividing the data space:
- For a key
kin a node:- Keys in the left child are
< k. - Keys in the right child are
> k.
- Keys in the left child are
- For a key
Basic Operations on B-Trees
1. Search
The search operation in a B-Tree is similar to binary search:
- Compare the key with the keys in the current node.
- If the key matches one of the keys in the node, the search is successful.
- If not, move to the appropriate child node based on the key’s value.
2. Insertion
Insertion involves finding the correct position for the key and ensuring that the B-Tree properties are maintained:
- Insert the key into the appropriate node.
- If the node overflows (has
mkeys), split it into two nodes:- The middle key becomes the parent key in the node’s parent.
- Repeat this process up to the root if necessary.
3. Deletion
Deletion can be more complex, as it may require re-balancing:
- If the key is in a leaf node, simply remove it.
- If the key is in an internal node:
- Replace it with its predecessor or successor key.
- Recursively delete the predecessor or successor key.
- If a node underflows (has fewer than
ceil(m/2) - 1keys):- Borrow a key from a sibling, or
- Merge with a sibling.
Example of a B-Tree
Inserting Keys into a B-Tree of Order 3
- Start with an empty tree:
[] - Insert keys:
10, 20, 30, 40, 50:- After inserting
10,20, and30:[10, 20, 30] - Inserting
40causes a split:[20] [10] [30, 40] - Inserting
50:[20] [10] [30, 40, 50]
- After inserting
- Insert
60, causing another split:[20, 40] [10] [30] [50, 60]
C++ Implementation of a B-Tree
Below is a simplified implementation of a B-Tree:
#include <iostream>
#include <vector>
using namespace std;
// B-Tree Node Structure
class BTreeNode {
public:
vector<int> keys;
vector<BTreeNode*> children;
bool isLeaf;
BTreeNode(bool leaf) : isLeaf(leaf) {}
void insertNonFull(int key);
void splitChild(int i, BTreeNode* child);
void traverse();
BTreeNode* search(int key);
friend class BTree;
};
// B-Tree Structure
class BTree {
BTreeNode* root;
int order; // Maximum number of children
public:
BTree(int t) : root(nullptr), order(t) {}
void traverse() {
if (root) root->traverse();
}
BTreeNode* search(int key) {
return (root == nullptr) ? nullptr : root->search(key);
}
void insert(int key);
};
void BTreeNode::traverse() {
int i;
for (i = 0; i < keys.size(); i++) {
if (!isLeaf) {
children[i]->traverse();
}
cout << keys[i] << " ";
}
if (!isLeaf) {
children[i]->traverse();
}
}
BTreeNode* BTreeNode::search(int key) {
int i = 0;
while (i < keys.size() && key > keys[i]) {
i++;
}
if (i < keys.size() && keys[i] == key) {
return this;
}
if (isLeaf) {
return nullptr;
}
return children[i]->search(key);
}
void BTree::insert(int key) {
if (!root) {
root = new BTreeNode(true);
root->keys.push_back(key);
} else {
if (root->keys.size() == order - 1) {
BTreeNode* newRoot = new BTreeNode(false);
newRoot->children.push_back(root);
newRoot->splitChild(0, root);
int i = (newRoot->keys[0] < key) ? 1 : 0;
newRoot->children[i]->insertNonFull(key);
root = newRoot;
} else {
root->insertNonFull(key);
}
}
}
void BTreeNode::insertNonFull(int key) {
int i = keys.size() - 1;
if (isLeaf) {
keys.push_back(0);
while (i >= 0 && keys[i] > key) {
keys[i + 1] = keys[i];
i--;
}
keys[i + 1] = key;
} else {
while (i >= 0 && keys[i] > key) {
i--;
}
i++;
if (children[i]->keys.size() == order - 1) {
splitChild(i, children[i]);
if (keys[i] < key) {
i++;
}
}
children[i]->insertNonFull(key);
}
}
void BTreeNode::splitChild(int i, BTreeNode* child) {
int mid = (order - 1) / 2;
BTreeNode* sibling = new BTreeNode(child->isLeaf);
sibling->keys.assign(child->keys.begin() + mid + 1, child->keys.end());
child->keys.resize(mid);
if (!child->isLeaf) {
sibling->children.assign(child->children.begin() + mid + 1, child->children.end());
child->children.resize(mid + 1);
}
children.insert(children.begin() + i + 1, sibling);
keys.insert(keys.begin() + i, child->keys[mid]);
}
int main() {
BTree tree(3);
tree.insert(10);
tree.insert(20);
tree.insert(5);
tree.insert(6);
tree.insert(12);
tree.insert(30);
tree.insert(7);
tree.insert(17);
cout << "Traversal of the tree: ";
tree.traverse();
cout << endl;
int key = 6;
if (tree.search(key)) {
cout << key << " is found in the tree." << endl;
} else {
cout << key << " is not found in the tree." << endl;
}
return 0;
}
Output
Traversal of the tree: 5 6 7 10 12 17 20 30
6 is found in the tree.
Applications of B-Trees
- Databases: Efficient indexing and retrieval of records.
- File Systems: Managing hierarchical file structures.
- Search Engines: Storing and retrieving search terms.
