Data Structure – Search Tree and Multiway Tree
Binary Search
Searching for X in a list L. Linear search: O(N)
Better way if L is sorted: compare X to the middle value (M) in L.
if X = M we are done.
if X < M we continue searching in the 1st half of L only.
if X > M we continue searching in the 2nd half of L only.
Example: L = 1 3 4 6 8 9 11, X = 4.
X < 6: repeat with L = 1 3 4.
X > 3: repeat with L = 4.
X = 4: we found X!
This is binary search: each iteration halves the length of the list. Therefore, the total number of iterations is at most about log2(N). The difference between O(logN) and O(N) is very significant for large N. For N = 2 billion, linear search needs about 1 billion comparisons on average; binary search needs only about 31 (log2 of 2 billion is roughly 31).
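The halving procedure can be sketched in Python as follows. This is a minimal illustration, assuming the list is stored as a sorted Python list (an array); the function name binary_search is chosen here and does not come from the notes.

```python
def binary_search(values, x):
    """Return the index of x in the sorted list `values`, or -1 if absent."""
    lo, hi = 0, len(values) - 1          # current search range, inclusive
    while lo <= hi:
        mid = (lo + hi) // 2             # position of the middle value M
        if values[mid] == x:             # X = M: done
            return mid
        if x < values[mid]:              # X < M: keep the 1st half only
            hi = mid - 1
        else:                            # X > M: keep the 2nd half only
            lo = mid + 1
    return -1                            # range is empty: X is not in the list


L = [1, 3, 4, 6, 8, 9, 11]
print(binary_search(L, 4))               # visits 6, then 3, then 4
```

On the example list the loop compares X = 4 against 6, 3 and 4 in that order, matching the trace above.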
Implementation of Binary Search
Need a way to find the middle value efficiently. A linked list is no good for this. Design using L = 1 3 4 6 8 9 11.
The 1st thing we need is a pointer to the middle element: 6. What should 6 point to? There are 2 possible outcomes that involve further search: X < 6 and X > 6. In the second level, similarly, there are 2 possible outcomes for 3 and 9, respectively:
Binary search tree (BST): a binary tree such that for each node:
everything in left subtree is < value
everything in right subtree is > value
Binary search works perfectly if lists are implemented with arrays
Search in Binary Search Tree (BST) L:
Compare X to the root (i.e. middle value) (M) in L.
if X = M we are done.
if X < M we recursively search for X in L's left subtree.
if X > M we recursively search for X in L's right subtree.
If the subtree to be searched is empty, X is not in the tree.
Example: search for 5 in earlier tree.
It is easy to traverse a BST in increasing order. From the node containing the value N:
process all values smaller than N - i.e. traverse N's left subtree.
process N itself.
process all values bigger than N - i.e. traverse N's right subtree.
This is left-to-right infix (in-order) traversal. How to traverse a BST in decreasing order? Answer: right-to-left infix traversal.
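A minimal Python sketch of BST search and of both traversal orders is given below. The Node class, the function names and the reverse flag are illustrative assumptions; the tree is the one built earlier from L = 1 3 4 6 8 9 11.

```python
class Node:
    """A BST node: one value plus optional left and right subtrees."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def bst_search(node, x):
    """Return True if x occurs in the BST rooted at node."""
    if node is None:                      # empty subtree: x is not here
        return False
    if x == node.value:
        return True
    if x < node.value:
        return bst_search(node.left, x)   # only the left subtree can hold x
    return bst_search(node.right, x)      # only the right subtree can hold x


def infix(node, reverse=False):
    """List the values left-to-right, or right-to-left when reverse is True."""
    if node is None:
        return []
    first, last = (node.right, node.left) if reverse else (node.left, node.right)
    return infix(first, reverse) + [node.value] + infix(last, reverse)


tree = Node(6, Node(3, Node(1), Node(4)), Node(9, Node(8), Node(11)))
print(bst_search(tree, 5))        # 5 is not in the tree
print(infix(tree))                # increasing order
print(infix(tree, reverse=True))  # decreasing order
```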
Operations on Binary Search Tree:
BSTs are binary trees, so all the operations we've defined for binary trees can be applied to BSTs.
Insert a node into a BST:
The general pseudocode for BSTINSERT(V,T) (assume V is not in T):
BSTINSERT(V,T) {
  if T is empty
    then T = create_singleton(V)
  else if V > rootvalue(T)
    then if T's right subtree exists
      then BSTINSERT(V,T's right subtree)
      else T's right subtree = create_singleton(V)
  else if T's left subtree exists
    then BSTINSERT(V,T's left subtree)
    else T's left subtree = create_singleton(V)
}
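The pseudocode translates fairly directly into Python. The sketch below is one possible rendering, not the notes' own code: it returns the root of the updated tree instead of assigning to T in place, and the Node class and the name bst_insert are assumptions made for the example.

```python
class Node:
    """A BST node: one value plus optional left and right subtrees."""
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def bst_insert(tree, v):
    """Insert v into the BST rooted at tree and return the root.

    Assumes v is not already in the tree, as in the pseudocode.
    """
    if tree is None:                       # empty tree: create a singleton
        return Node(v)
    if v > tree.value:
        tree.right = bst_insert(tree.right, v)
    else:
        tree.left = bst_insert(tree.left, v)
    return tree


root = None
for value in [6, 3, 9, 1, 4, 8, 11]:
    root = bst_insert(root, value)
print(root.value, root.left.value, root.right.value)
```

Inserting the middle values first, as here, reproduces the tree designed earlier for L = 1 3 4 6 8 9 11.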
Delete a node from BST:
General algorithm: to delete N, if it is a leaf, simply remove it; if it has exactly 1 subtree, put that subtree in N's place; if it has 2 subtrees, replace the value in N with the largest value in its left subtree and then delete the node with the largest value from its left subtree. Note: the largest value has at most one subtree. Why? It is the largest value: it doesn't have a right subtree.
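A possible Python sketch of this deletion scheme follows. The Node class and the names bst_delete and infix are illustrative; nodes with fewer than 2 subtrees are handled by putting the remaining subtree (possibly empty) in the node's place.

```python
class Node:
    """A BST node: one value plus optional left and right subtrees."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def bst_delete(node, x):
    """Delete x from the BST rooted at node and return the new root."""
    if node is None:                       # x was not in the tree
        return None
    if x < node.value:
        node.left = bst_delete(node.left, x)
    elif x > node.value:
        node.right = bst_delete(node.right, x)
    elif node.left is None:                # 0 or 1 subtree: splice node out
        return node.right
    elif node.right is None:
        return node.left
    else:                                  # 2 subtrees
        largest = node.left
        while largest.right is not None:   # largest value: keep going right
            largest = largest.right
        node.value = largest.value         # copy it into the node ...
        node.left = bst_delete(node.left, largest.value)  # ... and remove it
    return node


def infix(node):
    """Values in increasing order (left-to-right infix traversal)."""
    return [] if node is None else infix(node.left) + [node.value] + infix(node.right)


tree = Node(6, Node(3, Node(1), Node(4)), Node(9, Node(8), Node(11)))
tree = bst_delete(tree, 6)
print(tree.value, infix(tree))             # 4 replaces 6 at the root
```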
Time to search a BST:
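Time to search a BST is limited by its height. Each step goes down one level. Therefore, worst case: O(Height). Question: is Height O(logN)? Unfortunately no! For example, if the values are inserted in increasing order, every node has only a right subtree: the tree degenerates into a linked list and Height = N.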
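The effect of insertion order on height can be checked with a short experiment. The sketch below is illustrative only: the helper names are assumptions, the height of an empty tree is taken to be 0, and the same 7 values are inserted in two different orders.

```python
class Node:
    """A BST node: one value plus optional left and right subtrees."""
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def bst_insert(root, v):
    """Plain (unbalanced) BST insert; returns the root."""
    if root is None:
        return Node(v)
    if v > root.value:
        root.right = bst_insert(root.right, v)
    else:
        root.left = bst_insert(root.left, v)
    return root


def height(node):
    """Number of levels in the tree; an empty tree has height 0."""
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


def build(values):
    root = None
    for v in values:
        root = bst_insert(root, v)
    return root


increasing = build([1, 3, 4, 6, 8, 9, 11])    # every node has only a right child
middle_first = build([6, 3, 9, 1, 4, 8, 11])  # middle values inserted first
print(height(increasing), height(middle_first))  # 7 3
```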
Balanced Trees
Perfectly height balanced:
Practical alternative: for every node, heights of left and right subtrees are within 1 of each other.
How to tell if a tree is height balanced?
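One way to answer this question is to compute subtree heights bottom-up and compare them at every node. The Python sketch below assumes the "within 1" definition of balance; the Node class and function names are illustrative.

```python
class Node:
    """A binary tree node with optional left and right subtrees."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def balanced_height(node):
    """Return the height of the subtree, or -1 if some node in it is unbalanced."""
    if node is None:
        return 0
    left_h = balanced_height(node.left)
    right_h = balanced_height(node.right)
    if left_h < 0 or right_h < 0:          # imbalance found further down
        return -1
    if abs(left_h - right_h) > 1:          # this node itself is unbalanced
        return -1
    return 1 + max(left_h, right_h)


def is_height_balanced(root):
    return balanced_height(root) >= 0


print(is_height_balanced(Node(6, Node(3, Node(1)), Node(9))))  # heights 2 and 1
print(is_height_balanced(Node(3, Node(2, Node(1)))))           # heights 2 and 0
```

Each node is visited once, so the check takes O(N) time for a tree with N nodes.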
AVL Trees
Search, Insertion, Deletion are guaranteed O(logN) provided height balance can be maintained in O(logN). Operations must preserve:
heights of left and right subtrees are within 1.
values in left subtree are smaller than root value.
values in right subtree are bigger than root value.
Insertion:
BST insertion
check height-balance, and rebalance if necessary by changing the tree's shape.
Example: Insert 3, 6, 2, 1, 0 to the tree
Insert 3
Insert 6
Insert 2
Insert 1
Insert 0
Imbalance: one subtree is tall and the other subtree is short.
Rotation Algorithm
Pivot: deepest imbalanced node.
Rotator: root of pivot's taller subtree.
Step 4: Join inside subtree to pivot in place of rotator.
Step 5: Join pivot to rotator in place of inside subtree.
Step 6: Join rotator to pivot's original parent in place of pivot.
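Steps 4 to 6 can be sketched in Python as below. This is an illustration under stated assumptions: the "inside subtree" is taken to be the rotator's subtree on the side facing the pivot, the function returns the rotator so that the caller can perform step 6, and the names Node and rotate are invented for the example.

```python
class Node:
    """A binary tree node with optional left and right subtrees."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def rotate(pivot, rotator_on_left):
    """Rotate the pivot's child (the rotator) up and return it."""
    if rotator_on_left:
        rotator = pivot.left
        inside = rotator.right      # rotator's subtree facing the pivot
        pivot.left = inside         # Step 4: inside subtree replaces rotator
        rotator.right = pivot       # Step 5: pivot replaces inside subtree
    else:
        rotator = pivot.right
        inside = rotator.left
        pivot.right = inside        # Step 4 (mirror image)
        rotator.left = pivot        # Step 5 (mirror image)
    return rotator                  # Step 6: caller links this to pivot's parent


# Pivot 8 is left-heavy; its rotator is 4, whose inside subtree is the node 6.
pivot = Node(8, Node(4, Node(2, Node(1)), Node(6)))
top = rotate(pivot, rotator_on_left=True)
print(top.value, top.left.value, top.right.value, top.right.left.value)
```

After the rotation the outside subtree (rooted at 2) is one level higher and the inside subtree (rooted at 6) is at the same level as before.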
Example
After steps 1-3
Step 4
Step 5
Step 6
Example
Insert 3
Rotate
This is just as bad! What has gone wrong? Rotation: inside stays at same level, outside moves up 1 level. If inside is taller, rotation won't fix imbalance.
Solution: first rotate inside subtree to the outside.
After Insert 3
Rotate Inside
Main Rotation
General Insertion Algorithm
1. Do a normal BST insert.
2. If tree is balanced, stop.
3. Starting from new node, find first imbalanced ancestor. This node becomes the pivot; the root of its taller subtree is the rotator.
4. The new value was inserted in one of the rotator's subtrees (the taller one).
If in rotator's outside subtree: 1 rotation will suffice.
If in rotator's inside subtree: first rotate that subtree, then perform main rotation.
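The whole insertion algorithm can be sketched recursively in Python. This is one common way to code it, not necessarily the notes' intended implementation: each node caches its height (an assumption made here), and because the recursion unwinds from the new node towards the root, the first imbalanced node it meets is the pivot. All names are illustrative.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None
        self.height = 1                       # cached height of this subtree


def height(node):
    return node.height if node else 0


def update(node):
    node.height = 1 + max(height(node.left), height(node.right))


def rotate_right(pivot):                      # rotator is pivot.left
    rotator = pivot.left
    pivot.left = rotator.right                # inside subtree moves to pivot
    rotator.right = pivot
    update(pivot)
    update(rotator)
    return rotator


def rotate_left(pivot):                       # rotator is pivot.right
    rotator = pivot.right
    pivot.right = rotator.left
    rotator.left = pivot
    update(pivot)
    update(rotator)
    return rotator


def avl_insert(node, v):
    """Insert v (assumed not present) and return the rebalanced subtree root."""
    if node is None:
        return Node(v)
    if v < node.value:
        node.left = avl_insert(node.left, v)
    else:
        node.right = avl_insert(node.right, v)
    update(node)
    balance = height(node.left) - height(node.right)
    if balance > 1:                           # node is the pivot, rotator on left
        if v > node.left.value:               # v is in rotator's inside subtree
            node.left = rotate_left(node.left)
        return rotate_right(node)             # main rotation
    if balance < -1:                          # mirror image
        if v < node.right.value:
            node.right = rotate_right(node.right)
        return rotate_left(node)
    return node


root = None
for value in [1, 2, 3, 4, 5, 6, 7]:           # worst case for a plain BST
    root = avl_insert(root, value)
print(root.value, root.height)
```

Inserting 1 to 7 in increasing order would give a plain BST of height 7; with rebalancing the result has height 3.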
Example
Insert 2
The new value is in the rotator's inside subtree. Rotate inside
Main rotation
Deletion
Given a value X and an AVL tree T, delete the node containing X and rebalance the result.
Deleting X upsets A.
After rotating B up, E is upset.
After rotating F up, D's left subtree is shorter. D is upset.
After rotating M up (D down), we get a 5-5 balance at D. So now it is 1 shorter and its parent is out of balance.
This can propagate all the way to the top of the tree.
M-way Search Tree
Binary search tree: 1 value and 2 subtrees per node. M-way search tree: M - 1 values and M subtrees per node. M is the degree of the tree.
In fact, each node may contain up to M - 1 values. A node with k values must have k + 1 subtrees.
In a node, values are stored in ascending order: V(1) < V(2) < ... < V(k)
The subtrees are placed between adjacent values: each value has a left and right subtree.
V(i)'s right subtree = V(i+1)'s left subtree.
All the values in V(i)'s left subtree are < V(i).
All the values in V(i)'s right subtree are > V(i).
Search in M-way Trees
Searching for X:
1. If X < V(1), recursively search in V(1)'s left subtree.
2. If X > V(k), recursively search in V(k)'s right subtree.
3. If X = V(i), for some i, X is found!
4. Else, for some i, V(i) < X < V(i+1); recursively search in subtree between V(i) and V(i+1).
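The four cases collapse into a single lookup if the node's values are kept in a sorted list: the position where X would be inserted is also the index of the subtree to descend into. The Python sketch below uses the standard bisect module for that lookup; the MNode class and the sample tree are hypothetical and are not the tree from the figure.

```python
from bisect import bisect_left


class MNode:
    """M-way node: k sorted values and k + 1 subtrees (None = empty subtree)."""
    def __init__(self, values, children=None):
        self.values = values
        # children[i] holds the values between values[i-1] and values[i]
        self.children = children if children is not None else [None] * (len(values) + 1)


def mway_search(node, x):
    """Return True if x occurs in the M-way search tree rooted at node."""
    if node is None:
        return False
    i = bisect_left(node.values, x)           # first index with values[i] >= x
    if i < len(node.values) and node.values[i] == x:
        return True                           # case 3: X = V(i)
    return mway_search(node.children[i], x)   # cases 1, 2 and 4


# Hypothetical 4-way tree used only for this example.
tree = MNode([20, 50, 80], [
    MNode([5, 10]),
    MNode([30, 40]),
    MNode([60, 68, 70]),
    MNode([90]),
])
print(mway_search(tree, 68), mway_search(tree, 69))
```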
Search for 68 in:
B-Trees
A B-tree is an M-way search tree such that:
It is perfectly balanced.
Every node, except perhaps the root, is at least half full (has >= (M-1)/2 values).
This tree is not a B-tree:
This one is, and contains the same values:
Insertion into a B-Tree
To insert X:
1. Use search procedure to find leaf node where X should be added.
2. Add X to this node at the appropriate place among the values.
3. If there are <= M-1 values, we are done! Otherwise, the node has overflowed. To repair, split the node into 3 parts:
Left: first (M-1)/2 values
Middle: value at position 1+(M-1)/2
Right: last (M-1)/2 values
Left and Right have just enough values: make them into nodes. They become the left and right children of Middle, which we add in the appropriate place in this node's parent. If the parent overflows, we repeat the procedure. If the root overflows: we create a new root with Middle as its only value and Left and Right as its children.
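The insertion procedure, including the split and its propagation to the parent, might be coded along the following lines. This is a sketch under assumptions: M is odd, a leaf is represented by an empty children list, and the names BNode, _insert and btree_insert are invented for the example.

```python
from bisect import bisect_left, insort


class BNode:
    """B-tree node: sorted values; children is empty for a leaf."""
    def __init__(self, values=None, children=None):
        self.values = values or []
        self.children = children or []


def _insert(node, x, M):
    """Insert x below node. Return (Middle, Right) if node split, else None."""
    if not node.children:                      # leaf: add x in sorted position
        insort(node.values, x)
    else:                                      # descend into the right subtree
        i = bisect_left(node.values, x)
        promoted = _insert(node.children[i], x, M)
        if promoted:                           # child split: absorb its Middle
            middle, right = promoted
            node.values.insert(i, middle)
            node.children.insert(i + 1, right)
    if len(node.values) <= M - 1:              # no overflow: done
        return None
    half = (M - 1) // 2                        # overflow: split into 3 parts
    middle = node.values[half]
    right = BNode(node.values[half + 1:], node.children[half + 1:])
    node.values = node.values[:half]           # node itself becomes Left
    node.children = node.children[:half + 1]
    return middle, right


def btree_insert(root, x, M):
    """Insert x into the B-tree of degree M and return the (possibly new) root."""
    promoted = _insert(root, x, M)
    if promoted:                               # root overflowed: grow at the top
        middle, right = promoted
        return BNode([middle], [root, right])
    return root


root = BNode([2, 3, 5, 7])                     # a full leaf for M = 5
root = btree_insert(root, 6, 5)
print(root.values, [child.values for child in root.children])
```

With M = 5, inserting 6 into the full leaf [2, 3, 5, 7] gives Left=[2,3], Middle=5, Right=[6,7], and because the leaf was also the root, a new root containing only 5 is created.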
Example:
Insert 17
Insert 6. Split: Left=[2,3], Middle=5, Right=[6,7]
Insert 21. Split: Left=[17,21], Middle=22, Right=[44,45]
Insert 67. Split: Left=[55,66], Middle=67, Right=[68,70]
Overflow. Split: Left=[5,10], Middle=22, Right=[50,67]
The tree-insertion algorithms we have previously seen add new nodes at the bottom of the tree, and then have to worry about whether they have created an imbalance. The B-tree insertion algorithm is just the opposite: it adds nodes at the top. All nodes become 1 level deeper: the tree remains balanced.
Deletion from a B-Tree
Recall: in a BST, if the value to be deleted does not occur in a leaf, we replace it with the largest value from its left subtree, and then delete that value from the left subtree. We proceed similarly in a B-tree. Furthermore, the largest value in a left subtree is guaranteed to be in a leaf node.
To delete X from a leaf node:
1. Remove X from current node. There are no subtrees to worry about.
2. If the node still has >= (M-1)/2 values, done! Else, node underflowed and needs repair.
How to repair a non-root node? Consider deleting 6 from:
The leaf node now contains just 7. Repair strategy: try to borrow values from a neighbouring node. We join together the current node and a neighbour to form a combined node. Don't forget to include the value between these 2 adjacent subtrees. We choose to join [7] with [17,22,44,45]: [7,10,17,22,44,45]
parent contributes 1 value.
node that underflowed: (M-1)/2 - 1 values.
neighbour: between (M-1)/2 and (M-1) values.
We distinguish 2 cases, depending on whether the neighbour contributes exactly (M-1)/2 values or more.
Case 1:
neighbour contributes > (M-1)/2 values.
The combined node contains > 1+((M-1)/2 - 1) + (M-1)/2 values, i.e. > M - 1. It is too big! Split the combined node into: Left, Middle and Right. Since there were >= M values, minus Middle, leaves >= M - 1, split in 2, leaves >= (M-1)/2. Therefore, Left and Right have enough values to become nodes. Replace the value we borrowed from the parent with Middle, using Left and Right as its 2 children. Since parent's size doesn't change: we are done!
Case 2:
neighbour contributes exactly (M-1)/2 values.
The combined node contains 1 + ((M-1)/2 - 1) + (M-1)/2 = M-1 values. It can become a valid node. Simply erase borrowed value and neighbour from parent, and replace node that underflowed with the new combined node. Delete 3 from:
Result:
Note: the parent has 1 fewer value. It might underflow! The repair strategy may have to be applied repeatedly at successive levels.
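Both repair cases can be sketched as a single Python function that works on the parent of the underflowed node. Everything here is an assumption made for illustration: the BNode class, the name repair, the preference for the right-hand neighbour, and the small tree, which is modelled on the values in the text but is not the tree from the figures. Where the combined node has an even number of values, the upper of the 2 central values is used as Middle.

```python
class BNode:
    """B-tree node: sorted values; children is empty for a leaf."""
    def __init__(self, values, children=None):
        self.values = values
        self.children = children or []


def repair(parent, i, M):
    """Repair parent.children[i], which has underflowed."""
    # Join with the right neighbour if there is one, else with the left one.
    j = i if i + 1 < len(parent.children) else i - 1
    left, right = parent.children[j], parent.children[j + 1]
    # Combined node: both nodes plus the parent value between them.
    values = left.values + [parent.values[j]] + right.values
    children = left.children + right.children
    if len(values) > M - 1:
        # Case 1: too big. Split; Middle replaces the borrowed parent value.
        mid = len(values) // 2
        parent.values[j] = values[mid]
        left.values, right.values = values[:mid], values[mid + 1:]
        if children:
            left.children, right.children = children[:mid + 1], children[mid + 1:]
    else:
        # Case 2: fits in one node. Parent loses 1 value and 1 child.
        left.values, left.children = values, children
        del parent.values[j]
        del parent.children[j + 1]


M = 5
# Case 1: leaf [6,7] has just lost 6; its neighbour has spare values.
parent = BNode([5, 10], [BNode([2, 3]), BNode([7]), BNode([17, 22, 44, 45])])
repair(parent, 1, M)
print(parent.values, [c.values for c in parent.children])

# Case 2: the neighbour has exactly (M-1)/2 values.
parent = BNode([5, 10], [BNode([2, 3]), BNode([7]), BNode([17, 22])])
repair(parent, 1, M)
print(parent.values, [c.values for c in parent.children])
```

After a case 2 repair the caller still has to check whether the parent itself has underflowed, and repeat the repair one level up if so.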
If the root underflows: it must have originally contained just 1 value, now removed.
if the root was a leaf: now empty
else, value was consumed by case 2: the resulting combined node can be used as new root.
Delete 7 from:
Root underflows: