AVL Trees – CMPT 225
Recall: A BST is:
- a binary tree
- with nodes labeled by keys
- for every two nodes u,v:
- if u is in the left subtree of v, then key(u) < key(v)
- if u is in the right subtree of v, then key(u) > key(v)
BST operations take time proportional to the tree height, which in the worst case grows linearly with the number of keys.
- AVL Trees are a kind of “self-balancing” BST. Tree height is always $$ O(\log n) $$, where n is the number of keys.
- An AVL Tree is a BST that satisfies the following height-balance invariant:
  - for every node v: $$ |\text{height}(\text{left}(v)) - \text{height}(\text{right}(v))| \leq 1 $$
  - (We define height(left(v)) = -1 if left(v) does not exist, and similarly for right.)
- Implementing the Operations:
- Perform BST operations, then
- repair balance if needed.
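The height-balance invariant above can be checked directly from its definition. The following is a minimal sketch, not the course's implementation: the `Node` class and the function names `height` and `is_balanced` are assumptions made for illustration, and heights are recomputed on every call, which is simple but slower than storing a height in each node.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical node type for illustration only.
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def height(v: Optional[Node]) -> int:
    """Height of the subtree rooted at v; a missing subtree has height -1."""
    if v is None:
        return -1
    return 1 + max(height(v.left), height(v.right))


def is_balanced(v: Optional[Node]) -> bool:
    """True if every node in the subtree satisfies the height-balance invariant."""
    if v is None:
        return True
    if abs(height(v.left) - height(v.right)) > 1:
        return False
    return is_balanced(v.left) and is_balanced(v.right)


chain = Node(1, right=Node(2, right=Node(3)))   # a BST that degenerated into a list
bushy = Node(2, left=Node(1), right=Node(3))    # same keys, balanced shape
print(is_balanced(chain), is_balanced(bushy))   # False True
```

Both trees are valid BSTs on the same keys; only the second one is an AVL tree.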
How unbalanced can an AVL Tree be?
Ex. A “maximally unbalanced” height-5 AVL tree
- root node
  - left subtree (7 nodes)
    - node
      - node
        - node (left)
      - node
    - node
      - node (left)
  - right subtree (31 nodes)
    - node
      - node
        - node
          - node
          - node
        - node
          - node
          - node
      - node
        - node
          - node
          - node
        - node
          - node
          - node
    - node
      - node
        - node
          - node
          - node
        - node
          - node
          - node
      - node
        - node
          - node
          - node
        - node
          - node
          - node
How tall can an AVL Tree be?
Let N(h) = min # of nodes in an AVL tree of height h.
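Observe:
- $$ N(0) = 1 $$, $$ N(1) = 2 $$, and $$ N(h) = 1 + N(h-1) + N(h-2) $$ for $$ h \geq 2 $$
- $$ N(h) > 2N(h-2) > 4N(h-4) > \dots > 2^{i} N(h-2i) $$
- (if h is even, we end when $$ h - 2i = 0 $$, i.e. $$ i = h/2 $$)
Claim: $$ N(h) \geq 2^{h/2} $$ for every $$ h \geq 0 $$.
Proposition: every AVL tree with n nodes has height at most $$ 2 \log_2 n $$.
- Pf.:
- By ind. on h.
- Basis: $$ N(0) = 1 \geq 2^{0} $$ (checkmark); $$ N(1) = 2 \geq 2^{1/2} $$ (checkmark).
- Assume, for some $$ h \geq 1 $$, that $$ N(j) \geq 2^{j/2} $$ for all $$ j \leq h $$.
- Now $$ N(h+1) = 1 + N(h) + N(h-1) > 2N(h-1) \geq 2 \cdot 2^{(h-1)/2} = 2^{(h+1)/2} $$ (checkmark)
- So: $$ N(h) \geq 2^{h/2} $$ for every $$ h \geq 0 $$.
- We have: for every AVL tree with n nodes and height h, $$ n \geq N(h) \geq 2^{h/2} $$, so $$ h \leq 2 \log_2 n $$.
- Thus: AVL Tree search takes time that is $$ O(\log n) $$.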
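The bound on N(h) can be sanity-checked numerically. This sketch assumes the standard recurrence N(h) = 1 + N(h-1) + N(h-2) (a root, plus a minimal subtree of height h-1 and one of height h-2), with an empty subtree counted as height -1 and zero nodes. The function name `min_nodes` is hypothetical.

```python
import math


def min_nodes(h: int) -> int:
    """N(h): minimum number of nodes in an AVL tree of height h."""
    if h < 0:
        return 0          # an empty subtree (height -1) has no nodes
    a, b = 0, 1           # N(-1), N(0)
    for _ in range(h):
        a, b = b, 1 + b + a
    return b


values = [min_nodes(h) for h in range(6)]
print(values)  # [1, 2, 4, 7, 12, 20]

# Check N(h) >= 2^(h/2), and the equivalent height bound h <= 2*log2(n).
for h in range(40):
    n = min_nodes(h)
    assert n >= 2 ** (h / 2)
    assert h <= 2 * math.log2(n)
```

The check passes for every height tried, which is consistent with (but of course does not replace) the induction argument.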
Max AVL Tree Height vs BST height
(Worst case # of nodes visited by AVL-tree vs BST search)
Unbalanced subtrees are “repaired” using rotations
- 5
  - 3
    - 1
    - 4
  - 7
    - 6
    - 8
This tree can be converted to the following one using a right rotation at the node with key 5. Conversely, it can be obtained from the following one using a left rotation at the node with key 3.
- 3
  - 1
  - 5
    - 4
    - 7
      - 6
      - 8
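The two rotations in this example can be sketched in a few lines. This is an illustrative sketch under assumptions: the `Node` class and the names `rotate_right`, `rotate_left`, and `inorder` are not from the slides, and the functions return the new subtree root so that the caller can re-attach it to the parent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical node type for illustration only.
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def rotate_right(v: Node) -> Node:
    """Rotate right at v; v's left child becomes the subtree root."""
    u = v.left
    assert u is not None, "right rotation needs a left child"
    v.left = u.right   # u's right subtree moves under v
    u.right = v
    return u           # caller must hang u where v used to be


def rotate_left(u: Node) -> Node:
    """Rotate left at u; u's right child becomes the subtree root."""
    v = u.right
    assert v is not None, "left rotation needs a right child"
    u.right = v.left   # v's left subtree moves under u
    v.left = u
    return v


def inorder(v: Optional[Node]) -> list:
    return [] if v is None else inorder(v.left) + [v.key] + inorder(v.right)


before = Node(5, Node(3, Node(1), Node(4)), Node(7, Node(6), Node(8)))
after = rotate_right(before)
print(after.key, after.right.key, after.right.left.key)  # 3 5 4
print(inorder(after))  # [1, 3, 4, 5, 6, 7, 8]
```

The in-order sequence of keys is unchanged, so the result is still a BST; applying `rotate_left` at the node with key 3 restores the first tree.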
AVL Tree insertion:
- Do BST insertion.
- If there is an unbalanced node,
- let v be the unbalanced node of greatest depth*
- repair the imbalance at v.
Consider 4 cases (w is new node, v is unbalanced, k is height of tree, h is height of any given subtree):
2 outside cases:
Case 1:
- v (root; h=k)
  - left (h=k-1)
    - left left (h=k-2)
      - ...
        - w
    - left right (h=k-2)
      - ...
  - right (h=k-2)
    - ...
Case 2:
- v (root)
  - left
    - ...
  - right
    - right left
      - ...
    - right right
      - ...
        - w
Case 3:
- v (root)
  - left
    - left left
      - ...
    - left right
      - ...
        - w*
  - right
    - ...
Case 4:
- v (root)
  - left
    - ...
  - right
    - right left
      - ...
        - w*
    - right right
      - ...
* It must be on the path from the new leaf to the root.
To fix the “outside” cases:
Do 1 rotation at the unbalanced node.
- v (root; h=k)
  - u (h=k-1)
    - left left (h=k-1)
      - T1 (h=k-2)
        - w
    - T2 (h=k-2)
  - T3 (h=k-2)
After rotation:
- u (root; h=k)
  - T1 (left; h=k-2)
    - w
  - v (right; h=k-1)
    - T2 (right left; h=k-2)
    - T3 (right right; h=k-2)
* The final height of u is k, so the tree is now balanced.
Exercises:
- Draw the right-right case in detail.
- Draw both outside cases with minimal-sized T1, T2, T3.
The “inside cases” are not fixed by this rotation:
- v (root)
  - node (left; h=k-1)
    - T1 (left left; h=k-2)
    - T2 (left right; h=k-2)
      - w
  - T3 (right; h=k-2)
Would become, with a right rotation:
- v (root)
  - T1 (left; h=k-2)
  - node (right; h=k)
    - T2 (right left; h=k-2)
      - w
    - T3 (right right)
(Transcriber’s note: w is now used as a variable by the slides. w will no longer represent the added node. This will be written down in full as “new node”.)
To fix the “inside” cases, we use two rotations:
- v (h=k; v is the unbalanced node of any depth; label=a)
  - u (h=k-1; label=b)
    - T1 (h=k-2)
    - w (h=k-2; label=c)
      - T2 (h=k-3)
        - Insertion here or... one other place
      - T3 (h=k-3)
        - Other possible insertion place.
  - T4 (h=k-2)
Left rotation at b:
- a
  - c (h=k)
    - b
      - T1
      - T2
        - possible insertion
    - T3 (h=k-3)
      - possible insertion (not part of height)
  - T4 (h=k-2)
After 1 rotation, this is too tall (like the outside case).
Right rotation at a:
- c
  - b (h=k-1)
    - T1 (h=k-2)
    - T2 (h=k-2)
      - possible insertion point
  - a (h=k-1)
    - T3 (h=k-2)
      - possible insertion
    - T4 (h=k-2)
The entire operation is:
- left(c) <- b
- right(c) <- a
- left(a) <- T3
- right(b) <- T2
- Change parent(a) to be parent(c)
(1 rotation = 3 assignments; 2 rotations = 6 assignments; double rotation = 5 assignments.)
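The assignments listed above can be written out as one function and compared against two single rotations. This is a sketch under assumptions: the `Node` class and all function names are hypothetical, parent pointers are omitted, and the final step (re-linking the old parent of a to c) is left to the caller by returning c.

```python
import copy
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical node type; no parent pointer is stored.
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def double_rotate_left_right(a: Node) -> Node:
    """Fix the left-right inside case at a by direct assignment."""
    b = a.left
    c = b.right
    t2, t3 = c.left, c.right
    c.left = b       # left(c)  <- b
    c.right = a      # right(c) <- a
    a.left = t3      # left(a)  <- T3
    b.right = t2     # right(b) <- T2
    return c         # the caller hangs c where a used to be


def rotate_left(u: Node) -> Node:
    v = u.right
    u.right, v.left = v.left, u
    return v


def rotate_right(v: Node) -> Node:
    u = v.left
    v.left, u.right = u.right, v
    return u


def two_single_rotations(a: Node) -> Node:
    a.left = rotate_left(a.left)   # left rotation at b
    return rotate_right(a)         # right rotation at a


def shape(v: Optional[Node]):
    return None if v is None else (v.key, shape(v.left), shape(v.right))


# a=50, b=20, c=40; the new node 30 was inserted into T2.
tree = Node(50, Node(20, Node(10), Node(40, Node(30))), Node(60))
direct = double_rotate_left_right(copy.deepcopy(tree))
stepwise = two_single_rotations(copy.deepcopy(tree))
print(shape(direct) == shape(stepwise))               # True
print(direct.key, direct.left.key, direct.right.key)  # 40 20 50
```

Both routes give the same tree; the direct version simply skips the intermediate pointer writes that the second rotation would overwrite.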
AVL Tree Removal
- Do BST Removal.
- Rebalance.
Define “the parent of the deleted node” (*) by cases:
- The deleted key was at a leaf.
- The deleted key was at a node with one child.
- The deleted key is at a node with 2 children.
Case 1 graphs (X is removed node; p is parent of removed node):
- p
  - X (right)
Becomes:
- p
Graphs for case 2:
- p
  - X (right)
    - ... (right)
Becomes:
- p
  - ... (right)
Graphs for case 3:
- X
  - ...
  - node
    - p (left)
      - node (arrow to X)
Or:
- X1
  - ...
  - node
    - p (left)
      - X2 (arrow to X1)
        - ... (right)
Fact: After doing a BST removal in an AVL tree, there is at most 1 unbalanced node, and it is on the path from the parent of the deleted node to the root.
(If the deleted node was the root, the “parent of the deleted node” does not exist–but also there can be no unbalanced node)
Consider:
- root
  - ...
    - ...
      - continues on with no detail
      - o (orange)
        - continues on with no detail
        - node
          - b (blue; left)
            - m (mauve)
              - continues on with no detail (right)
            - continues on with no detail
    - continues on with no detail
  - continues on with no detail
Becomes:
- root
  - ... (g; green)
    - ... (r; red)
      - continues on with no detail
      - o (orange)
        - continues on with no detail
        - node
          - b (blue; left)
            - continues on with no detail
            - continues on with no detail
    - continues on with no detail
  - continues on with no detail
Terms:
- o, orange: Node with key to be deleted.
- m, mauve: Deleted node
- b, blue: Parent of deleted node
- r, red: Unbalanced node
- g, green: This node, for example, cannot be unbalanced.
An AVL tree removal that illustrates:
- Need to re-balance after removal
- Re-balancing node u may reduce the height of a subtree, resulting in an ancestor of u being unbalanced.
remove(14):
- 10 (h=3)
  - 4
    - 2
    - 7
      - 5 (left)
  - 12
    - 14 (removed; right)
Left rotation at 4:
- 10
  - 7
    - 4 (left)
      - 2
      - 5
  - 12
Right rotation at 10:
- 7
  - 4
    - 2
    - 5
  - 10
    - 12 (right)
Abstract version:
- node
  - node
    - ...
    - ...
      - 5 (left)
  - ...
    - 14 (right; deleted)
Rebalance (for deletion):
- w <- parent of deleted node, if it exists
- for (each node u on the path from w to the root) {
  - if u is unbalanced
    - Let T be the subtree rooted at u
    - rebalance T using suitable rotations*
    - if height of T did not get smaller, return
- }
* either a single or a double rotation, based on case diagrams similar to that used for insertion.
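The rebalancing walk can be sketched without parent pointers by letting a recursive removal rebalance each node as the recursion unwinds. This is a swapped-in variant, not the loop above: it visits the same nodes on the path from the deleted node's parent to the root, but it omits the early return and simply checks every node on that path. All names (`Node`, `h`, `update`, `rebalance`, `remove`) are assumptions, and each node stores its own height.

```python
from typing import Optional


class Node:
    # Hypothetical node type that caches its height.
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.height = 1 + max(h(left), h(right))


def h(v: Optional[Node]) -> int:
    return -1 if v is None else v.height


def update(v: Node) -> None:
    v.height = 1 + max(h(v.left), h(v.right))


def rotate_right(v: Node) -> Node:
    u = v.left
    v.left, u.right = u.right, v
    update(v)
    update(u)
    return u


def rotate_left(v: Node) -> Node:
    u = v.right
    v.right, u.left = u.left, v
    update(v)
    update(u)
    return u


def rebalance(u: Node) -> Node:
    """Restore the invariant at u with a single or double rotation."""
    update(u)
    diff = h(u.left) - h(u.right)
    if diff > 1:                                   # left side too tall
        if h(u.left.left) < h(u.left.right):       # inside (left-right) case
            u.left = rotate_left(u.left)
        return rotate_right(u)
    if diff < -1:                                  # right side too tall
        if h(u.right.right) < h(u.right.left):     # inside (right-left) case
            u.right = rotate_right(u.right)
        return rotate_left(u)
    return u


def remove(v: Optional[Node], key) -> Optional[Node]:
    """BST removal, rebalancing every node on the way back up."""
    if v is None:
        return None
    if key < v.key:
        v.left = remove(v.left, key)
    elif key > v.key:
        v.right = remove(v.right, key)
    elif v.left is None or v.right is None:        # leaf or one child
        return v.left if v.left is not None else v.right
    else:                                          # two children: use successor
        succ = v.right
        while succ.left is not None:
            succ = succ.left
        v.key = succ.key
        v.right = remove(v.right, succ.key)
    return rebalance(v)


root = Node(10, Node(4, Node(2), Node(7, Node(5))), Node(12, None, Node(14)))
root = remove(root, 14)
print(root.key, root.left.key, root.right.key, root.right.right.key)  # 7 4 10 12
```

On the remove(14) example this performs a left rotation at 4 followed by a right rotation at 10, ending with 7 at the root.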
Correctness of the algorithm involves two properties:
- There is at most 1 unbalanced node after deletion.
- Rebalancing w may make an ancestor of w unbalanced.
Complexity of AVL tree operations
- Every AVL tree with n nodes has height $$ O(\log n) $$.
- The worst case amount of work for main operations is:
  - search:
    - one traversal from root to leaf: $$ O(\log n) $$
  - insert:
    - two traversals from root to leaf (down & back up): $$ O(\log n) $$
    - two rotations: $$ O(1) $$
  - remove:
    - two traversals from root to leaf (down & back up): $$ O(\log n) $$
    - at most, two rotations at each node on that path: $$ O(\log n) $$.
All three major operations run in $$ O(\log n) $$ time.