Binary Search Trees

CMPT 225

Consider the time complexity of operations for simple list and array implementations.

type                     insert         find                        remove
un-ordered array         O(1) (green)   O(n)                        O(n)
ordered array            O(n) (red)     O(log n) (purple outline)   O(n) (red)
un-ordered linked list   O(1) (green)   O(n) (red)                  O(n) (red)
ordered linked list      O(n) (red)     O(n) (red)                  O(n) (red)
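
As a rough illustration of the two array rows, here is a minimal Python sketch. It assumes Python lists stand in for arrays, and the names `find_unordered` and `find_ordered` are made up for this example. Appending to a Python list is O(1) only in the amortized sense.

```python
from bisect import bisect_left, insort

def find_unordered(items, t):
    """Linear scan: O(n) comparisons in the worst case."""
    for x in items:
        if x == t:
            return True
    return False

def find_ordered(items, t):
    """Binary search on a sorted list: O(log n) comparisons."""
    i = bisect_left(items, t)
    return i < len(items) and items[i] == t

unordered = [7, 2, 9, 4]
unordered.append(5)      # insert at the end: O(1) amortized

ordered = [2, 4, 7, 9]
insort(ordered, 5)       # insert in sorted position: O(n), later elements shift
```

The ordered array finds quickly but pays for it on insert, which is the trade-off the table shows.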

Q: What will count as “fast”?

A: Time O(log n) //n is size of set

Implementations of these are simple abstractions to implementations of sets, which we focus on.

Binary Search Tree (B.S.T.)

A BST is:

Ex.

Example 1 (checked):

Example 2 (X):

Example 3 (check) (right and left are written explicitly when there are not two nodes to write. Otherwise, left is written first and right is listed second.):

Example 4 (check):

Every sub-tree of a BST is a BST

This makes recursive algorithms very natural.

Fact:

In-order traversal of a BST visits keys in non-decreasing order.

Proof sketch:

$$a_k \leq \text{key}(v) < b_1\\ \therefore a_1 \leq a_2 \leq \dots \leq a_k \leq \text{key}(v) \leq b_1 \leq b_2 \leq \dots \leq b_m$$
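
The fact can be checked on a small tree. The sketch below assumes a simple `Node` class with `key`, `left` and `right` fields; the class and the example tree are illustrative and are not one of the lecture figures.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def in_order(v):
    """Keys of the subtree rooted at v: left subtree, then v, then right subtree."""
    if v is None:
        return []
    return in_order(v.left) + [v.key] + in_order(v.right)

# Hypothetical example tree:
#        5
#      /   \
#     3     8
#    / \   /
#   1   4 6
root = Node(5, Node(3, Node(1), Node(4)), Node(8, Node(6)))
keys = in_order(root)    # [1, 3, 4, 5, 6, 8]
```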

BST Find/Search: examples

(Transcriber’s note: the links are the search path for the algorithms)

find(5):

find(1):

find(6):

Some notation:

Suppose v is a node of BST. We write:

BST find(x) Pseudo-code

find(t){ // return true iff t is in the tree.
  return find(t,root)
}

find(t,v) // return true iff t appears in subtree rooted at v.
{
  if t < key(v) & v has a left subtree
    return find(t, left(v))
  if t > key(v) & v has a right subtree
    return find(t, right(v))
  if key(v) = t
    return true
  return false // v has no subtree on the side where t would be
}
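
For comparison, here is one way the pseudo-code might look in runnable Python. The `Node` class, the name `find_in`, and the use of `None` for an empty tree are assumptions of this sketch, not part of the notes.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def find_in(t, v):
    """Return True iff t appears in the subtree rooted at v (v is not None)."""
    if t < v.key and v.left is not None:
        return find_in(t, v.left)
    if t > v.key and v.right is not None:
        return find_in(t, v.right)
    return v.key == t

def find(t, root):
    """Return True iff t is in the tree; an empty tree is represented by None."""
    return root is not None and find_in(t, root)

# Hypothetical example tree with keys 1, 3, 4, 5, 6, 8.
root = Node(5, Node(3, Node(1), Node(4)), Node(8, Node(6)))
```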

BST find(t,v) pseudo-code – alternate version

find(t,v) // return true if t appears in subtree rooted at v
{
  if key(v)=t
    return true
  if t < key(v) & v has a left subtree
    return find(t,left(v))
  if t > key(v) & v has a right subtree
    return find(t,right(v))
  return false
}

Q: Which version is better?

A: key(v)=t will almost always be false, so the first version, which tests it last, should usually do fewer comparisons.
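
To make the comparison concrete, the sketch below counts key comparisons for both versions on a small hypothetical tree. The counting functions are illustrative, written as loops rather than recursion, and charge one unit for each key test that is evaluated.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def comparisons_v1(t, v):
    """Key comparisons made by the first version (equality tested last)."""
    count = 0
    while True:
        count += 1                          # t < key(v)
        if t < v.key and v.left is not None:
            v = v.left
            continue
        count += 1                          # t > key(v)
        if t > v.key and v.right is not None:
            v = v.right
            continue
        count += 1                          # key(v) = t
        return count

def comparisons_v2(t, v):
    """Key comparisons made by the alternate version (equality tested first)."""
    count = 0
    while True:
        count += 1                          # key(v) = t
        if v.key == t:
            return count
        count += 1                          # t < key(v)
        if t < v.key and v.left is not None:
            v = v.left
            continue
        count += 1                          # t > key(v)
        if t > v.key and v.right is not None:
            v = v.right
            continue
        return count

root = Node(5, Node(3, Node(1), Node(4)), Node(8, Node(6)))
```

On this tree an unsuccessful search for 7 costs 6 comparisons in the first version and 8 in the alternate one, while a hit at the root costs 3 and 1 respectively. The first version pays off on deep or unsuccessful searches, which dominate in large trees.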

BST insert(x) Pseudo-code

insert(t){
  // adds t to the tree
  // assumes t is not in the tree already*
  u <- node at which find(t,root) terminates**
  if t<key(u)
    give u a new left child with key t.
  else
    give u a new right child with key t.
}

* Exercise: Write the version that does not make this assumption.

** Exercise: Write the version where the search is explicit.

BST Insert Examples

insert(1):

insert(7):

BST insert(x) Pseudo-code – explicit search version…

insert(t){ //adds t to the tree if it is not already there
  insert(t, root)
}
insert(t,v) //insert t in the subtree rooted at v, if it is not there
{
  if t < key(v) & v has a left subtree
    insert(t, left(v))
  else if t > key(v) & v has a right subtree
    insert(t, right(v))
  else if t < key(v) //here v has no left child
    give v a new left child with key t
  else if t > key(v) //here v has no right child
    give v a new right child with key t.
  // otherwise t=key(v), so do nothing.
}
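
A possible Python rendering of the explicit-search insert is sketched below. The `Node` class and the choice to return the root (so that inserting into an empty tree, represented by `None`, also works) are assumptions of this sketch; the pseudo-code does not cover the empty tree.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def insert(t, root):
    """Insert t if it is not already present; return the (possibly new) root."""
    if root is None:                 # assumption: empty tree is None
        return Node(t)
    v = root
    while True:
        if t < v.key:
            if v.left is None:       # v has no left child: attach here
                v.left = Node(t)
                return root
            v = v.left
        elif t > v.key:
            if v.right is None:      # v has no right child: attach here
                v.right = Node(t)
                return root
            v = v.right
        else:
            return root              # t = key(v): already present, do nothing

root = None
for k in [5, 3, 8, 1, 4, 5]:         # the second 5 is ignored
    root = insert(k, root)
```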

Insertion Order for BSTs: Examples

1)

2)
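
Since the example figures are not reproduced here, the following sketch illustrates the point with made-up keys: the same seven keys give very different trees depending on insertion order. `Node`, `insert`, `height` and `build` are illustrative helpers, and height is counted in edges, so a single node has height 0.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(t, v):
    """Insert t into the subtree rooted at v; return the subtree root."""
    if v is None:
        return Node(t)
    if t < v.key:
        v.left = insert(t, v.left)
    elif t > v.key:
        v.right = insert(t, v.right)
    return v

def height(v):
    """Height in edges; the empty tree has height -1."""
    if v is None:
        return -1
    return 1 + max(height(v.left), height(v.right))

def build(keys):
    root = None
    for k in keys:
        root = insert(k, root)
    return root

chain = build([1, 2, 3, 4, 5, 6, 7])    # every key becomes a right child
bushy = build([4, 2, 6, 1, 3, 5, 7])    # medians first
```

Here `height(chain)` is 6 and `height(bushy)` is 2.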

Notes

BST remove(t)

We consider 3 cases, in increasing order of difficulty.

  1. Case 1: t is at a leaf (example figure #1):
    1. find the node v with key(v)=t
    2. delete v
  2. Case 2: t is at a node with 1 child (example figure #2 and example figure #3)
    1. find the node v with key(v)=t
    2. let u be the child of v
    3. replace v with the subtree rooted at u
  3. For case 3, see the next section

Example Figure #1

remove(7):

Example Figure #2

remove(3)

step 1 (original)

step 2

step 3

Example Figure #3

remove(10)

step 1 (original)

step 2

step 3

BST remove: Case 3 Preparation: Successors


If node v has a right child, it is easy to find its successor: succ(v) is the first node visited by an in-order traversal of the right subtree of v.

Ex. Six diagrams, all of which give v a right subtree: one of one node, one of one node with a left child, one with a left leaf and a right subtree of its own, and three variations on arbitrary numbers of children attached to the left node of v.

To find the successor of node v that has a right child, use:

succ(v){
  u<-right(v)
  while(left(u) exists){
    u<-left(u)
  }
  return u
}
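
The same loop in Python, as a minimal sketch. It assumes a `Node` class with `left` and `right` fields that are `None` when absent, and, like the pseudo-code, it assumes v has a right child.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def succ(v):
    """Successor of v, assuming v has a right child:
    the leftmost node of v's right subtree."""
    u = v.right
    while u.left is not None:
        u = u.left
    return u

# Hypothetical example tree:
#        5
#      /   \
#     3     8
#    / \   / \
#   1   4 6   9
#          \
#           7
six = Node(6, None, Node(7))
three = Node(3, Node(1), Node(4))
root = Node(5, three, Node(8, six, Node(9)))
```

In this tree `succ(root)` is the node with key 6, reached by going right once and then left as far as possible.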

BST remove(t)

Case 3: t is at a node with 2 children:

  1. find the node v with key(v)=t
  2. find the successor of v – call it u.
  3. key(v)<-key(u) //replace t with succ(t) at v.
  4. delete u:
    1. if u is a leaf, delete it.
    2. if u is not a leaf, it has one child w, replace u with the subtree rooted at w.

Notice: 4.1 is like case 1; 4.2 is like case 2.

BST remove(k) when node(k) has two children

Ex. to remove 5:

  1. Find 5
  2. Find successor of 5
  3. Replace 5 with its succ.
  4. In this example, succ(5) has no children so just delete the node where it was.

Example tree:

After switching 5 and succ(5):

(transcriber’s note: may be incorrect, but I’m writing what’s there)

Example tree 2:

To remove 6:

  1. Find 6
  2. Find successor of 6
  3. Replace 6 with its successor
  4. Replace succ(6) with its non-empty subtree

Tree:

Becomes, by step 4:

Complexity of BST Operations

Q: Can we always have short bushy BSTs?

T1: $h = ?$

T2: $h \cong n$

Perfect Binary Tree

1 (yes):

2 (no):

3 (yes):

4 (yes):

5 (no):

6 (no):

Existence of Optimal BSTs

Claim: For every set S of n keys, there exists a BST for S with height at most $1+\log_{2} n$

Proof: Let h be the smallest integer s.t. $2^{h} \geq n$, and let $m=2^{h}$. So,

$$2^{h} \geq n > 2^{h-1}\\ \log_{2} 2^{h} \geq \log_{2} n > \log_{2} 2^{h-1}\\ h \geq \log_{2} n > h-1\\ h < 1+\log_{2} n$$

let T be the perfect binary tree of height h

Label the first n nodes of T (as visited by an in-order traversal) with the keys of S, and delete the remaining nodes (to get $T^{1}$).

$T^{1}$ is a BST for S with height $h < 1+\log_{2} n$

So, there is always a BST with height $O(\log n)$.
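
As a quick numeric instance of the inequality chain in the proof, take n = 10 (a value chosen only for illustration):

```latex
% Worked instance of the bound for n = 10
2^{4} = 16 \geq 10 > 8 = 2^{3} \quad\Rightarrow\quad h = 4 \\
1 + \log_{2} 10 \approx 4.32 \quad\Rightarrow\quad h = 4 < 1 + \log_{2} 10
```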

Optimal BST Insertion Order

Given a set of keys, we can insert them so as to get a minimum height BST:

Consider:

Graph of a perfect tree with 4 levels. Every node has two children, except for the 8 leaves.

What can we say about the key at the root? It is the median key.

Observe: the first key inserted into a BST is at the root forever (unless we remove it from the BST).

Given a set of keys, we can insert them to get a minimum height BST:

(transcriber’s note: I may have done this wrong, the drawing of the following tree is very weird.)

* apply the “root is the median key” principle to each subtree.
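
One way to turn the "root is the median key" principle into an insertion order is sketched below. The function name `median_first_order` and the helpers `Node`, `insert` and `height` are illustrative. When the number of keys is even, this sketch takes the upper of the two middle keys as the median, which is one of two equally reasonable choices.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(t, v):
    """Insert t into the subtree rooted at v; return the subtree root."""
    if v is None:
        return Node(t)
    if t < v.key:
        v.left = insert(t, v.left)
    elif t > v.key:
        v.right = insert(t, v.right)
    return v

def height(v):
    """Height in edges; the empty tree has height -1."""
    return -1 if v is None else 1 + max(height(v.left), height(v.right))

def median_first_order(sorted_keys):
    """Insertion order: the median first, then the same rule on each half."""
    if not sorted_keys:
        return []
    mid = len(sorted_keys) // 2
    return ([sorted_keys[mid]]
            + median_first_order(sorted_keys[:mid])
            + median_first_order(sorted_keys[mid + 1:]))

order = median_first_order([1, 2, 3, 4, 5, 6, 7])   # [4, 2, 1, 3, 6, 5, 7]
root = None
for k in order:
    root = insert(k, root)
```

Inserting the seven keys in this order yields a perfect tree of height 2.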

So, there is always a BST with height $\cong\log n$

Can we maintain min. height with $O(\log n)$ time per operation as we insert and remove keys?

Consider A:

insert(1) would make it become B:

End (transcriber’s note: not the end)

(some repeated slides and graphics)

Notice:

Because a perfect binary tree of height h has:

Then: $2^{h} + 2^{h}-1 = 2\times 2^{h}-1 = 2^{h+1}-1$
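
The count can be checked by building small perfect trees. The sketch below reads the sum as $2^{h}$ leaves plus $2^{h}-1$ internal nodes, which is an interpretation since the slide's own list is not reproduced here. `Node`, `perfect`, `count_nodes` and `count_leaves` are illustrative names.

```python
class Node:
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right

def perfect(h):
    """Build a perfect binary tree of height h (a single node has height 0)."""
    if h == 0:
        return Node()
    return Node(perfect(h - 1), perfect(h - 1))

def count_nodes(v):
    return 0 if v is None else 1 + count_nodes(v.left) + count_nodes(v.right)

def count_leaves(v):
    if v is None:
        return 0
    if v.left is None and v.right is None:
        return 1
    return count_leaves(v.left) + count_leaves(v.right)

t3 = perfect(3)
leaves = count_leaves(t3)          # 8  = 2**3
total = count_nodes(t3)            # 15 = 2**4 - 1
```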

Actual end