Priority Queue & Heaps
PriorityQueue ADT (PQ)
- Stores a collection of pairs (item, priority)
- Priorities are from some ordered set. For simplicity, we use priorities from 0,1,2,… with 0 “highest priority”.
- Main operations:
  - insert(item, priority): adds item with priority priority.
  - extract_min(): removes (& returns) the item with the least priority.
  - update(item, priority): changes the priority of item to priority.
- We want a data structure that implements PQs efficiently (e.g. O(log n) time for all operations).
- We (again) will use a particular kind of tree.
Level-Order Traversal of ordered binary trees
- visits each node of the tree once.
- visits every node at depth i before any node at depth i+1*.
- visits every depth-d descendant of left(v) before any depth-d descendant of right(v).
Diagrams
Order of traversal, Diagram 1 (each node labeled by its visit order):
- 1
  - 2
    - 4
      - 8
      - 9
    - 5
      - 10
      - 11
  - 3
    - 6
      - 12
      - 13
    - 7
      - 14
      - 15
Order of traversal, Diagram 2 (each node labeled by its visit order):
- 1
  - 2
    - 4
      - 8
        - 14
        - 15
      - 9
    - 5
      - 10 (left child only)
        - 16 (left child only)
          - 20
          - 21
  - 3
    - 6
      - 11
        - 17
        - 18
          - 22 (right child only)
      - 12
    - 7
      - 13 (right child only)
        - 19
* in some texts, the traversal is bottom-up, not top-down.
Complete Binary Tree
A complete binary tree of height h is:
- a binary tree of height h;
- with 2^d nodes at depth d, for every d < h;
- whose level-order traversal visits every internal node before any leaf;
- in which every internal node is proper (i.e., has two children), except perhaps the last one in level order, which may have just a left child.
Diagrams
Example 1: X (4)
- root
  - child (right)
Example 2: checkmark
- root
  - child
  - child
Example 3: X (3)
- root
  - child
  - child
    - grandchild (left)
Example 4: X (4)
- root
  - child
    - grandchild (left)
  - child
    - grandchild (left)
Example 5: checkmark
- root
  - child
    - grandchild (left)
  - child
Example 6: checkmark
- root
  - child
    - grandchild
    - grandchild
  - child
Example 7: checkmark
- root
  - child
    - grandchild
    - grandchild
  - child
    - grandchild (left)
Example 8: X (5)
- root
  - child
    - grandchild
    - grandchild
      - great grandchild
      - great grandchild
  - child
    - grandchild
      - great grandchild
      - great grandchild
    - grandchild
Example 9: X (4)
- root
  - child
    - grandchild
      - great grandchild
      - great grandchild
    - grandchild
      - great grandchild (left)
  - child
    - grandchild
      - great grandchild
      - great grandchild
    - grandchild
Example 10: checkmark
- root
  - child
    - grandchild
      - great grandchild
      - great grandchild
    - grandchild
  - child
    - grandchild
    - grandchild
Example 11: checkmark
- root
  - child
    - grandchild
      - great grandchild
      - great grandchild
    - grandchild
      - great grandchild
      - great grandchild
  - child
    - grandchild
      - great grandchild
      - great grandchild
    - grandchild
      - great grandchild
      - great grandchild
Unlabeled tree on next slide:
- ...
  - node at arbitrary depth
    - child node
    - child node
  - node at arbitrary depth
    - child node
    - child node
  - node at arbitrary depth
    - child node
    - child node
  - node at arbitrary depth
    - child node (left)
  - node at arbitrary depth
Binary Heap Data Structure
- a complete binary tree (“shape invariant”)
- with vertices labeled by keys (that is: priorities) from some ordered set,
- s.t. key(parent(v)) <= key(v) for every non-root node v. (“order invariant”)
Example (checkmark):
- 1
  - 3
    - 6
      - 7
      - 9
    - 6
  - 2
    - 5
    - 8
Example (X):
- 1
  - 3 (highlighted arrow to its child 2: order violated)
    - 2
      - 7
      - 9
    - 2
  - 6 (highlighted connection to its child 5: order violated)
    - 5
    - 8
This is the basic DS for implementing PQs (binary min-heap).
- How do we implement the operations so that the invariants are maintained?
- Consider insertion: if we want to insert 14 into the heap, where should it go?
- root (complete tree)
  - ... (elided nodes/depths)
    - 10
      - 12
        - ...
      - 20
        - ...
    - 11
      - 13
        - ...
      - 19
        - ...
  - ... (elided nodes/depths)
Notice: there is no choice about how the shape changes:
Example 1:
- root
  - child
    - grandchild
      - inserted node (left)
    - grandchild
  - child
    - grandchild
    - grandchild
Example 2:
- root
  - child
    - grandchild
      - great grandchild
      - great grandchild
    - grandchild
      - great grandchild
      - inserted node
  - child
    - grandchild
    - grandchild
Example 3:
- root
  - child
    - grandchild
      - great grandchild
      - great grandchild
    - grandchild
      - great grandchild
      - great grandchild
  - child
    - grandchild
      - inserted node (left)
    - grandchild
Heap Insert
To insert an item with key k:
- add a new leaf v with key(v)=k, so as to maintain the shape invariant.
- re-establish the order invariant by executing percolate_up(v).
Code:
percolate_up(v){
  while(v is not root and key(v) < key(parent(v))){
    swap positions of v and parent(v) in the tree
  }
}
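The pseudocode above can be sketched in runnable Python, using the array embedding that appears later in these notes (the parent of index v is (v-1)//2); the function name and the list-based representation are our choices, not the slides':

```python
def percolate_up(heap, v):
    """Move heap[v] up until its parent is no larger (min-heap order).

    `heap` is a Python list holding the level-order array embedding of
    the tree, so swapping positions is just swapping list entries.
    """
    while v > 0 and heap[v] < heap[(v - 1) // 2]:
        p = (v - 1) // 2                       # parent of v
        heap[v], heap[p] = heap[p], heap[v]    # swap v with its parent
        v = p
    return v                                   # final position of the key
```

For example, appending 2 to the heap [1, 5, 6, 7, 9, 10, 8, 12, 14] and percolating it up from index 9 moves it past the keys 9 and 5, mirroring the "Insert 2" example below.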
Insert 2, then 4, then 3 into:
Original:
- 1
  - 5
    - 7
      - 12
      - 14
    - 9
  - 6
    - 10
    - 8
Insert 2 (the new leaf 2 percolates up past 9 and 5):
- 1
  - 2 (5 crossed out)
    - 7
      - 12
      - 14
    - 5 (9, 2 crossed out)
      - 9 (left; 2 crossed out)
  - 6
    - 10
    - 8
Insert 4 (the new leaf 4 percolates up past 5):
- 1
  - 2
    - 7
      - 12
      - 14
    - 4 (5 crossed out)
      - 9
      - 5 (4 crossed out)
  - 6
    - 10
    - 8
Insert 3 (the new leaf 3 percolates up past 10 and 6):
- 1
  - 2
    - 7
      - 12
      - 14
    - 4
      - 9
      - 5
  - 3 (6 crossed out)
    - 6 (10, 3 crossed out)
      - 10 (left; 3 crossed out)
    - 8
Becomes:
- 1
  - 2
    - ...
  - 3
    - 6
      - 10 (left)
    - 8
Heap Extract-Min:
Consider:
- 5
  - ... (left subtree)
  - ... (right subtree)
If we delete the root, we must fill the hole with the smaller of its children:
Diagram labeled “OK” (the smaller child, 6, moves to the root):
- ?
  - 6 (arrow towards root)
    - 10
    - 12
  - 7
Diagram labeled “NOT OK” (7 moves up even though its sibling 6 is smaller):
- ?
  - 7 (arrow towards root)
    - 10
    - 12
  - 6
Heap Extract-Min
To remove the (item with the) smallest key from the heap:
- remove the root;
- replace the root with the “last leaf”, so as to maintain the shape invariant;
- restore the order invariant by calling percolate_down(root).
percolate_down is more work than percolate_up, because it must look at both children to see what to do (and the children may or may not exist).
Code:
percolate_down(v){
  while(v has a child c with key(c) < key(v)){
    c <- the child of v with the smallest key among the children of v
    swap v and c in the tree
  }
}
Notice that:
- v may have 0, 1 or 2 children
- if v has 2 children, we care about the one with the smallest key.
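The case analysis above can be sketched in Python on the array embedding (again, the function name and list representation are our choices):

```python
def percolate_down(heap, v):
    """Move heap[v] down while some child has a smaller key (min-heap)."""
    n = len(heap)
    while 2 * v + 1 < n:                  # v has at least a left child
        c = 2 * v + 1                     # left child of v
        if c + 1 < n and heap[c + 1] < heap[c]:
            c += 1                        # right child exists and is smaller
        if heap[c] < heap[v]:             # order violated: swap and continue
            heap[v], heap[c] = heap[c], heap[v]
            v = c
        else:
            break                         # order restored
    return v
```

Running it on [7, 6, 4, 12, 8, 11] (the state just after the last leaf 7 has replaced the root 1 in the example below) sinks 7 past 4, reproducing the "First extract-min" picture.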
Do extract-min 3 times
Original:
- 1
  - 6
    - 12
    - 8
  - 4
    - 11
    - 7
First extract-min (last leaf 7 moves to the root, then percolates down):
- 4 (7, 1 crossed out)
  - 6
    - 12
    - 8
  - 7 (4 crossed out)
    - 11
Second extract-min (last leaf 11 moves to the root, then percolates down):
- 6 (4, 11 crossed out)
  - 8 (6, 11 crossed out)
    - 12
    - 11 (8 crossed out)
  - 7
Third extract-min (last leaf 11 moves to the root, then percolates down):
- 7 (6, 11 crossed out)
  - 8
    - 12
  - 11 (7 crossed out)
Final form:
- 7
  - 8
    - 12 (left)
  - 11
Complexity of Heap Insert & Extract-min
- Claim: Insert & Extract-min take time O(log n) for heaps of size n.
- Recall: A perfect binary tree of height h has 2^(h+1) - 1 nodes.
- P.f.: By induction on h (or “the structure of the tree”).
  - Basis: If h=0 then we have 1 = 2^1 - 1 node. (checkmark)
  - I.H.: Consider some h >= 0 and assume the perfect binary tree of height h has 2^(h+1) - 1 nodes.
  - I.S.: Show the p.b.t. of height h+1 has 2^(h+2) - 1 nodes.
  - The tree is: a root whose left and right subtrees are p.b.t.s of height h.
  - So it has 2(2^(h+1) - 1) + 1 = 2^(h+2) - 1 nodes. (QED)
Size bounds on complete binary trees
- Every complete binary tree with height h and n nodes satisfies 2^h <= n <= 2^(h+1) - 1:
  - Smallest: a p.b.t. of height h-1 with one extra node attached at the far left of depth h; #nodes = 2^h.
  - Largest: a p.b.t. of height h, fully filled; #nodes = 2^(h+1) - 1.
- So, we have h <= log2(n) < h+1, i.e. h = Theta(log n).
Heap insert & extract-min take time O(log n).
Linked Implementation of Heap
Node:
- data
- left
- right
- parent
Array-Based Binary Heap Implementation
Uses this embedding of a complete binary tree of size n in a size-n array:
Tree version (nodes labeled by their level-order index):
- 0
  - 1
    - 3
      - 7
      - 8
    - 4
      - 9
      - 10
  - 2
    - 5
      - 11
      - 12 (inserted)
    - 6
Becomes, array version: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 (inserted)]
The ith node in the level-order traversal becomes the ith array element.
- Children of node i are nodes 2i+1 & 2i+2.
- Parent of node i is node floor((i-1)/2).
(The slide also shows the array indices 0..12 with arrows linking each index to its children and parent.)
* Growing and shrinking the tree is easy in the array embedding.
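The index arithmetic can be captured in three one-line helpers (the names are ours):

```python
def left(i):
    """Index of the left child of node i in the array embedding."""
    return 2 * i + 1

def right(i):
    """Index of the right child of node i."""
    return 2 * i + 2

def parent(i):
    """Index of the parent of node i (floor division handles the floor)."""
    return (i - 1) // 2
```

For instance, in the tree above the children of node 5 are nodes 11 and 12, and both have parent 5.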
Partially-filled Array Implementation of Binary Heap: Insert
Original:
- 2
  - 7
    - 8
    - 10
  - 6
    - 9 (left)
equals, array implementation: [2, 7, 6, 8, 10, 9]
Insert 1 (as the new last leaf):
- 2
  - 7
    - 8
    - 10
  - 6
    - 9
    - 1 (inserted)
array implementation: [2, 7, 6, 8, 10, 9, 1]
Becomes (after percolate_up):
- 1
  - 7
    - 8
    - 10
  - 2
    - 9
    - 6
Array implementation: [1, 7, 2, 8, 10, 9, 6]
Additional diagram (showing the percolate_up swaps):
- 1 (2 crossed out, arrow to 2)
  - 7
    - 8
    - 10
  - 2 (6 crossed out, arrow to 6)
    - 9
    - 6 (1 crossed out)
In array form: [1 (2 crossed out), 7, 2 (6, 1 crossed out), 8, 10, 9, 6 (1 crossed out)]
Insert for Array-based Heap
- Variables: array A, size
- Heap elements are in A[0..size-1]
insert(k){
  A[size] <- k // Add k at the new 'last leaf'
  v <- size
  p <- floor((v-1)/2) // p <- parent(v); percolate_up
  while(v > 0 and A[v] < A[p]){
    swap A[v] and A[p]
    v <- p
    p <- floor((v-1)/2)
  } // end of percolate_up
  size <- size + 1
}
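A direct Python translation of this pseudocode, keeping the fixed array A and the size counter explicit (the class name is ours; a real implementation would also guard against overflowing the capacity):

```python
class ArrayHeap:
    """Min-heap in a partially-filled array, as in the slides.

    A Python list plays the role of the fixed-size array A; `size`
    counts the occupied prefix A[0..size-1].  Assumes capacity is
    never exceeded.
    """
    def __init__(self, capacity):
        self.A = [None] * capacity
        self.size = 0

    def insert(self, k):
        A = self.A
        A[self.size] = k              # add k at the new 'last leaf'
        v = self.size
        p = (v - 1) // 2              # p <- parent(v); percolate_up
        while v > 0 and A[v] < A[p]:
            A[v], A[p] = A[p], A[v]
            v = p
            p = (v - 1) // 2
        self.size += 1                # end of percolate_up
```

Inserting 2, 7, 6, 8, 10, 9 and then 1 reproduces the slide's example: the array becomes [1, 7, 2, 8, 10, 9, 6].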
Partially-filled Array Implementation of Binary Heap: Extract-min
Original tree:
- 2
  - 7
    - 8
    - 10
  - 6
    - 9
    - 11
Array implementation: [2, 7, 6, 8, 10, 9, 11]
After extract-min, tree:
- 6
  - 7
    - 8
    - 10
  - 9
    - 11 (left)
Array implementation: [6, 7, 9, 8, 10, 11]
After another extract-min, tree:
- 7
  - 8
    - 11
    - 10
  - 9
Array implementation: [7, 8, 9, 11, 10]
Extract_min for Array-based Heap
Code:
extract_min(){
  temp <- A[0] // record value to return
  size <- size - 1
  A[0] <- A[size] // move *old* last leaf to root
  i <- 0 // percolate down
  while(2i+1 < size){ // while i is not a leaf
    child <- 2i+1 // the left child of i
    if(2i+2 < size and A[2i+2] < A[2i+1]){
      child <- 2i+2 // use the right child if it exists and has a smaller key
    }
    if(A[child] < A[i]){ // if order violated,
      swap A[child] and A[i] // swap parent & child
      i <- child // continue from the child's position
    } else {
      return temp
    }
  } // end of percolate-down
  return temp
}
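A runnable Python version, operating on a plain list so that `size` is just `len(heap)` (the function name is ours):

```python
def extract_min(heap):
    """Remove and return the smallest key of a non-empty min-heap list."""
    temp = heap[0]                    # record value to return
    last = heap.pop()                 # remove the old last leaf
    if heap:
        heap[0] = last                # move it to the root
        i = 0                         # percolate down
        while 2 * i + 1 < len(heap):  # while i is not a leaf
            child = 2 * i + 1         # the left child of i
            if child + 1 < len(heap) and heap[child + 1] < heap[child]:
                child += 1            # right child exists and is smaller
            if heap[child] < heap[i]:
                heap[i], heap[child] = heap[child], heap[i]
                i = child
            else:
                break
    return temp
```

Two calls on the slide's array [2, 7, 6, 8, 10, 9, 11] return 2 and then 6, leaving [7, 8, 9, 11, 10], matching the diagrams above.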
A small space-for-time trade-off in Extract-min
- Extract-min does many comparisons, e.g. (2i+1 < size), to check whether i is a leaf.
- Suppose we make the array big enough that positions 2i+1 and 2i+2 always exist, and there is a big value, denoted infinity, that can be stored in the array but will never be a key, and every array entry that is not a key is infinity.
- Then we can skip the explicit checks for being a leaf.
Extract-min variant
Code:
extract_min(){
  temp <- A[0] // record value to return
  size <- size - 1
  A[0] <- A[size] // move *old* last leaf to root
  A[size] <- infinity // **
  i <- 0 // percolate down
  while(A[2i+1] < A[i] or A[2i+2] < A[i]){ // i has a child that is out of order
    if(A[2i+1] < A[2i+2]){ // it is the left child
      swap A[2i+1] and A[i]
      i <- 2i+1
    } else { // it is the right child
      swap A[2i+2] and A[i]
      i <- 2i+2
    }
  }
  return temp
}
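A Python sketch of this sentinel variant, assuming the backing array is allocated large enough that A[2i+1] and A[2i+2] always exist for every position the percolation can reach, with float('inf') as the sentinel (names and the returned (min, size) pair are our choices):

```python
INF = float('inf')   # the 'big value' that is never a key

def extract_min_sentinel(A, size):
    """Extract-min without explicit leaf checks.

    Assumes A[0..size-1] holds the keys, every other slot of A holds
    INF, and len(A) >= 2*size + 2.  Returns (min_key, new_size).
    """
    temp = A[0]                       # record value to return
    size -= 1
    A[0] = A[size]                    # move old last leaf to root
    A[size] = INF                     # freed slot becomes a sentinel
    i = 0
    while A[2*i + 1] < A[i] or A[2*i + 2] < A[i]:
        if A[2*i + 1] < A[2*i + 2]:   # left child is the smaller one
            A[2*i + 1], A[i] = A[i], A[2*i + 1]
            i = 2*i + 1
        else:                         # right child is the smaller one
            A[2*i + 2], A[i] = A[i], A[2*i + 2]
            i = 2*i + 2
    return temp, size
```

Because sentinel slots hold INF, the loop condition is never true at a leaf, so the `2i+1 < size` tests disappear.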
Making a Heap from a Set
- Suppose you have n keys and want to make a heap with them.
- Clearly it can be done in time O(n log n) with n inserts.
- Claim: the following alg. does it in time O(n).
make_heap(T){
  // T is a complete b.t. with n keys.
  for(i = floor(n/2)-1 down to 0){
    call percolate_down on node i
  }
}
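In the array embedding this becomes a short in-place routine (percolate_down is inlined here so the sketch is self-contained):

```python
def make_heap(A):
    """Turn list A into a min-heap in place, in O(n) time.

    Percolates down at each internal node, from the last internal
    node floor(n/2)-1 back to the root.
    """
    n = len(A)
    for i in range(n // 2 - 1, -1, -1):
        v = i                              # percolate_down at node i
        while 2 * v + 1 < n:
            c = 2 * v + 1
            if c + 1 < n and A[c + 1] < A[c]:
                c += 1
            if A[c] < A[v]:
                A[v], A[c] = A[c], A[v]
                v = c
            else:
                break
```

On [10, 9, 8, 7, 6, 5, 4, 3, 2, 1] this produces [1, 2, 4, 3, 6, 5, 8, 10, 7, 9], the heap shown in the make-heap example below.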
How does make-heap work?
- Node floor(n/2)-1 is the last internal node.
- The algorithm does a percolate-down at each internal node, working bottom-up.
- (percolate_down makes a tree into a heap if the only node violating the order property is the root.)
Tree diagram (nodes labeled by their level-order index):
- 0
  - 1
    - 3
      - 7
        - 15
        - 16
      - 8
        - 17
        - 18
    - 4
      - 9 (label: last internal node)
        - 19
        - 20
      - 10
  - 2
    - 5
      - 11
      - 12
    - 6
      - 13
      - 14
Last internal node: here n = 21, so floor(n/2)-1 = 9.
Make heap example (labels show “value (level-order index)”):
- 10 (0)
  - 9 (1)
    - 7 (3)
      - 3
      - 2
    - 6 (4)
      - 1 (left)
  - 8 (2)
    - 5
    - 4
Note: n = 10; last internal node = floor(n/2)-1 = 4.
Notice: The exact order of visiting nodes does not matter – as long as we visit children before parents. [It follows that it is easy to do a recursive make-heap.]
Make heap example, result (crossed-out values show what each node held earlier; checkmarks mark nodes already done):
- 1 (0; 10 crossed out)
  - 2 (1; 9, 10 crossed out; checkmark)
    - 3 (3; 10, 2, 7 crossed out; checkmark)
      - 10 (3 crossed out; checkmark)
      - 7 (2 crossed out; checkmark)
    - 6 (4; 6, 1 crossed out; checkmark)
      - 9 (left; 1, 6 crossed out)
  - 4 (2; 8 crossed out; checkmark)
    - 5
    - 8 (4 crossed out)
Make-heap Complexity
- Clearly O(n log n): n percolate-down calls, each O(log n).
- How can we see it is actually O(n)?
- Intuition: mark a distinct edge for every possible swap. (Time taken is bounded by the max # of swaps possible.)
Diagram: a perfect binary tree with h=5, missing the rightmost 4 nodes at depth 5. (A p.b.t. is easier to analyze than an arbitrary complete tree.)
Time Complexity of Make-heap
- Let S(n) be the max number of swaps carried out by make-heap on a set of size n.
- We can bound S(n) by:
  S(n) <= sum_{d=0}^{h-1} 2^d (h-d)
because:
- percolate_down is called at most once on each node at each depth d from 0 to h-1;
- there are at most 2^d nodes at depth d;
- the max # of swaps for a call to percolate_down on a node at depth d is h-d.
Set i = h-d; while d ranges over 0..h-1, i ranges over 1..h.
Now:
  S(n) <= sum_{d=0}^{h-1} 2^d (h-d) = sum_{i=1}^{h} i 2^(h-i) = 2^h sum_{i=1}^{h} i/2^i <= 2^h * 2 <= 2n
using sum_{i>=1} i/2^i = 2 and 2^h <= n.
Complexity of Make-heap
Work done by make-heap is bounded by a constant times the number of swaps, so it is O(n).
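A quick empirical check of the S(n) <= 2n bound: instrumenting make-heap to count its swaps and comparing against 2n on random inputs (the instrumented function is our own sketch):

```python
import random

def make_heap_count(A):
    """make_heap on list A, in place, returning the number of swaps."""
    n, swaps = len(A), 0
    for i in range(n // 2 - 1, -1, -1):
        v = i                              # percolate_down at node i
        while 2 * v + 1 < n:
            c = 2 * v + 1
            if c + 1 < n and A[c + 1] < A[c]:
                c += 1
            if A[c] < A[v]:
                A[v], A[c] = A[c], A[v]
                swaps += 1
                v = c
            else:
                break
    return swaps
```

The swap count stays below 2n even for inputs in reverse sorted order, which force the most percolation work.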
Updating Priorities
- Suppose a heap contains an item with priority k, and we execute update_priority(item, j).
- We replace k with j in the heap, and then restore the order invariant:
  - if j < k, do percolate_up from the modified node;
  - if k < j, do percolate_down from the modified node.
Tree 1: an unlabeled tree, highlighting the path from the modified node up to the root (the nodes percolate_up may visit).
Tree 2: an unlabeled tree, highlighting the subtree below the modified node (the nodes percolate_down may visit).
- This percolation takes O(log n) time – but how do we find the right node to change?
- To do this we need an auxiliary data structure.
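One common auxiliary structure is a dictionary mapping each item to its current array index, kept up to date on every swap. A sketch under our own naming (the slides do not specify the structure; items are assumed distinct and hashable, and entries are (priority, item) pairs compared lexicographically):

```python
class IndexedHeap:
    """Min-heap of (priority, item) pairs plus an item -> index map,
    so update_priority can locate the node in O(1)."""

    def __init__(self):
        self.A = []      # array embedding: list of (priority, item)
        self.pos = {}    # item -> index of its pair in A

    def _swap(self, i, j):
        A, pos = self.A, self.pos
        A[i], A[j] = A[j], A[i]
        pos[A[i][1]], pos[A[j][1]] = i, j   # keep the map in sync

    def _up(self, v):
        while v > 0 and self.A[v] < self.A[(v - 1) // 2]:
            self._swap(v, (v - 1) // 2)
            v = (v - 1) // 2

    def _down(self, v):
        n = len(self.A)
        while 2 * v + 1 < n:
            c = 2 * v + 1
            if c + 1 < n and self.A[c + 1] < self.A[c]:
                c += 1
            if self.A[c] < self.A[v]:
                self._swap(v, c)
                v = c
            else:
                break

    def insert(self, item, priority):
        self.A.append((priority, item))
        self.pos[item] = len(self.A) - 1
        self._up(len(self.A) - 1)

    def update_priority(self, item, priority):
        v = self.pos[item]                  # O(1) lookup -- the point
        old, _ = self.A[v]
        self.A[v] = (priority, item)
        if priority < old:
            self._up(v)                     # j < k: percolate up
        else:
            self._down(v)                   # k < j: percolate down
```

The extra dictionary costs O(n) space and O(1) extra work per swap, and makes update_priority O(log n) overall.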
End (transcriber’s note: not the end)
Correctness of swapping in percolate_down
- b
  - a
    - ...
  - c
    - d
      - ...
    - e
      - ...
- Suppose we are percolating down c.
- Then c and b were previously swapped, so we know b < c, b <= a, b <= d, and b <= e.
- If e < c and e <= d, we swap c and e.
Now:
- b
  - a
    - ...
  - e
    - d
      - ...
    - c
      - ...
- We know b <= e, e <= d, and e < c.
- So order is OK, except possibly below c – which we will have to look at.
Correctness of swapping in percolate_up
- b
  - a
    - ...
  - c
    - d
      - ...
    - e
      - ...
- Suppose we are percolating up c.
- We know c < d and c < e, because we previously swapped c with d or e.
- We know that b <= a.
- If c < b, we swap c and b.
Now:
- c
  - a
    - ...
  - b
    - d
      - ...
    - e
      - ... (T3)
- We know that c < b <= a, c < d, and c < e.
- So order is OK, except possibly with ancestors of c, which we still must check.