Sorting
- re-arranging elements of a sequence
- We will look at 5 sorting algorithms:
- 3 iterative
- 2 recursive
The iterative algorithms
- maintain a partition: “unsorted part” & “sorted part”
- sort a sequence of n elements in n-1 stages
- at each stage, move 1 element from the unsorted part to the sorted part:
- (Diagram of a generic array with unseen “sorted” items on the left and “unsorted” elements on the right. Caption: “1 stage moves 1 element”)
sort(A){
* initialize
* repeat n-1 times
* move 1 element from the unsorted part to the sorted part
}
- The algorithms differ in how they:
- select an element to remove from the unsorted part
- insert it into the sorted part
Insertion Sort
- Initially sorted part is just A[0]
- Repeat n-1 times
- remove the first element from the unsorted part
- insert it into the sorted part (shifting elements to the right as needed)
Diagram of array as it gets sorted in three stages:
- Stage 1: sorted is leftmost (0th) element; n-1 elements are unsorted on the right.
- Stage 2: approximately half of the array is sorted; an arrow points from the leftmost value inside the unsorted side to an arbitrary position inside the sorted side.
- Stage 3: just over half of the array is sorted now.
Code:
insertion_sort(A){
for(i=1 to n-1){
pivot = A[i] // first element in unsorted part
j=i-1
// The following loop shifts all elements in the sorted part that are larger than pivot 1 "to the right"
while(j>=0 AND A[j] > pivot){
A[j+1] = A[j] // shift jth
j = j-1
}
A[j+1] = pivot // move pivot into position.
}
}
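The pseudocode above can be rendered as a runnable Python sketch (the function name and in-place style are my own choices, not from the slides):

```python
def insertion_sort(a):
    """Sort list a in place; the sorted part grows from the left."""
    for i in range(1, len(a)):
        pivot = a[i]          # first element of the unsorted part
        j = i - 1
        # shift sorted elements larger than pivot one slot to the right
        while j >= 0 and a[j] > pivot:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = pivot      # drop pivot into the gap
    return a
```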
Insertion Sort Example
Stages:
- Stage 0: Original
-
5 4 2 6 1 3
-
- Stage 1: (label: 4)
-
4 5 2 6 1 3
-
- Stage 2: (label: 2)
-
2 4 5 6 1 3
-
- Stage 3: (label: 6)
-
2 4 5 6 1 3
-
- Stage 4: (label: 1)
-
1 2 4 5 6 3
-
- Stage 5: (label: 3)
-
1 2 3 4 5 6
-
Selection Sort
- initially sorted part is empty
- repeat n-1 times
- find the smallest element in the unsorted part
- move it to the first position of the unsorted part, which then becomes the new last position of the sorted part.
Diagram of parts:
- Initially, the entire array is all unsorted.
- Over time the sorted elements stack up on the left.
- Every time an element is moved, it is moved from the unsorted part (lowest element) and swapped with the element just after the end of the sorted part, making the sorted part one element bigger.
- Eventually all elements are sorted in ascending order.
Code:
selection_sort(A){
for(i=1 to n-1){
// find min element of unsorted
j=i-1 // j is index of min found so far.
k=i
while(k<n){
if(A[k]<A[j]) j=k;
k=k+1
}
swap A[i-1] and A[j]
}
}
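A runnable Python sketch of the selection sort pseudocode above (names are mine; behavior follows the slides' 1-based stage loop):

```python
def selection_sort(a):
    """Sort list a in place; each stage moves the minimum of the unsorted part."""
    n = len(a)
    for i in range(1, n):
        j = i - 1                 # index of the smallest element found so far
        for k in range(i, n):
            if a[k] < a[j]:
                j = k
        a[i - 1], a[j] = a[j], a[i - 1]   # grow the sorted part by one
    return a
```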
Process of Selection Sort:
- Original: all unsorted
-
5 4 2 6 1 3
-
- Stage 1: [0] is sorted; 1 and 5 swap
-
1 4 2 6 5 3
-
- Stage 2: [0..1] is sorted; 2 and 4 swap
-
1 2 4 6 5 3
-
- Stage 3: [0..2] is sorted; 3 and 4 swap
-
1 2 3 6 5 4
-
- Stage 4: [0..3] is sorted; 4 and 6 swap
-
1 2 3 4 5 6
-
- Stage 5: [0..4] is sorted; 5 swaps with itself (final stage)
-
1 2 3 4 5 6
-
Heapsort (Selection Sort is crossed out)
- Initially sorted part empty
- (highlighted) make unsorted part into a heap
- repeat n-1 times
- find the smallest element in the unsorted part (Note: heap extract takes log(n) time vs. Θ(n) for the scan in selection sort)
- move it to the first position, which becomes the new last position of the sorted part.
Consider the organization of array contents:
- (Diagram of array with sorted half on the right and the unsorted half on the left.) A purple arrow points to the leftmost element in the unsorted portion. The note reads: “if this is the root of the heap, then it is also the smallest element in the unsorted part, so is in its correct final position. To use this arrangement, the root of the heap keeps moving, so we have lots of shifting to do.”
- (A diagram showing the same array with sorted and unsorted halves.) A purple arrow points to the last element in the array; it points to a purple circle. A purple square is at the leftmost element of the unsorted half (the one discussed in the last item). The note reads: “If this is the root of the heap, then everything works:
- We extract the final element (purple circle); move the last leaf (purple square) to the root + do a percolate-down; store the final element (purple circle) where the last element of the unsorted list (purple square) was, which is now free, and is the correct final location for the previously final element (purple circle); after which we have:
- (Diagram of array with the “sorted” half extended one cell over to encompass the purple circle) * But: we must re-code our heap implementation s.t. the root is at A[n-1], with the result that the indexing is now less intuitive.
- Instead, we use a max-heap, and this arrangement:
- (Diagram showcasing, as previously, a sorted half to the right and an unsorted half on the left. An orange circle labeled “root of heap” is the very first element of the list and the unsorted half; an orange square labeled “last leaf” sits at the end (rightmost side) of the unsorted half.)
- The heap root is at A[0]
- Heap Extraction remove the root of the heap (orange circle), moves the last leaf (orange square) to A[0], freeing up the spot where the root of the heap (orange circle) belongs.
- This leaves us with: (Diagram of the orange circle near the middle of the array, at the leftmost portion of the sorted half. The orange square is in the center of the unsorted half.)
- Re-coding a min heap into a max heap is just replacing < with > and vice versa.
Heapsort (Selection Sort is crossed out)
- initially sorted part empty
- (highlighted) make unsorted part into a max heap
- repeat n-1 times:
- find the largest (smallest is crossed out) element in the unsorted part
- move it to the last (first is crossed out) position which becomes the new first (last is crossed out) position of the sorted part.
Code:
heapsort(A){
buildMaxHeap(A)
for(i=1 to n-1){
A[n-i] <- extractMax()
}
}
Stages of sorting:
- (Diagram of unsorted array with first element labeled as “heap with max here”.)
- (Diagram of a half-sorted array showing the swap between the first and last elements of the unsorted portion of the array. Labeled as “take max element from root…” and “take last leaf from end of heap” with arrows pointing to one another.)
- (Diagram of a half+1 sorted array, displaying the new sorted element that has been swapped from the root element of the heap. Labeled as “newest element of sorted part” and “this is the final location” (the new element just swapped), and “new root of heap (which then gets percolated down)” (what is now the first element of the array, which was also just swapped).)
The unsorted part of size 1 contains the smallest remaining element, so the array is sorted.
Heapsort with in-line percolate-down
Code:
heapsort(A){
makeMaxHeap(A)
for(i=1 to n-1){
swap A[0] and A[n-i] // move last leaf to root and old root (the max) to where last leaf was
size <- n-i // size of heap = size of remaining unsorted part
// start of percolate down
j <- 0
while(2j+1 < size){
child <- 2j+1
if(2j+2 < size AND A[2j+2] > A[2j+1]){
child <- 2j+2
}
if(A[child]>A[j]){
swap A[child] and A[j]
j <- child
} else {
j <- size // terminate the while
}
} // end of percolate down
}
}
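The in-line percolate-down version can be sketched in runnable Python (the max-heap build step is written as repeated percolate-downs over the internal nodes; names are my own):

```python
def heapsort(a):
    """Sort list a in place: build a max-heap, then repeatedly move the max to the end."""
    n = len(a)

    def percolate_down(j, size):
        # push a[j] down until neither child is larger (max-heap order)
        while 2 * j + 1 < size:
            child = 2 * j + 1
            if 2 * j + 2 < size and a[2 * j + 2] > a[2 * j + 1]:
                child = 2 * j + 2
            if a[child] > a[j]:
                a[j], a[child] = a[child], a[j]
                j = child
            else:
                break

    # build max-heap: percolate down every internal node, bottom-up
    for j in range(n // 2 - 1, -1, -1):
        percolate_down(j, n)

    for i in range(1, n):
        a[0], a[n - i] = a[n - i], a[0]   # move current max to its final spot
        percolate_down(0, n - i)          # re-heap the shrunken unsorted part
    return a
```

On the slides' example input [5, 4, 2, 6, 1, 3], the build step produces the heap [6, 5, 3, 4, 1, 2] shown in the example.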
Heapsort Example
- Original:
-
5 4 2 6 1 3
-
- Turn into heap:
-
6 5 3 4 1 2
-
- Swap root (6) and last unsorted element (2):
-
2 5 3 4 1 6
-
- Re-heap the unsorted portion: [0..4]
-
5 4 3 2 1 6
-
- Swap root (5) and the last unsorted element (1):
-
1 4 3 2 5 6
-
- Re-heap the unsorted portion: [0..3]
-
4 2 3 1 5 6
-
- Swap root (4) and the last unsorted element (1):
-
1 2 3 4 5 6
-
- Re-heap unsorted portion: [0..2]
-
3 2 1 4 5 6
-
- Swap root (3) and last unsorted element (1):
-
1 2 3 4 5 6
-
- Re-heap unsorted portion: [0..1]
-
2 1 3 4 5 6
-
- Swap root (2) and last unsorted element (1):
-
1 2 3 4 5 6
-
- Array is sorted because unsorted portion is only 1 element.
Tree version of above (heap drawn as a binary tree at each stage):
- Original ([5 4 2 6 1 3] as a tree): root 5; children 4 and 2; 4’s children 6 and 1; 2’s left child 3.
- After re-heap and one removal ([5 4 3 2 1]): root 5; children 4 and 3; 4’s children 2 and 1.
- After a second re-heap and removal ([4 2 3 1]): root 4; children 2 and 3; 2’s left child 1.
- After a third ([3 2 1]): root 3; children 2 and 1.
Examples stop here.
Heapsort Example (2)
(The same stages as above, drawn as trees annotated in place: at each swap or percolate-down step the old value in a node is crossed out and the incoming value written beside it in orange — e.g. the root 2 is crossed out and replaced by 5 as 5 percolates up to the root, then 4 replaces 2 one level down.)
Time Complexity of Iterative Sorting Algorithms
- each algorithm does exactly n-1 stages
- the work done at the ith stage varies with the algorithm (& input).
- we take # of item comparisons as a measure of work/time*.
- Selection Sort
- exactly n-i comparisons to find min element in unsorted part
- Insertion Sort
- between 1 and i comparisons to find location for pivot
- HeapSort
- between 1 and ≈2·log2(n) comparisons for percolate-down (at most 2 per level of the heap)
* Number of comparisons
- We must verify # comparisons (or some constant times # comparisons) is an upper bound on work done by each algorithm.
- # of assignments (& swaps) also matters in actual run time.
Selection Sort
On input of size n, # of comparisons is always (regardless of input): (n-1) + (n-2) + … + 1 = n(n-1)/2 = Θ(n²).
Insertion Sort – Worst Case
Upper Bound: at most i comparisons in the ith stage (the sorted part has size i), so at most 1 + 2 + … + (n-1) = n(n-1)/2 = O(n²) in total.
Lower Bound:
- Worst case initial sequence is in reverse order. e.g.:
-
n n-1 n-2 … 1
-
- In the ith stage we have:
-
sorted: n-i+1 n-i+2 … n-1 n | unsorted: n-i n-i-1 … 2 1
which becomes
sorted: n-i n-i+1 … n-1 n | unsorted: n-i-1 … 2 1
-
- This takes i comparisons, because the sorted part is of size i.
- So, the total on this input is 1 + 2 + … + (n-1) = n(n-1)/2.
So, insertion sort worst case is Θ(n²).
(Θ is justified for the worst case here: the O(n²) upper bound is matched by the Ω(n²) lower bound from the reverse-ordered input.)
Insertion Sort Best Case
Best case: initial sequence is fully ordered.
Then: In each stage, exactly 1 comparison is made.
So, # of comparisons is exactly n-1, i.e. Θ(n).
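The best- and worst-case comparison counts can be checked empirically with a small instrumented insertion sort (a sketch; the counter and function name are mine):

```python
def insertion_sort_comparisons(a):
    """Return the number of item comparisons insertion sort makes on input a."""
    a = list(a)
    count = 0
    for i in range(1, len(a)):
        pivot, j = a[i], i - 1
        while j >= 0:
            count += 1            # one item comparison per inner-loop test
            if a[j] > pivot:
                a[j + 1] = a[j]   # shift jth element right
                j -= 1
            else:
                break
        a[j + 1] = pivot
    return count
```

On a reverse-ordered input of size n this returns n(n-1)/2, and on an already-sorted input it returns n-1, matching the bounds above.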
Heapsort Worst Case
Upper bound: O(n) to build the heap, plus n-1 percolate-downs of O(log n) each, for O(n log n) in total.
Lower Bound? (empty space)
Best Case? (What input would lead to no movement during percolate-down? What if we exclude this case?)
Recursive Divide & Conquer Sorting
- Partition the sequence A into two parts A1, A2
- Recursively sort each of A1 and A2
- Combine the sorted versions of A1 and A2 to obtain a sorted version of A.
Diagram showing A: (Back to second occurrence of the diagram)
- Original
-
20 … 30 … 1 … 6
-
- Partition
- Two separate arrays with no items shown.
- Combine
-
1 … 6 … 20 … 30
-
The algorithms differ in how they choose the partition, and how they combine the sorted parts.
Mergesort
- Uses the fact that merging two sorted lists is easy.
- Example 1:
- Two lists:
-
2 (crossed out) 5 (crossed out) 6 (crossed out) 9 … -
1 (crossed out) 3 (crossed out) 4 (crossed out) 8 …
-
- Combined:
-
1 2 3 4 5 6 …
-
- Example 2:
- Two lists:
-
… 90 (crossed out) 91 (crossed out) -
… 80 (crossed out) 100 101 102 103
-
- Combined:
-
… 80 90 91 100 101 102 103
-
Takes O(n) time, where n is the total size.
Mergesort
- Partition: first half & second half.
- Combine: merge the parts
Diagram showing the sorting of a list:
- Original, length: n
-
5 9 4 3 7 8 2 1 6 10
-
- Two parts, length ⌈n/2⌉ and ⌊n/2⌋ respectively. All numbers are crossed out. Annotation: “recursively sort n/2 elements”
-
3 4 5 7 9 -
1 2 6 8 10
-
- Annotation: merge two parts.
-
1 2 3 4 5 6 7 8 9 10
-
- works with linked-list or array implementations
- in array implementations, uses extra space.
Mergesort
Code:
mergesort(A,lo,hi){
if(lo<hi){// there are >=2 items, so work to do
mid <- floor((lo+hi)/2)
mergesort(A,lo,mid)
mergesort(A,mid+1,hi)
merge(A,lo,mid,hi)
}
}
Merge for Merge sort
Code:
merge(A,lo,mid,hi){
l <- lo
r <- mid+1
n <- lo
while(l<=mid AND r<=hi){
if(A[l]<A[r]){
B[n] <- A[l]
l++;n++
} else {
B[n] <- A[r]
r++;n++
}
}
while(l<=mid){
B[n] <- A[l]
l++;n++
}
while(r<=hi){
B[n] <- A[r]
r++;n++
}
}// *
After *, the sorted sequence is in B[lo]…B[hi].
- Lazy solution:
- copy B[lo]..B[hi] to A.
- Fast solution:
- swap A,B, as in:
- temp <- A
- A <- B
- B <- temp
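The mergesort pseudocode, with the “lazy solution” copy-back, can be sketched in runnable Python (names and the slicing style are mine):

```python
def merge(a, lo, mid, hi):
    """Merge sorted runs a[lo..mid] and a[mid+1..hi] into sorted a[lo..hi]."""
    b = []
    l, r = lo, mid + 1
    while l <= mid and r <= hi:
        if a[l] < a[r]:
            b.append(a[l]); l += 1
        else:
            b.append(a[r]); r += 1
    b.extend(a[l:mid + 1])       # leftovers of the left run (if any)
    b.extend(a[r:hi + 1])        # leftovers of the right run (if any)
    a[lo:hi + 1] = b             # "lazy solution": copy buffer back into a

def merge_sort(a, lo=0, hi=None):
    """Sort a[lo..hi] in place, recursively."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:                  # >= 2 items, so work to do
        mid = (lo + hi) // 2
        merge_sort(a, lo, mid)
        merge_sort(a, mid + 1, hi)
        merge(a, lo, mid, hi)
    return a
```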
Time Complexity of MergeSort: via tree of recursive calls
(Diagram: the recursion tree, with n at the root, two children of size n/2, four nodes of size n/4, eight of size n/8, …, down to n leaves of size 1.)
Each level takes O(n), so total time is O(n log n).
- If n is a power of 2, the tree of recursive calls is a perfect binary tree with n leaves and height log2(n).
- At depth i there are 2^i calls to merge, each to merge two lists of size n/2^(i+1) into one of size n/2^i.
- Total work at depth i is 2^i · O(n/2^i) = O(n).
- Total work is O(n) per level over 1 + log2(n) levels = O(n log n).
Recursive Divide & Conquer Sorting
- Partition the sequence A into two parts A1, A2
- Recursively sort each of A1 and A2
- Combine the sorted versions of A1 and A2 to obtain a sorted version of A.
- The algorithms differ in how they choose the partition, and how they combine the sorted parts.
Quicksort
- Uses a pivot p to partition the sequence into “small” and “large” elements: small elements <= p < large elements
- Combining sorted versions is trivial
Diagram:
- list of length n
- now split into “partitions”; annotation: “choose a pivot p”
-
values <= p p values > p
-
- recursively sort two parts
- Now:
-
values <= p in order p values > p in order
-
- Choosing pivots is key to performance.
Quicksort
Code:
quicksort(A,lo,hi){
if(lo<hi){// there are >= 2 items
pivotposition <- partition(A,lo,hi)// partition
quicksort(A,lo,pivotposition-1)
quicksort(A,pivotposition+1,hi)
}
}
Quicksort is correct as long as every call to partition() returns and leaves the variables satisfying the following:
- lo <= pivotposition <= hi
- for every i,j with lo <= i <= pivotposition <= j <= hi: A[i] <= A[pivotposition] <= A[j]
However efficiency relies critically on choice of pivot.
Ex: Perfect Pivots
- Suppose all elements are distinct, and the pivot is chosen to be the median element in A[lo]…A[hi].
- Then, every call to Quicksort on a sequence of size k makes two recursive calls on sequences of size at most k/2: (Diagram shows partition labels):
-
(length ⌊(k-1)/2⌋) partition (length ⌈(k-1)/2⌉)
-
- By essentially the same argument as used for MergeSort, this gives us a running time of O(f(n) log n), where f(n) is the time to run partition on a sequence of size n.
- Assuming O(n) time for partition, this would give us O(n log n) time for Quicksort.
- But: finding medians is too slow in practice
- Optional exercise: Can the median be found in O(n) time?
Ex: Worst Case Path
- Suppose all elements are distinct, and the max is always chosen as pivot.
- Then every call to Quicksort on a sequence of k elements makes one recursive call on a sequence of size k-1, and one on a sequence of size 0: (diagram of length of array partitions):
-
k-1 max
-
- The recursion tree looks like this:
-
n
-
n-1
-
n-2
-
n-3
-
...
- 1
- 0
-
...
- 0
-
n-3
- 0
-
n-2
- 0
-
n-1
- This tree is of height n-1, giving us a running time of f(n) + f(n-1) + … + f(1), or Θ(n²) assuming Θ(n) for partition.
Quicksort takes Θ(n²) time in the worst case.
Partition
Partition must choose a pivot p and efficiently re-arrange elements
Code:
partition(A,lo,hi){
pivotindex <- choosePivot(A,lo,hi) // choose pivot
swap A[pivotindex] and A[hi] // move pivot out of the way
p <- A[hi] // p is the pivot
i <- lo // known "small" values will be at indices < i
for(j=lo;j<hi;j++){ // "already inspected" values will be at indices < j
if(A[j]<=p){ // if we are inspecting a "small"
swap A[i] and A[j] // swap it with first "non small"
i <- i+1 // increase size of "smalls" part
}
}
swap A[i] and A[hi] // move pivot where it belongs
return i // this is pivot position
}
Index range | Description |
lo … i-1 | known to be small |
i … j-1 | known to be large |
j … hi-1 | not inspected yet (j itself is currently being inspected) |
hi | pivot |
Partition Example
(Start of a diagram of how a quick sort happens over time.)
Legend:
The small/large modifier will be displayed after the value (- = known small, + = known large), so as not to be confused with negative and positive signs.
If a variable (e.g. i, j) is set to an index of the array, it will show up in parentheses after the value.
NOTE: Every table below has unseen elements to the left and right like so:
… | array values | … |
However, because all the action is happening inside the visible part of the array, the left and right ellipses will not be shown below.
- Original:
-
5 (lo) 3 4 (pivot) 8 2 9 1 7 (hi)
-
- swap pivot into hi
-
5 (i,j) 3 7 8 2 9 1 4
-
-
-
5+ (i) 3 (j) 7 8 2 9 1 4
-
-
-
3- 5+ (i) 7 (j) 8 2 9 1 4
-
-
-
3- 5+ (i) 7+ 8 (j) 2 9 1 4
-
-
-
3- 5+ (i) 7+ 8+ 2 (j) 9 1 4
-
-
-
3- 2- 7+ (i) 8+ 5+ 9 (j) 1 4
-
-
-
3- 2- 7+ (i) 8+ 5+ 9+ 1 (j) 4
-
-
-
3- 2- 1- 8+ (i) 5+ 9+ 7+ 4 (j)
-
- swap A[i] & A[hi]
(the small part is the first 3 elements, the pivot 4 comes next, and the large part is the last 4 elements)
-
3- 2- 1- 4 (i) 5+ 9+ 7+ 8+ (j)
-
(End of diagram)
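The trace above can be reproduced with a direct Python rendering of partition (a sketch; the pivot index is passed in explicitly so the example's choice of index 2 can be forced, and the names are mine):

```python
def partition(a, lo, hi, pivotindex):
    """Partition a[lo..hi] around a[pivotindex]; return the pivot's final index."""
    a[pivotindex], a[hi] = a[hi], a[pivotindex]   # move pivot out of the way
    p = a[hi]                                     # p is the pivot
    i = lo                                        # "small" values live at indices < i
    for j in range(lo, hi):                       # inspected values live at indices < j
        if a[j] <= p:                             # inspecting a "small" value
            a[i], a[j] = a[j], a[i]               # swap it into the "small" block
            i += 1
    a[i], a[hi] = a[hi], a[i]                     # move pivot where it belongs
    return i                                      # pivot position
```

Running it on the example array with pivot index 2 (value 4) yields the final arrangement from the trace: small part [3, 2, 1], pivot 4, large part [5, 9, 7, 8].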
Partition
- Time complexity of partition is Θ(n) + g(n), where g(n) is the time taken by choosePivot.
- Q: How can we choose “good” pivots “fast”?
- Quicksort is the most-used sorting algorithm in practice, so there must be a way.
- But, what qualifies as “fast”? Probably a very small constant.
- What qualifies as “good”? (given that it must be fast)
Consider
- A small number of bad pivots makes a small difference in height.
- (A diagram explaining the above. First it shows a perfect binary tree of height 5, next to a binary tree of height 6, where at depth 2 the “right-right (RR)” subtree is extended with its own perfect binary tree of height 4, whereas the “right-left (RL)” subtree contains nothing.)
- Perfect pivots are not needed for O(n log n) time.
- (Diagram showing the progressive shortening of the list into smaller and smaller pieces. There are no numbers, markings, or annotations.)
- If every pivot is better than (say) a 90/10 split, the depth is O(log n), so we still get O(n log n).
Some “simple” choosePivot options
- A[hi] - fast, but performs badly on many inputs.
- median - perfect pivots, but too slow to compute.
- random - If pivots are chosen uniformly at random, then Quicksort runs in O(n log n) time with high probability – i.e. almost always. But: good random numbers are not fast to generate.
median{A[lo],A[hi],A[(hi+lo)/2]}
- fast
- not very easy to come up with a “very bad” input.
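The median-of-three rule can be sketched in a few lines of Python (the tuple-sorting trick for recovering the median's index is my own choice):

```python
def choose_pivot(a, lo, hi):
    """Median-of-three: return the index of the median of a[lo], a[mid], a[hi]."""
    mid = (lo + hi) // 2
    # pair each candidate value with its index, sort the three, take the middle one
    trio = sorted([(a[lo], lo), (a[mid], mid), (a[hi], hi)])
    return trio[1][1]
```

Only three extra comparisons per partition call, so g(n) is a small constant.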
Complexity of Quicksort
- Depends critically on how pivots are chosen.
- Choosing “perfect” pivots is too slow for a practical sorting algorithm.
- Fortunately, choosing pivots that are “good enough” for most inputs can be done fast.
- Quicksort – with practical pivot choice strategies – is Θ(n²) in the worst case, but is often observed as “like Θ(n log n)” in practice.
In practice
- There are settings where Merge sort & Insertion sort are preferred.
- In most settings, the preferred algorithm is Quicksort.
- For small sets, SelectionSort is faster.
- Often, this variant (or similar) is faster:
Code:
quicksort(A,lo,hi){
if(lo<hi){// there are >=2 items
if(lo+15>hi){// at most 15 items
selectionSort(A,lo,hi)
}
else {
pivotposition <- partition(A,lo,hi) // partition
quicksort(A,lo,pivotposition-1)
quicksort(A,pivotposition+1,hi)
}
}
}
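The hybrid variant above can be sketched as a self-contained Python program (the cutoff value 15, median-of-three pivot choice from the earlier slide, and selection sort on small ranges follow the notes; function names are mine):

```python
CUTOFF = 15  # small ranges fall back to selection sort

def selection_sort_range(a, lo, hi):
    """Selection sort on the sub-range a[lo..hi], in place."""
    for i in range(lo, hi):
        m = min(range(i, hi + 1), key=a.__getitem__)
        a[i], a[m] = a[m], a[i]

def partition(a, lo, hi):
    """Partition a[lo..hi] around a median-of-three pivot; return pivot's index."""
    mid = (lo + hi) // 2
    pivotindex = sorted([(a[lo], lo), (a[mid], mid), (a[hi], hi)])[1][1]
    a[pivotindex], a[hi] = a[hi], a[pivotindex]   # move pivot out of the way
    p, i = a[hi], lo
    for j in range(lo, hi):
        if a[j] <= p:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]                     # move pivot where it belongs
    return i

def quicksort(a, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    if lo < hi:                                   # >= 2 items
        if hi - lo + 1 <= CUTOFF:                 # small range: fall back
            selection_sort_range(a, lo, hi)
        else:
            pp = partition(a, lo, hi)
            quicksort(a, lo, pp - 1)
            quicksort(a, pp + 1, hi)
    return a
```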