QuickSort

Introduction

The quick sort is also a divide-and-conquer algorithm just like the merge sort:

Divide:
- Pick some element called pivot.
- Move it to its final location in the sorted sequence s.t. all smaller elements are to its left, larger ones are to its right.
Conquer:
- Recursively sort subarrays for smaller and larger elements.
Combine:
- No work needed here since after the recursion the array is sorted.

QuickSort: The Algorithm

Initially we will cal QuickSort(A,1,A.length).

The differences to merge sort are:

We split the array at , the position of the pivot in sorted array, and we don’t know in advance, instead it is revealed by Partition.
There is no combine step at the end.
Partition plays a similar role to Merge.

`Partition(A,p,r)`

The Partition function rearranges the subarray in place using swaps, and it takes the last element as the pivot element.

The main ideas are:

Scan the subarray from left to right.
Build up a subarray of elements smaller or equal to the pivot.
Build up a subarray of elements larger than the pivot.
When reaching the end of the array, put the pivot in the right place.

The pseudocode for Partition goes as follows:

An example run of the Partition algorithm:

Correctness

We prove the correctness of Partition by using the loop invariant:

Loop invariant: At the beginning of the -th iteration, we have (let denotes the pivot) and .
Initialization: Since at the initialization and , the invariant trivially holds.
Maintanence:
- If the if on line 4 is false, then is greater than , and the algorithm will increate the to so that subarry grows one element to contain .
- If the if on line 4 is true, then we swap with and then increase both and by 1.
- This maintains the invariant.
Termination: After the last swap in line 7, and Partition returns the position of .

Runtime

The runtime is straightforward to find: where .

Worst-case and Best-case Partitionings

The overall runtime depends on how the array is partitioned as that determines the sizes and of the subarray to be sorted recursively.

Worst-case Partitioning

The worst case is attained when Partition always produces one subproblem with and one with elements. This case can appear, e.g., when the array is sorted. This leads to the following recurrence: Solving this gives

Best-case Partitioning

The best case is where we split the problem into two subproblems of sizes and . By ignoring floors, ceilings, and we get the recurrence: From the analysis of merge sort, the solution will be

Towards an Average Case

To reach the average case, we assume that all elements of the array are distinct, and each split was equally likely. This situation occurs when the input is chosen uniformly at random amongst all possible orderings. Then This is an average over all problem sizes for 2 subproblems plus . The solution of this recurrence is To prove this we use the substitution method:

Proof. We use induction on to prove

Base case: . And we need to prove that :

This trivially holds. (e.g., for )
Inductive case: Assume fo every we have and prove for that The highlighted part is based on fact that when a summation has the form , where is a monotonically increasing function, then we can approximate it by integrals: So we have Q.E.D.

Improvements to QuickSort

Quick sort is often fast in practice because of the small constants in the asymptotic runtime. There are some improvements (not comprehensive) we can apply on the quick sort:

Improvements for handling equal values:
- Partition into smaller, equal, and larger elements.
- So we only need to sort smaller and larger subarrays.
Choose the pivot as median of 3 elements.
- Slightly faster in practice, but still has a quadratic worst-case runtime.
Dual-Pivot Quick Sort by Vladimir Yaroslavskiy.
- Use two pivots instead of one and partition array into 3 areas.

A Randomized Version of QuickSort

Choosing the right pivot element can be tricky since we have no idea a priori which element can be a good pivot. A (surprisingly good) solution is to leave it to chance:

Data Structures and Algorithms (H) Lecture 5 Notes