Data Structures and Algorithms (H) Lecture 4 Notes
HeapSort
Introduction
The idea behind HeapSort is to find the largest element and move it to the end of the array. Then we repeat this process with remaining elements until the array is sorted. This idea is similar to the SelectionSort, but SelectionSort compares lots of elements to find the largest, while the HeapSort store knowledge gained from these comparisons for future use. This make the HeapSort very efficient on both runtime and storage, and it also sorts arrays in place. More specifically, we use a clever data structure, a heap to achieve this.
Heap
A heap is essentially an array imagined as being a
binary tree. The elements are arranged row by row from left to
right. For example, the array
We
navigate through the array/imaginary tree using these operations:
Procedures Needed for the Algorithm
Build-Max-Heap: produces a max-heap from an unordered array.Max-Heapify: maintains the max-heap property once the maximum (the root element) has been removed.HeapSort: sorts an array in place.
We also need to introduce a new variable A.heap-size
which indicates how many elements of A.heap-size by 1 effectively removes the last element from
the heap. There are analogous operations for min-heaps.
Max-Heapify(A,i)
First we assumes subtrees Max-Heapify the subtree at 
Runtime
To figure out the runtime of Max-Heapify, we need to
first define the height of the node in a tree.
Definition: Height of Tree Nodes
The height of a tree node is the longest number of simple downward edges from the node to a leaf.
We know that Max-Heapfity takes constant time Max-Heapify on a node of height
Correctness
We prove the correctness by induction on the height: - Base
case: height is 0, which means we are on a leaf, this is trivial. -
Inductive case: assume the algorithm is work for height
Build-Max-Heap(A,n)
The idea is to use Max-Heapify repeatedly to create a
heap. Since Max-Heapify assumes
Tip
Nodes in
A example run of this algorithm on the following
initial array: 
Correctness
We prove the correctness of Build-Max-Heap by using loop
invariant. - Loop invariant: At the start of each iteration
Max-Heapify(A,i) turns the subtree at
Runtime
Since all nodes have height at most Max-Heapify takes time Build-Max-Heap calls Max-Heapify
Refined Runtime Analysis
The key observation is that most nodes in the tree have small height.
We can show that there are at most
A better bound for the Build-Max-Heap can then be
calculated as
HeapSort
With all the procedures introduces above, we can now describe the
idea of the HeapSort as follows: - Build a max-heap, s.t.
the root contains largest element. - Swap the root with the last element
of the heap/array. - Discard the last element from the heap by reducing
the size of the heap by 1. - Call Max-Heapify(A,1) to
restore heap property at the root. The pseudocode goes as follows:
The runtime will then be
Correctness of HeapSort
- Loop invariant: At the start of each iteration of the for
loop of line 2-5, the subarray
is a max-heap containing the smallest elements of , and the subarray contains the largest elements of and is sorted. - Initialization: The subarray
is empty, the invariant trivially holds. - Maintenance:
is the largest element in and it is smaller than the elements in . When we put it in the -th position, then contains the largest elements, sorted. Decreasing the heap size and calling Max-Heapifyturnsinto a max-heap. Decrementing sets up the invariant for the next iteration. - Termination: After the loop
. This means that is sorted and is the smallest element in the array, which makes the array sorted.
- Title: Data Structures and Algorithms (H) Lecture 4 Notes
- Author: Zhige Chen
- Created at : 2025-10-01 14:47:31
- Updated at : 2025-10-10 13:16:23
- Link: https://nofe1248.github.io/2025/10/01/dsaa-lec-4/
- License: This work is licensed under CC BY-NC-SA 4.0.