Priority Queues
- Defination: A data structure implementing a set S of elements, each associated with a key.
- Every element has its own key. And the key represents the priority of that element. Keys are not necessarily unique.
- It supports the following operations:
insert(S,x)
: insert element x into set Smax(S)
: return element of S with largest keyextract_max(S)
: return element of S with largest key and remove it from Sincrease_key(S,x,k)
: increase the value of element x’ s key to new value k (assumed to be as large as current value)
Heap
- Heap is an implementation of a priority queue
- is also an array visualized as a nearly complete binary tree
- Heaps can be convert to max heap or min heap when sorting them.
- Max heap: the key of a node is ≥ than the keys of its children. (analogously as min heap)
- Heap as a tree:
- Root of the tree: the first element (i=1) in the array
- parent(i)=i/2: returns index of node’s parent
- left(i)=2i: returns index of node’s left child
- right(i)=2i+1: returns index of node’s right child
- The height of a binary heap is $O(lg n)$
- Heap operations:
build_max_heap
: produce a max-heap from an unordered arraymax_heapify
: correct a single violation of the heap property in a subtree at its rootinsert
,extract_max
,heapsort
Max_heapify:
- the assumption is that the trees rooted at left(i) and right(i) are max-heaps.
- If element A[i] violates the max-heap property, correct violation by “trickling” element A[i] down the tree, making the subtree rooted at index i a max-heap.
- Pseudocode:
1
2
3
4
5
6
7
8
9l = left(i)
r = right(i)
if (l <= heap-size(A) and A[l] > A[i])
then largest = l else largest = i
if (r <= heap-size(A) and A[r] > A[largest])
then largest = r
if largest != i
then exchange A[i] and A[largest]
Max_Heapify(A, largest)
Build_Max_Heap:
- Converts A[1…n] to a max heap
Build_Max_Heap(A):
1
2for i=n/2 downto 1
do Max_Heapify(A, i)The reason to start at n/2 is that element A[n/2+1 … n] are all leaves. There is no need to build max_heap for leaves which has only one element (its own).
- Time costs: $O(nlog n)$ via simple analysis
- Observe however that Max_Heapify takes $O(1)$ for time for nodes that are one level above the leaves(叶子层的上一层,即倒数第二层),$O(l)$ for the nodes that are $l$ levels above the leaves. There are n/4 nodes in level 1(即 倒数第二层), n/8 in level 2 (再往上的的一层) and so on till in $lg n$ level remains one root node.
- Therefore, the total amount of work in the for loop is (括号里面的是那一层的Node个数):
$$\frac n4(1c) + \frac n8 (2c) + \frac n{16} (3c) + … + 1(lgn c)$$ - set $\frac n4=2^k$简化上式可得:$c 2^k (\frac1{2^0}+\frac2{2^1}+\frac3{2^2}+ … + \frac{(k+1)}{2^k})$, 括号里面上界是个常数
- Therefore, Build_max_heap is $O(n)$
Heap sort
- Sort strategy:
- Build Max Heap from unordered array;
- Find maximum element A[1];
- Swap elements A[n] and A[1]: now max element is at the end of the array!
- Discard node n from heap (by decrementing heap-size variable)
- New root may violate max heap property, but its children are max heaps. Run max_heapify to fix this.
- Go to Step 2 unless heap is empty
- Running time:
- after n iterations the Heap is empty
- every iteration involves a swap and a max_heapify operation; hence it takes O(log n) time
- Overall $O(nlog n)$
- A nice demo here: https://www.geeksforgeeks.org/heap-sort/