Need to digest again!
First lists some claims:
- searching among n preprocessed items requires $Ω(lg n)$ time: binary search, AVL tree search optimal
- sorting n items requires $Ω(nlg n)$: mergesort, heap sort, AVL sort optimal
Comparison model
- All the input items are black boxes (ADTs)
- Only support comparisons (>,<, etc.)
- Time cost = number of comparisons
Decision tree
Any comparison algorithm can be viewed/specified as a tree of all possible comparison outcomes & resulting output.
- internal node = binary decision
- leaf = output (algorithm is done)
- root-to-leaf path = algorithm execution
- path length (depth) = running time
- height of tree = worst-case running time
In fact, binary decision tree model is more powerful than comparison model, and lower bounds extend to it
A sorting algorithm is comparison based if it uses comparison operators to find the order between two numbers. Comparison sorts can be viewed abstractly in terms of decision trees. A decision tree is a full binary tree that represents the comparisons between elements that are performed by a particular sorting algorithm operating on an input of a given size. The execution of the sorting algorithm corresponds to tracing a path from the root of the decision tree to a leaf. At each internal node, a comparison ai <= aj is made. The left subtree then dictates subsequent comparisons for ai <= aj, and the right subtree dictates subsequent comparisons for ai > aj. When we come to a leaf, the sorting algorithm has established the ordering.
Lower Bounds
Search Lower Bound
- number of leaves >= number of possible answers >= n
- decision tree is binary
- height >= $lg Θ(n) = lg n ± Θ(1)$
Sorting Lower Bound
- leaf specifies answer as permutation: A[3] ≤ A[1] ≤ A[9] ≤ . . .
- all n! are possible answers
- number of leaves >= n!
$$ \begin{align*}
height & ≥ lg n! \\
& = lg(1·2 /cdots (n − 1)·n) \\
& = lg 1 + lg 2 + /cdots + lg(n − 1) + lg n \\
& = \sum^n_{i=1} lg i \\
& \ge \sum^n_{i=n/2} lg i \\
& \ge \sum^n_{i=n/2} lg \frac n2 \\
& = \frac n2 lg n - \frac n2 \\
& = \Omega (n lg n)
\end{align*}$$
Therefore, the lower bound for Comparison based sorting algorithm (Merge Sort, Heap Sort, Quick-Sort .. etc) is $Ω(nLogn)$, i.e., they cannot do better than $n logn$.
Linear-time Sorting
Radix sort
Counting sort is a linear time sorting algorithm that sort in $O(n+k)$ time when elements are in range from 1 to k. But if the elements are in range from 1 to $n^2$, counting sort will take $O(n^2)$.
How to sort such an array in linear time? Use Radix sort.
The idea of Radix Sort is to do digit by digit sort starting from least significant digit to most significant digit. Radix sort uses counting sort as a subroutine to sort.
Reference:
https://www.geeksforgeeks.org/lower-bound-on-comparison-based-sorting-algorithms/
https://www.geeksforgeeks.org/radix-sort/