Problem 7-1 Hoare partition correctness
The version of PARTITION given in this chapter is not the original partitioning algorithm. Here is the original partition algorithm, which is due to C.A.R. Hoare:
@{
\proc{Hoare-Partition}(A, p, r)
\li x = A[p]
\li i = p - 1
\li j = r + 1
\li \While \textbf{true}
\li \;\;\;\;\textbf{repeat}\;j = j - 1
\li \;\;\;\;\textbf{until}\;A[j] \leq x
\li \;\;\;\;\textbf{repeat}\;i = i + 1
\li \;\;\;\;\textbf{until}\;A[i] \geq x
\li \;\;\;\;\If i < j
\li \;\;\;\;\;\;\;\;\textup{exchange}\;A[i]\;\textup{and}\;A[j]
\li \;\;\;\;\Else
\li \;\;\;\;\;\;\;\;\Return j
}@
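For readers who want to check the trace in part (a) below, here is a direct Python transcription of the pseudocode (a sketch of our own, not the book's code; 0-based inclusive indices, function name ours):

def hoare_partition(A, p, r):
    """Hoare's partition on the inclusive subarray A[p..r] (0-based).

    Returns an index j with p <= j < r such that every element of
    A[p..j] is <= every element of A[j+1..r].
    """
    x = A[p]                 # the pivot is the first element
    i = p - 1
    j = r + 1
    while True:
        j -= 1               # repeat j = j - 1 until A[j] <= x
        while A[j] > x:
            j -= 1
        i += 1               # repeat i = i + 1 until A[i] >= x
        while A[i] < x:
            i += 1
        if i < j:
            A[i], A[j] = A[j], A[i]
        else:
            return j

A = [13, 19, 9, 5, 12, 8, 7, 4, 11, 2, 6, 21]
print(hoare_partition(A, 0, len(A) - 1), A)   # prints 8, i.e. index 9 in the book's 1-based terms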
a. Demonstrate the operation of HOARE-PARTITION on the array A = {13, 19, 9, 5, 12, 8, 7, 4, 11, 2, 6, 21}, showing the values of the array and auxiliary values after each iteration of the while loop in lines 4-11.
The next three questions ask you to give a careful argument that the procedure HOARE-PARTITION is correct. Prove the following:
b. The indices i and j are such that we never access an element of A outside the subarray $A[p .. r]$.
c. When HOARE-PARTITION terminates, it returns a value j such that $p \leq j < r$.
d. Every element of A[p .. j] is less than or equal to every element of A[j+1 .. r] when HOARE-PARTITION terminates.
The PARTITION procedure in section 7.1 separates the pivot value (originally in A[r]) from the two partitions it forms. The HOARE-PARTITION procedure, on the other hand, always places the pivot value (originally in A[p]) into one of the two partitions $A[p .. j]$ and $A[j+1 .. r]$. Since $p \leq j < r$, this split is always nontrivial.
e. Rewrite the QUICKSORT procedure to use HOARE-PARTITION.
solution:
a. With p = 1, r = 12, and pivot x = A[1] = 13, the while loop runs for three iterations:
After iteration 1: j = 11, i = 1; since i < j, A[1] and A[11] are exchanged, giving A = {6, 19, 9, 5, 12, 8, 7, 4, 11, 2, 13, 21}.
After iteration 2: j = 10, i = 2; since i < j, A[2] and A[10] are exchanged, giving A = {6, 2, 9, 5, 12, 8, 7, 4, 11, 19, 13, 21}.
After iteration 3: j = 9, i = 10; now $i \geq j$, so the procedure returns j = 9. Every element of $A[1 .. 9]$ is at most 13 and every element of $A[10 .. 12]$ is at least 13.
b., c., d. On the first iteration the i loop stops immediately at $i = p$, because $A[p] = x \geq x$, and the j loop stops at some j with $p \leq j \leq r$, because $A[p] = x \leq x$ guarantees a stopping place. From then on the following invariant holds at the start of each iteration of the while loop: $p \leq i < j \leq r$, every element of $A[p .. i]$ is $\leq x$, and every element of $A[j .. r]$ is $\geq x$ (the exchange at the end of the previous iteration re-establishes the last two conditions). Within an iteration, the j loop cannot run below the old i, since $A[i] \leq x$ is a stopping place, and the i loop cannot run above the old j, since $A[j] \geq x$ is a stopping place; hence neither index ever leaves $A[p .. r]$, which proves (b). For (c): the returned value satisfies $j \geq p$ as just argued, and $j < r$ because either the procedure returns on the first iteration, which forces $j = i = p < r$, or j has already been decremented strictly below r on a later iteration. For (d): at termination $i \geq j$, every element of $A[p .. j]$ is $\leq x$, and every element of $A[j+1 .. r]$ is $\geq x$, so every element of the first partition is less than or equal to every element of the second.
e. Because HOARE-PARTITION does not separate the pivot value out, the recursive calls must cover all of $A[p .. q]$ and $A[q+1 .. r]$; since $p \leq q < r$, both calls are on strictly smaller subarrays and the recursion terminates.
@{
\proc{Quicksort}(A, p, r)
\li \If p < r
\li \;\;\;\;q = \proc{Hoare-Partition}(A, p, r)
\li \;\;\;\;\proc{Quicksort}(A, p, q)
\li \;\;\;\;\proc{Quicksort}(A, q + 1, r)
}@
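A matching Python sketch, building on the hoare_partition function above (again our own transcription, not the book's code):

def quicksort_hoare(A, p, r):
    """Sort the inclusive subarray A[p..r] in place.

    Note that the first recursive call uses q, not q - 1: the pivot
    value lies inside one of the two partitions, and p <= q < r
    guarantees that both recursive calls shrink strictly.
    """
    if p < r:
        q = hoare_partition(A, p, r)
        quicksort_hoare(A, p, q)
        quicksort_hoare(A, q + 1, r)

A = [13, 19, 9, 5, 12, 8, 7, 4, 11, 2, 6, 21]
quicksort_hoare(A, 0, len(A) - 1)
assert A == sorted(A)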
Problem 7-2 Alternative quicksort analysis
An alternative analysis of the running time of randomized quicksort focuses on the expected running time of each individual recursive call to QUICKSORT, rather than on the number of comparisons performed.
a. Argue that, given an array of size n, the probability that any particular element is chosen as the pivot is $1/n$. Use this to define indicator random variables $X_i$ = I{ith smallest element is chosen as the pivot}. What is $E[X_i]$?
b. Let $T(n)$ be a random variable denoting the running time of quicksort on an array of size n. Argue that
$$E[T(n)] = E\Big[\sum_{q = 1}^n X_q \big(T(q - 1) + T(n - q) + \Theta(n)\big)\Big]\quad (7.5).$$
c. Show that equation (7.5) can be rewritten as
$$E[T(n)] = \frac{2}{n}\sum_{q = 2}^{n-1}E[T(q)] + \Theta(n) \quad (7.6).$$
d. Show that
$$\sum_{k = 2}^{n-1} k \lg k \leq \frac{1}{2}n^2 \lg n - \frac{1}{8}n^2 \quad (7.7).$$
(Hint: Split the summation into two parts, one for $k = 2, 3, \ldots , \lceil n/2 \rceil - 1$ and one for $k = \lceil n/2\rceil , \ldots , n - 1$.)
e. Using the bound from equation (7.7), show that the recurrence in equation (7.6) has the solution $E[T(n)] = \Theta(n \lg n)$. (Hint: Show, by substituting, that $E[T(n)] \leq an \lg n$ for sufficiently large n and for some positive constant a.)
solution:
a. RANDOMIZED-QUICKSORT chooses the pivot uniformly at random from the n elements of the array, so any particular element is chosen with probability $1/n$. Hence $E[X_i] = \mbox{Pr}\{X_i = 1\} = 1/n$.
b. According to equation (7.1), we have $T(n) = T(q) + T(n - q - 1) + \Theta(n)$, where $0 \leq q \leq n - 1$ is the size of the left subarray. Substituting $q^\prime = q + 1$ (so $q^\prime$ is the rank of the pivot) gives $T(n) = T(q^\prime - 1) + T(n - q^\prime) + \Theta(n)$ with $1 \leq q^\prime \leq n$. Exactly one of the indicators $X_{q^\prime}$ equals 1, so $$T(n) = \sum_{q^\prime = 1}^{n}X_{q^\prime}(T(q^\prime - 1) + T(n - q^\prime) + \Theta(n)),$$ and taking expectations yields equation (7.5).
c. The choice of pivot rank is independent of the running times of the recursive calls it induces, so each $X_q$ is independent of $T(q - 1)$ and $T(n - q)$, and we may take expectations term by term:
$$E[T(n)] = \sum_{q = 1}^{n} E[X_q]\big(E[T(q - 1)] + E[T(n - q)] + \Theta(n)\big) = \frac{1}{n}\sum_{q = 1}^{n}\big(E[T(q - 1)] + E[T(n - q)]\big) + \Theta(n).$$
In this sum, each of $E[T(0)], E[T(1)], \ldots , E[T(n - 1)]$ appears exactly twice, once as $E[T(q - 1)]$ and once as $E[T(n - q)]$. Since $T(0)$ and $T(1)$ are $\Theta(1)$, the $q = 0$ and $q = 1$ terms can be absorbed into the $\Theta(n)$ term, and we obtain $$E[T(n)] = \frac{2}{n}\sum_{q = 2}^{n - 1} E[T(q)] + \Theta(n).$$
d. Split the summation at $\lceil n/2 \rceil$ and bound $\lg k$ on each part: $\lg k \leq \lg n - 1$ for $k \leq \lceil n/2 \rceil - 1$, and $\lg k < \lg n$ for $k \leq n - 1$. Then
$$\sum_{k = 2}^{n-1} k \lg k \leq (\lg n - 1)\sum_{k = 2}^{\lceil n/2 \rceil - 1} k + \lg n \sum_{k = \lceil n/2 \rceil}^{n-1} k = \lg n \sum_{k = 2}^{n-1} k - \sum_{k = 2}^{\lceil n/2 \rceil - 1} k.$$
Now $\sum_{k = 2}^{n-1} k \leq \frac{1}{2}n(n-1)$ and $\sum_{k = 2}^{\lceil n/2 \rceil - 1} k \geq \frac{1}{2}\cdot\frac{n}{2}\big(\frac{n}{2} - 1\big) - 1 = \frac{1}{8}n^2 - \frac{1}{4}n - 1$, so
$$\sum_{k = 2}^{n-1} k \lg k \leq \frac{1}{2}n^2 \lg n - \frac{1}{2}n \lg n - \frac{1}{8}n^2 + \frac{1}{4}n + 1 \leq \frac{1}{2}n^2 \lg n - \frac{1}{8}n^2,$$
where the last step uses $\frac{1}{2}n \lg n \geq \frac{1}{4}n + 1$ for $n \geq 3$ (for $n = 2$ the left-hand side of (7.7) is 0, so the inequality holds directly).
e. Assume inductively that $E[T(q)] \leq aq \lg q$ for $2 \leq q < n$ and substitute into equation (7.6), using equation (7.7):
$$E[T(n)] \leq \frac{2}{n}\sum_{q = 2}^{n-1} aq \lg q + \Theta(n) \leq \frac{2a}{n}\Big(\frac{1}{2}n^2 \lg n - \frac{1}{8}n^2\Big) + \Theta(n) = an \lg n - \frac{a}{4}n + \Theta(n).$$
Choosing the constant a large enough that the $\frac{a}{4}n$ term dominates the $\Theta(n)$ term gives $E[T(n)] \leq an \lg n$, so $E[T(n)] = O(n \lg n)$. By exercise 7.4-2, $E[T(n)] = \Omega(n \lg n)$ as well, so we conclude that $E[T(n)] = \Theta(n \lg n)$.
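As a quick numerical illustration (not part of the proof), one can iterate recurrence (7.6) directly, taking the $\Theta(n)$ term to be exactly n (an arbitrary choice of ours), and watch $E[T(n)]/(n \lg n)$ settle toward a constant, around $2 \ln 2 \approx 1.39$ for this choice:

import math

N = 4000
ET = [0.0] * (N + 1)        # ET[0] = ET[1] = 0
prefix = 0.0                # running value of sum(ET[2..n-1])
for n in range(2, N + 1):
    ET[n] = (2.0 / n) * prefix + n   # E[T(n)] = (2/n) * sum + n
    prefix += ET[n]

for n in (100, 1000, 4000):
    print(n, ET[n] / (n * math.log2(n)))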
Problem 7-3 Stooge sort
Professors Howard, Fine, and Howard have proposed the following "elegant" sorting algorithm:
@{
\proc{Stooge-Sort}(A, i, j)
\li \If A[i] > A[j]
\li \;\;\;\;\textup{exchange}\;A[i]\;\textup{and}\;A[j]
\li \If i + 1 \geq j
\li \;\;\;\;\Return
\li k = \lfloor(j - i + 1) / 3\rfloor
\li \proc{Stooge-Sort}(A, i, j - k)
\li \proc{Stooge-Sort}(A, i + k, j)
\li \proc{Stooge-Sort}(A, i, j - k)
}@
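A runnable Python transcription of STOOGE-SORT (0-based inclusive indices, function name ours), handy for checking small cases:

def stooge_sort(A, i, j):
    """Sort the inclusive subarray A[i..j] by stooge sort."""
    if A[i] > A[j]:
        A[i], A[j] = A[j], A[i]
    if i + 1 >= j:
        return
    k = (j - i + 1) // 3            # round down, as in the pseudocode
    stooge_sort(A, i, j - k)        # sort the first two-thirds
    stooge_sort(A, i + k, j)        # sort the last two-thirds
    stooge_sort(A, i, j - k)        # sort the first two-thirds again

A = [5, 2, 4, 6, 1, 3]
stooge_sort(A, 0, len(A) - 1)
assert A == [1, 2, 3, 4, 5, 6]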
a. Argue that, if $n = length[A]$, then STOOGE-SORT(A, 1, $length[A]$) correctly sorts the input array $A[1 .. n]$.
b. Give a recurrence for the worst-case running time of STOOGE-SORT and a tight asymptotic ($\Theta$-notation) bound on the worst-case running time.
c. Compare the worst-case running time of STOOGE-SORT with that of insertion sort, merge sort, heapsort, and quicksort. Do the professors deserve tenure?
solution:
a. We argue by induction on the size $m = j - i + 1$ of the subarray. If $m \leq 2$, the initial compare-and-exchange of $A[i]$ and $A[j]$ sorts $A[i .. j]$, and the procedure returns. If $m \geq 3$, then $k = \lfloor m/3 \rfloor \geq 1$. Each of the k largest elements of $A[i .. j]$ begins either in the last third $A[j-k+1 .. j]$ or in the first two-thirds $A[i .. j-k]$; in the latter case the first recursive call, which sorts $A[i .. j-k]$ by induction, moves it into the top k positions $A[j-2k+1 .. j-k]$. Since $m \geq 3k$, both of these regions lie inside the last two-thirds $A[i+k .. j]$, so the second recursive call places the k largest elements, in sorted order, into their final positions $A[j-k+1 .. j]$. The third recursive call then sorts the remaining $A[i .. j-k]$, completing the sort.
b. Each call does constant work besides three recursive calls on subarrays of size $n - k \approx 2n/3$, giving the recurrence:
$$T(n) = 3T(\frac{2n}{3}) + \Theta(1).$$
By case 1 of the master theorem, we get
$$T(n) = \Theta(n^{\log_{\frac{3}{2}}3}) \approx \Theta(n^{2.71}).$$
c. Its worst-case running time $\Theta(n^{2.71})$ is worse even than insertion sort's $\Theta(n^2)$, and far worse than merge sort's and heapsort's $\Theta(n \lg n)$ and quicksort's $\Theta(n^2)$ worst case ($\Theta(n \lg n)$ on average). The professors do not deserve tenure.
Problem 7-4 Stack depth for quicksort
The QUICKSORT algorithm of Section 7.1 contains two recursive calls to itself. After the call to PARTITION, the left subarray is recursively sorted and then the right subarray is recursively sorted. The second recursive call in QUICKSORT is not really necessary; it can be avoided by using an iterative control structure. This technique, called tail recursion, is provided automatically by good compilers. Consider the following version of quicksort, which simulates tail recursion.
@{
\proc{Quicksort}^{'}(A, p, r)
\li \While p < r
\li \;\;\;\;\textup{// partition and sort left subarray}
\li \;\;\;\;q = \proc{Partition}(A, p, r)
\li \;\;\;\;\proc{Quicksort}^{'}(A, p, q - 1)
\li \;\;\;\;p = q + 1
}@
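In Python, QUICKSORT' looks as follows; the partition function is a sketch of the Section 7.1 PARTITION (pivot A[r]) so that the example is self-contained:

def partition(A, p, r):
    """PARTITION in the style of Section 7.1: the pivot A[r] is
    placed at its final index, which is returned."""
    x = A[r]
    i = p - 1
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1

def quicksort_tail(A, p, r):
    """QUICKSORT': the second recursive call is replaced by the
    while loop, which re-enters with p = q + 1 to handle the
    right subarray."""
    while p < r:
        q = partition(A, p, r)       # partition and sort left subarray
        quicksort_tail(A, p, q - 1)
        p = q + 1

A = [13, 19, 9, 5, 12, 8, 7, 4, 11, 2, 6, 21]
quicksort_tail(A, 0, len(A) - 1)
assert A == sorted(A)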
a. Argue that QUICKSORT'(A, 1, $length[A]$) correctly sorts the array A.
Compilers usually execute recursive procedures by using a stack that contains pertinent information, including the parameter values, for each recursive call. The information for the most recent call is at the top of the stack, and the information for the initial call is at the bottom. When a procedure is invoked, its information is pushed onto the stack; when it terminates, its information is popped. Since we assume that array parameters are represented by pointers, the information for each procedure call on the stack requires $O(1)$ stack space. The stack depth is the maximum amount of stack space used at any time during a computation.
b. Describe a scenario in which the stack depth of QUICKSORT' is $\Theta(n)$ on an n-element input array.
c. Modify the code for QUICKSORT' so that the worst-case stack depth is $\Theta(\lg n)$. Maintain the $O(n \lg n)$ expected running time of the algorithm.
solution:
a. QUICKSORT' performs exactly the same partitioning steps as QUICKSORT. In each iteration of the while loop it partitions $A[p .. r]$ and sorts the left subarray $A[p .. q-1]$ recursively; setting $p = q + 1$ then makes the next loop iteration sort the right subarray $A[q+1 .. r]$, which is precisely what the second recursive call of QUICKSORT does. Correctness therefore follows from the correctness of QUICKSORT.
b. When the input array is already sorted, PARTITION always returns $q = r$, so each call recurses on a subarray only one element smaller:
QUICKSORT'(A, p, r)
QUICKSORT'(A, p, r-1)
$\vdots$
QUICKSORT'(A, p, p+1)
In this scenario the stack depth is $\Theta(n)$.
c. The problem demonstrated by the scenario above is that each invocation of QUICKSORT' calls QUICKSORT' again on almost the same range. We avoid it by recursing on the smaller of the two subarrays returned by PARTITION and iterating on the larger one. Since the smaller subarray contains at most half of the elements, each stacked call at least halves the subarray size, so the worst-case stack depth is $\Theta(\lg n)$. The same partitions are performed as before, so the $O(n \lg n)$ expected running time is unchanged.
@{
\proc{Quicksort}^{''}(A, p, r)
\li \While p < r
\li \;\;\;\;q = \proc{Partition}(A, p, r)
\li \;\;\;\;\If q - p < r - q
\li \;\;\;\;\;\;\;\;\proc{Quicksort}^{''}(A, p, q - 1)
\li \;\;\;\;\;\;\;\;p = q + 1
\li \;\;\;\;\Else
\li \;\;\;\;\;\;\;\;\proc{Quicksort}^{''}(A, q + 1, r)
\li \;\;\;\;\;\;\;\;r = q - 1
}@
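The same idea in Python (reusing the partition sketch above): each stacked recursive call handles at most half of the current subarray, so at most about $\lg n$ frames are ever live at once.

def quicksort_logdepth(A, p, r):
    """QUICKSORT'': recurse on the smaller side, iterate on the
    larger side, keeping the stack depth O(lg n)."""
    while p < r:
        q = partition(A, p, r)
        if q - p < r - q:            # left subarray is the smaller one
            quicksort_logdepth(A, p, q - 1)
            p = q + 1                # continue with the right subarray
        else:                        # right subarray is the smaller one
            quicksort_logdepth(A, q + 1, r)
            r = q - 1                # continue with the left subarray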
Problem 7-5 Median-of-3 partition
One way to improve the RANDOMIZED-QUICKSORT procedure is to partition around a pivot that is chosen more carefully than by picking a random element from the subarray. One common approach is the median-of-3 method: choose the pivot as the median (middle element) of a set of 3 elements randomly selected from the subarray. (See exercise 7.4-6.) For this problem, let us assume that the elements in the input array $A[1 .. n]$ are distinct and that $n \geq 3$. We denote the sorted output array by $A^{\prime}[1 .. n]$. Using the median-of-3 method to choose the pivot element $x$, define $p_i = \mbox{Pr}\{x = A^{\prime}[i]\}$.
a. Give an exact formula for $p_i$ as a function of n and i for $i = 2, 3, \ldots , n - 1$. (Note that $p_1 = p_n = 0$.)
b. By what amount have we increased the likelihood of choosing the pivot as $x = A^{\prime}[\lfloor (n+1)/2 \rfloor]$, the median of $A[1 .. n]$, compared to the ordinary implementation? Assume that $n \rightarrow \infty$, and give the limiting ratio of these probabilities.
c. If we define a "good" split to mean choosing the pivot as $x = A^{\prime}[i]$, where $n/3 \leq i \leq 2n/3$, by what amount have we increased the likelihood of getting a good split compared to the ordinary implementation? (Hint: Approximate the sum by an integral.)
d. Argue that in the $\Omega(n \lg n)$ running time of quicksort, the median-of-3 method affects only the constant factor.
solution:
a. The pivot is the ith smallest element exactly when, of the 3 randomly selected elements, one is the ith smallest, one is among the $i - 1$ smaller elements, and one is among the $n - i$ larger elements. Hence $p_i = \frac{(i-1)(n-i)}{\binom{n}{3}}$.
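A brute-force check of this formula for a small n, assuming (as the count $\binom{n}{3}$ does) that the 3 elements are drawn without replacement:

from itertools import combinations

def p_formula(n, i):
    return (i - 1) * (n - i) / (n * (n - 1) * (n - 2) / 6)

n = 10
total = n * (n - 1) * (n - 2) / 6           # number of 3-element subsets
for i in range(2, n):
    count = sum(1 for s in combinations(range(1, n + 1), 3)
                if sorted(s)[1] == i)       # i is the median of the 3
    assert abs(count / total - p_formula(n, i)) < 1e-12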
b. The median of $A[1 .. n]$ is $A^{\prime}[\lfloor (n + 1)/2 \rfloor]$. If n is odd, then $\lfloor (n + 1)/2 \rfloor = (n + 1)/2$, and
$$p_{(n+1)/2} = \frac{(\frac{n-1}{2})^2}{\frac{n(n-1)(n-2)}{6}} = \frac{3(n-1)}{2n(n - 2)},$$
so the ratio to the ordinary implementation's probability $1/n$ is
$$\frac{\frac{3(n-1)}{2n(n-2)}}{\frac{1}{n}} = \frac{3(n-1)}{2(n-2)}.$$
If n is even, then $\lfloor (n + 1)/2 \rfloor = n/2$, and
$$p_{n/2} = \frac{(\frac{n}{2} - 1)(n - \frac{n}{2})}{\frac{n(n-1)(n-2)}{6}} = \frac{3}{2(n - 1)},$$
so the ratio is
$$\frac{\frac{3}{2(n-1)}}{\frac{1}{n}} = \frac{3n}{2(n-1)}.$$
In both cases the ratio approaches $3/2$ as $n \rightarrow \infty$: the median-of-3 method is about 1.5 times as likely to choose the median as the pivot.
c. With the ordinary implementation, each of the roughly $n/3$ ranks between $n/3$ and $2n/3$ is chosen with probability $1/n$, so the probability of a good split is about $1/3$. With the median-of-3 implementation, the probability of a good split is
$$\sum_{i = \lceil n/3 \rceil}^{\lfloor 2n/3 \rfloor} p_i = 1 - 2\sum_{i = 2}^{\lceil n/3 \rceil - 1}\frac{(i-1)(n-i)}{\binom{n}{3}},$$
by the symmetry $p_i = p_{n + 1 - i}$. Approximate the remaining summation by an integral with $t = i - 1$:
$$\int_{1}^{n/3 - 1} t(n - t - 1)\, dt = \Big[\frac{1}{2}nt^2 - \frac{1}{3}t^3 - \frac{1}{2}t^2\Big]_{1}^{n/3 - 1} = \frac{7}{162}n^3 - \frac{5}{18}n^2 + \frac{2}{3}.$$
Hence the probability of a good split is approximately
$$1 - \frac{12}{n(n-1)(n-2)}\Big(\frac{7}{162}n^3 - \frac{5}{18}n^2 + \frac{2}{3}\Big) \longrightarrow 1 - \frac{14}{27} = \frac{13}{27}$$
as $n \rightarrow \infty$. So the likelihood of a good split increases from $1/3 = 9/27$ to about $13/27$, a factor of roughly $13/9 \approx 1.44$.
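Evaluating the exact sum $\sum_{i=\lceil n/3 \rceil}^{\lfloor 2n/3 \rfloor} p_i$ for growing n illustrates the limit (our own check, not part of the solution):

import math

def good_split_prob(n):
    lo, hi = math.ceil(n / 3), math.floor(2 * n / 3)
    c = n * (n - 1) * (n - 2) / 6          # binomial(n, 3)
    return sum((i - 1) * (n - i) for i in range(lo, hi + 1)) / c

for n in (30, 300, 30000):
    print(n, good_split_prob(n))           # approaches 13/27 ~ 0.4815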
d. Even if we always picked the exact median as the pivot in every recursive call, the running time would still be $\Omega(n \lg n)$: the recursion tree would have $\Theta(\lg n)$ levels, each doing $\Theta(n)$ partitioning work. The median-of-3 method therefore cannot change the asymptotic running time; it affects only the constant factor.
Problem 7-6 Fuzzy sorting of intervals
Consider a sorting problem in which the numbers are not known exactly. Instead, for each number, we know an interval on the real line to which it belongs. That is, we are given n closed intervals of the form $[a_i, b_i]$, where $a_i \leq b_i$. The goal is to fuzzy-sort these intervals, i.e., produce a permutation $\langle i_1, i_2, \ldots , i_n \rangle$ of the intervals such that there exist $c_j \in [a_{i_j}, b_{i_j}]$, satisfying $c_1 \leq c_2 \leq \cdots \leq c_n$.
a. Design an algorithm of fuzzy-sorting n intervals. Your algorithm should have the general structure of an algorithm that quicksorts the left endpoints (the $a_i$s), but it should take advantage of overlapping intervals to improve the running time. (As the intervals overlap more and more, the problem of fuzzy-sorting the intervals gets easier and easier. Your algorithm should take advantage of such overlapping, to the extent that it exists.)
b. Argue that your algorithm runs in expected time $\Theta(n \lg n)$ in general, but runs in expected time $\Theta(n)$ when all of the intervals overlap (i.e., when there exists a value x such that $x \in [a_i, b_i]$ for all i.) Your algorithm should not be checking for this case explicitly; rather, its performance should naturally improve as the amount of overlap increases.
solution:
a. & b. Note first that overlapping intervals do not need to be ordered with respect to each other: for two overlapping intervals $I_i$ and $I_j$ we can choose the representatives $c_i$ and $c_j$ from their intersection, making $c_i \leq c_j$ or $c_j \leq c_i$ as required, so the two intervals may appear in either order. In particular, if all n intervals share a common point, every one of the $n!$ permutations is a valid fuzzy sorting. This is why the overall running time can improve as more and more intervals overlap.
Suppose that we have n closed intervals denoted by $I = \{[a_1, b_1], [a_2, b_2], \ldots, [a_n, b_n]\}$. We solve the fuzzy-sorting problem as follows:
We adopt a strategy similar to quicksort. We pick an interval as the pivot and partition the intervals into three groups: a middle group of intervals that intersect the pivot interval, a left group of intervals that lie entirely below the pivot interval, and a right group of intervals that lie entirely above it. Only the left and right groups need to be sorted recursively.
The subroutine FUZZY-PARTITION works in three phases. The first phase partitions by left endpoint, exactly as PARTITION does, so that afterwards $a_k \leq a_{i+1}$ for $k = p, \ldots , i$ and $a_k > a_{i+1}$ for $k = i+2, \ldots , r$, where $I[i+1]$ is the pivot interval. The second phase sweeps $I[p .. i]$ and separates out the intervals that are smaller than the pivot interval but overlap it, so that $I[p], \ldots , I[q]$ overlap the pivot interval while $I[q+1], \ldots , I[i]$ are smaller and disjoint from it; it then swaps the overlapping intervals next to the pivot ($I[p]$ with $I[i]$, $I[p+1]$ with $I[i-1]$, and so on). The third phase separates the right group symmetrically. In the end the left group is $I[p, \ldots , q_l - 1]$, the right group is $I[q_r + 1, \ldots , r]$, and the middle group $I[q_l .. q_r]$ consists of the pivot interval and everything overlapping it.
Each call to the subroutine FUZZY-PARTITION clearly takes $\Theta(n)$ time, so, as for randomized quicksort, the total expected running time of FUZZY-SORT is $\Theta(n \lg n)$. As more and more intervals overlap, the left and right groups shrink while the middle group, which needs no further work, grows, so the running time improves. If all intervals overlap, the very first partition places everything in the middle group and the running time is $\Theta(n)$.
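Since the FUZZY-PARTITION pseudocode itself is not reproduced above, here is a compact Python sketch of the same scheme (one possible realization, with names of our own and a slight simplification: the pivot interval is shrunk to the running intersection of the intervals that overlap it, and the partition is then made around the single point a, which every middle-group interval contains):

import random

def fuzzy_sort(intervals):
    """Fuzzy-sort a list of closed intervals (lo, hi) with lo <= hi.

    Returns a permutation of the intervals for which representatives
    c_1 <= c_2 <= ... <= c_n exist (c = a works for the middle group).
    """
    if len(intervals) <= 1:
        return list(intervals)
    # Pick a random pivot interval and shrink it to the running
    # intersection of the intervals that overlap it.
    a, b = intervals[random.randrange(len(intervals))]
    for lo, hi in intervals:
        if lo <= b and a <= hi:              # overlaps current region
            a, b = max(a, lo), min(b, hi)
    # Three-way partition around the point a.  The middle group's
    # intervals all contain a, so any order among them is valid.
    left = [iv for iv in intervals if iv[1] < a]      # entirely below a
    middle = [iv for iv in intervals if iv[0] <= a <= iv[1]]
    right = [iv for iv in intervals if iv[0] > a]     # entirely above a
    return fuzzy_sort(left) + middle + fuzzy_sort(right)

print(fuzzy_sort([(4, 9), (1, 5), (7, 8), (0, 2)]))

When all intervals share a common point, the shrinking loop ends with a point contained in every interval, so everything lands in the middle group and the call finishes in $\Theta(n)$ time with no recursion, matching part (b).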