In this problem, we prove an $\Omega(n\lg n)$ lower bound on the expected running time of any deterministic or randomized comparison sort on $n$ distinct input elements. We begin by examining a deterministic comparison sort $A$ with decision tree $T_A$. We assume that every permutation of $A$'s inputs is equally likely.
a. Suppose that each leaf of $T_A$ is labeled with the probability that it is reached given a random input. Prove that exactly $n!$ leaves are labeled $1/n!$ and that the rest are labeled 0.
b. Let $D(T)$ denote the external path length of a decision tree $T$; that is, $D(T)$ is the sum of the depths of all the leaves of $T$. Let $T$ be a decision tree with $k > 1$ leaves, and let $LT$ and $RT$ be the left and right subtrees of $T$. Show that $D(T) = D(LT) + D(RT) + k$.
c. Let $d(k)$ be the minimum value of $D(T)$ over all decision trees $T$ with $k > 1$ leaves. Show that $d(k) = \min_{1 \leq i \leq k-1}\{d(i) + d(k-i) + k\}$. (Hint: Consider a decision tree $T$ with $k$ leaves that achieves the minimum. Let $i_0$ be the number of leaves in $LT$ and $k - i_0$ the number of leaves in $RT$.)
d. Prove that for a given value of $k > 1$ and $i$ in the range $1 \leq i \leq k-1$, the function $i\lg i + (k-i)\lg(k-i)$ is minimized at $i = k/2$. Conclude that $d(k) = \Omega(k\lg k)$.
e. Prove that $D(T_A) = \Omega(n!\lg(n!))$, and conclude that the expected time to sort $n$ elements is $\Omega(n\lg n)$.
Now consider a randomized comparison sort $B$. We can extend the decision-tree model to handle randomization by incorporating two kinds of nodes: ordinary comparison nodes and "randomization" nodes. A randomization node models a random choice of the form RANDOM(1, $r$) made by algorithm $B$; the node has $r$ children, each of which is equally likely to be chosen during an execution of the algorithm.
f. Show that for any randomized comparison sort $B$, there exists a deterministic comparison sort $A$ that makes no more comparisons on the average than $B$ does.
solution:
a. Since all elements are distinct, there are $n!$ permutations of the $n$ inputs, each occurring with probability $1/n!$. A correct deterministic sort must send each permutation to a distinct leaf, so exactly $n!$ leaves are reached, each with probability $1/n!$. Since these probabilities already sum to $n! \times 1/n! = 1$, every remaining leaf is labeled 0.
b. We define depth($l$, $T$) to be the depth of leaf $l$ in tree $T$. Every leaf of $T$ lies in $LT$ or $RT$, and passing through the root adds one to its depth: if $l$ belongs to $LT$, then depth($l$, $T$) = depth($l$, $LT$) + 1, and symmetrically for $RT$. Summing over all $k$ leaves,
$$D(T) = \sum_{l \in LT}(\mathrm{depth}(l, LT) + 1) + \sum_{l \in RT}(\mathrm{depth}(l, RT) + 1) = D(LT) + D(RT) + k.$$
c. Consider a decision tree $T$ with $k$ leaves that achieves the minimum, so $d(k) = D(T)$. By part b., $d(k) = D(T) = D(LT) + D(RT) + k$, where $LT$ and $RT$ are the left and right subtrees of $T$. Suppose $LT$ has $i_0$ leaves and $RT$ has $k - i_0$ leaves. We claim that $D(LT) = d(i_0)$ and $D(RT) = d(k - i_0)$: otherwise we could replace $LT$ (or $RT$) by a subtree with the same number of leaves but smaller external path length, producing a tree $T'$ with $D(T') < D(T)$ and contradicting the minimality of $T$. Hence $d(k) = d(i_0) + d(k - i_0) + k$ for some $1 \leq i_0 \leq k-1$. Conversely, for every $i$ in this range we can build a tree from optimal subtrees with $i$ and $k - i$ leaves, so $d(k) \leq d(i) + d(k-i) + k$. Therefore, $d(k) = \min_{1 \leq i \leq k-1}\{d(i) + d(k - i) + k\}$.
d. Let $f(i) = i\lg i + (k-i)\lg(k-i)$, treating $i$ as a real variable on $(0, k)$. Then
$$f'(i) = \lg i + \frac{1}{\ln 2} - \lg(k-i) - \frac{1}{\ln 2} = \lg\frac{i}{k-i},$$
which is zero exactly when $i = k-i$, that is, at $i = k/2$. Since $f''(i) = \frac{1}{i\ln 2} + \frac{1}{(k-i)\ln 2} > 0$, $f$ is convex, so $i = k/2$ is the unique minimum. The minimum value is $f(k/2) = 2\cdot(k/2)\lg(k/2) = k\lg(k/2) = k\lg k - k$.
We now prove $d(k) = \Omega(k\lg k)$ by the substitution method. Guess $d(k) \geq ck\lg k$ for a constant $0 < c \leq 1$. The base cases hold: $d(1) = 0 \geq 0$ and $d(2) = 2 \geq 2c$. For the inductive step, using part c. and the minimization above,
$$d(k) = \min_{1 \leq i \leq k-1}\{d(i) + d(k-i) + k\} \geq \min_{1 \leq i \leq k-1}\{ci\lg i + c(k-i)\lg(k-i)\} + k \geq ck\lg(k/2) + k = ck\lg k + (1-c)k \geq ck\lg k.$$
Hence $d(k) = \Omega(k\lg k)$. (In fact $d(k) = \Theta(k\lg k)$: a nearly complete binary tree with $k$ leaves has external path length at most $k\lceil\lg k\rceil$, so $d(k) = O(k\lg k)$ as well.)
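As a sanity check (not part of the proof), the recurrence from part c. can be evaluated directly; this small memoized sketch computes $d(k)$ and matches the balanced-tree value $k\lg k$ at powers of two.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def d(k):
    """Minimum external path length over decision trees with k leaves,
    from the recurrence d(k) = min_{1<=i<=k-1} {d(i) + d(k-i) + k}."""
    if k == 1:
        return 0  # a single leaf sits at depth 0
    return min(d(i) + d(k - i) + k for i in range(1, k))
```

For instance, d(2) = 2, d(4) = 8, and d(8) = 24 = 8 lg 8, exactly the external path lengths of complete binary trees.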
e. We may assume every leaf of $T_A$ is reachable: subtrees never reached by any input can be pruned (and one-child nodes spliced out) without increasing any depth, leaving exactly $n!$ leaves by part a. Then $D(T_A) \geq d(n!) = \Omega(n!\lg(n!))$. Since each of the $n!$ leaves is reached with probability $1/n!$, the expected running time is $E[T_A(n)] = D(T_A)/n! = \Omega(n!\lg(n!))/n! = \Omega(\lg(n!)) = \Omega(n\lg n)$.
f. We modify the randomized decision tree to obtain a deterministic one. At each randomization node, keep only the child whose subtree has the smallest average number of comparisons (averaged over the inputs reaching that node); delete the other children and splice out the randomization node itself. The deterministic algorithm corresponding to this modified tree still sorts correctly, because the randomized algorithm was correct no matter which branch was taken at each randomization node. And since at every randomization node we kept a child whose average is no larger than the node's overall average, the resulting deterministic algorithm makes no more comparisons on average than $B$ does.
Problem 8-2 Sorting in place in linear time
Suppose that we have an array of $n$ data records to sort and that the key of each record has the value 0 or 1. An algorithm for sorting such a set of records might possess some subset of the following three desirable characteristics:
1. The algorithm runs in $O(n)$ time.
2. The algorithm is stable.
3. The algorithm sorts in place, using no more than a constant amount of storage space in addition to the original array.
a. Give an algorithm that satisfies criteria 1 and 2 above.
b. Give an algorithm that satisfies criteria 1 and 3 above.
c. Give an algorithm that satisfies criteria 2 and 3 above.
d. Can any of your sorting algorithms from parts (a)-(c) be used to sort $n$ records with b-bit keys using radix sort in $O(bn)$ time? Explain how or why not.
e. Suppose that the $n$ records have keys in the range from 1 to $k$. Show how to modify counting sort so that the records can be sorted in place in $O(n+k)$ time. You may use $O(k)$ storage outside the input array. Is your algorithm stable? (Hint: How would you do it for $k = 3$?)
solution:
a. Counting sort: it runs in $O(n)$ time and is stable, but it uses an auxiliary output array, so it is not in place.
b. Partition: a single two-pointer pass, as in quicksort's PARTITION, moves every 0-key record before every 1-key record in $O(n)$ time and in place, but the swaps destroy stability.
c. Insertion sort: it is stable and in place, but takes $\Theta(n^2)$ time in the worst case.
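As an illustration of part (b), here is a two-pointer partition sketch operating on bare 0/1 keys; the function name `partition01` is chosen for this example only.

```python
def partition01(A):
    """Partition a list of 0/1 keys in place: O(n) time, O(1) extra space.
    Not stable: swapping can reorder records that share a key."""
    i, j = 0, len(A) - 1
    while i < j:
        while i < j and A[i] == 0:   # skip keys already in the left (0) region
            i += 1
        while i < j and A[j] == 1:   # skip keys already in the right (1) region
            j -= 1
        A[i], A[j] = A[j], A[i]      # swap a stray 1 with a stray 0
    return A
```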
d. Part (a), counting sort. Radix sort on b-bit keys makes $b$ passes, each a stable sort on a single bit; using counting sort for each pass (key values 0 and 1) costs $O(n + 2)$ per pass, for $O(b(n + 2)) = O(bn)$ total. The algorithms of parts (b) and (c) do not work here: radix sort requires a stable intermediate sort, which partition is not, and insertion sort's $\Theta(n^2)$ passes would give $O(bn^2)$ time.
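The $b$-pass scheme can be sketched as follows; the stable per-bit bucketing stands in for counting sort on a single bit, and the function name is illustrative.

```python
def radix_sort_bits(A, b):
    """Radix sort on b-bit nonnegative integer keys: b stable passes,
    one bit per pass, least significant bit first; O(bn) overall."""
    for bit in range(b):
        zeros = [x for x in A if not (x >> bit) & 1]  # keys with this bit = 0
        ones = [x for x in A if (x >> bit) & 1]       # keys with this bit = 1
        A = zeros + ones  # concatenation keeps order within each bucket (stable)
    return A
```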
e. We propose a modified counting sort that sorts in place in $O(n+k)$ time, using two auxiliary arrays of length $k + 1$. (Note that when full records are moved rather than bare keys, the cycle-chasing placement below can reorder records with equal keys, so the algorithm is not stable.)
@{
\proc{Counting-Sort}^{'}(A, k)
\li \textup{initialize}\;C1[0..k] = 0
\li \For j = 1 \;\textbf{to}\;\textit{length}[A]
\li \;\;\;\;C1[A[j]] = C1[A[j]] + 1
\li \For i = 1 \;\textbf{to}\;k
\li \;\;\;\;C1[i] = C1[i] + C1[i - 1]
\li \For i = 0\;\textbf{to}\;k
\li \;\;\;\;C2[i] = C1[i]
\li \For i = \textit{length}[A]\;\textbf{downto}\; 1
\li \;\;\;\;\If i > C2[A[i]]
\li \;\;\;\;\;\;\;\;a = A[i]
\li \;\;\;\;\;\;\;\;\textbf{do}
\li \;\;\;\;\;\;\;\;\;\;\;\;j = C1[a]
\li \;\;\;\;\;\;\;\;\;\;\;\;b = A[j]
\li \;\;\;\;\;\;\;\;\;\;\;\;A[j] = a
\li \;\;\;\;\;\;\;\;\;\;\;\;C1[a] = C1[a] - 1
\li \;\;\;\;\;\;\;\;\;\;\;\;a = b
\li \;\;\;\;\;\;\;\;\While C1[a] \not= i
\li \;\;\;\;\;\;\;\;A[i] = a
}@
C1 plays the role of array C in the original counting sort: after the prefix sums, C1[$v$] is the largest index that a record with key $v$ may occupy, and it is decremented as records with key $v$ are placed. C2 keeps a frozen copy of these maximum indices. Scanning A from right to left, if $i >$ C2[$A[i]$], then $A[i]$ lies beyond the region for its key and must be misplaced; we then follow the cycle of displaced records, dropping each one into the slot C1 gives for its key, until the record that belongs at position $i$ is found. Once a record reaches its correct region it is never moved again, so the scan does $O(n)$ work in total and the overall running time is $O(n + k)$, where the $O(k)$ term covers setting up C1 and C2.
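A Python rendering of the pseudocode may help check the cycle-chasing step; this sketch keeps the 1-indexed convention with a dummy slot at A[0] and sorts bare keys.

```python
def counting_sort_in_place(A, k):
    """In-place counting sort of A[1..n] (A[0] is a dummy) with keys in 1..k,
    following the pseudocode above; uses only the O(k) arrays C1 and C2."""
    n = len(A) - 1
    C1 = [0] * (k + 1)
    for j in range(1, n + 1):        # count occurrences of each key
        C1[A[j]] += 1
    for v in range(1, k + 1):        # prefix sums: C1[v] = last index for key v
        C1[v] += C1[v - 1]
    C2 = C1[:]                       # frozen copy of the maximum indices
    for i in range(n, 0, -1):
        if i > C2[A[i]]:             # A[i] sits past its key's region
            a = A[i]
            while True:              # chase the cycle of displaced records
                j = C1[a]
                b = A[j]
                A[j] = a             # place key a into its current slot
                C1[a] -= 1           # next record with key a goes one slot left
                a = b                # continue with the record we displaced
                if C1[a] == i:       # found the record that belongs at position i
                    break
            A[i] = a
    return A
```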
Problem 8-3 Sorting variable-length items
a. You are given an array of integers, where different integers may have different numbers of digits, but the total number of digits over all the integers in the array is $n$. Show how to sort the array in $O(n)$ time.
b. You are given an array of strings, where different strings may have different numbers of characters, but the total number of characters over all the string is $n$. Show how to sort the strings in $O(n)$ time. (Note that the desired order here is the standard alphabetical order; for example, a < ab < b.)
solution:
a. Without loss of generality, we assume that all integers are positive. Since a number with more digits is always greater than one with fewer digits, we first group the numbers by digit count; scanning every number to find its length takes $O(n)$ time. Let $m_i$ be the number of integers with $i$ digits, so $\sum_{i=1}^{n} i \cdot m_i = n$. Within the group of $i$-digit numbers, radix sort takes $O(i \cdot m_i)$ time ($i$ passes of counting sort over $m_i$ numbers). Concatenating the sorted groups in order of increasing length gives the answer, and the total time is $O(\sum_{i} i \cdot m_i) = O(n)$.
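A sketch of this grouping scheme in Python; the per-digit stable sort uses Python's built-in stable sort in place of counting sort, and the function name is illustrative.

```python
from collections import defaultdict

def sort_by_digit_count(nums):
    """Sort positive integers whose total digit count is n:
    group by number of digits, radix-sort each group, concatenate."""
    groups = defaultdict(list)
    for x in nums:
        groups[len(str(x))].append(x)
    out = []
    for d in sorted(groups):                 # shorter numbers are always smaller
        group = groups[d]
        for pos in range(d):                 # least-significant digit first
            # stable sort on one digit (stands in for counting sort)
            group.sort(key=lambda x: (x // 10 ** pos) % 10)
        out.extend(group)
    return out
```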
b. If the first character of string $x$ is lexicographically greater than the first character of string $y$, then $x$ is lexicographically greater than $y$. So we sort the strings on their first character (by counting sort) and gather strings sharing a first character into groups; the empty strings are taken out and placed first. Within each group, we recurse on the strings with the first character removed, then reattach it. When the recursion finishes, all strings are sorted.
We analyze the running time of this process. Each string of length $l$ takes part in at most $l$ counting-sort passes, one for each of its characters. A counting-sort pass over a group of $t$ strings costs $O(t)$ (the constant alphabet-size overhead can be charged to the characters examined), so the total cost over all passes is $O\left(\sum_{\text{strings } s} |s|\right) = O(n)$.
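The recursion can be sketched as follows; the per-bucket loop over sorted first characters stands in for counting sort, and the function name is illustrative.

```python
def sort_strings(strs):
    """Recursive first-character bucketing (an MSD radix sort sketch):
    empty strings come first, then each bucket is sorted with its leading
    character stripped and reattached afterwards."""
    if len(strs) <= 1:
        return strs[:]
    out = [s for s in strs if s == ""]   # empty strings sort before all others
    buckets = {}
    for s in strs:
        if s:
            buckets.setdefault(s[0], []).append(s[1:])
    for c in sorted(buckets):            # stands in for counting sort on one character
        out.extend(c + tail for tail in sort_strings(buckets[c]))
    return out
```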
Problem 8-4 Water jugs
Suppose that you are given n red and n blue water jugs, all of different shapes and sizes. All red jugs hold different amounts of water, as do the blue ones. Moreover, for every red jug, there is a blue jug that holds the same amount of water, and vice versa.
It is your task to find a grouping of the jugs into pair of red and blue jugs that hold the same amount of water. To do so, you may perform the following operation: pick a pair of jugs in which one is red and one is blue, fill the red jug with water, and then pour the water into the blue jug. This operation will tell you whether the red or the blue jug can hold more water, or if they are of the same volume. Assume that such a comparison takes one time unit. Your goal is to find an algorithm that makes a minimum number of comparisons to determine the grouping. Remember that you may not directly compare two red jugs or two blue jugs.
a. Describe a deterministic algorithm that uses $\Theta(n^2)$ comparisons to group the jugs into pairs.
b. Prove a lower bound of $\Omega(n\lg n)$ for the number of comparisons an algorithm solving this problem must take.
c. Give a randomized algorithm whose expected number of comparisons is $O(n\lg n)$, and prove that this bound is correct. What is the worst-case number of comparisons for your algorithm.
solution:
a. For each red jug, compare it against the blue jugs one by one until its match is found, then set the pair aside. This is a double-nested loop over the jugs, so it uses $\Theta(n^2)$ comparisons in the worst case.
b. We can view this problem as a decision tree. Each internal node is labeled with a red jug and a blue jug to compare and has three children, corresponding to the red jug being smaller, the two jugs being equal, and the red jug being larger. Each leaf is labeled with one of the $n!$ possible matchings, and every matching must appear on some leaf. Let $h$ be the height of the decision tree; then $3^h \geq n! \Rightarrow h\lg 3 \geq \lg(n!) \Rightarrow h = \Omega(n\lg n)$.
c. We propose a randomized algorithm whose expected number of comparison is $O(n\lg n)$.
@{
\proc{Match-Jugs}(R,B)
\li \If |R|=0
\li \;\;\;\;\Return
\li \If |R|=1
\li \;\;\;\;\textup{let}\;R=\{r\},B=\{b\}
\li \;\;\;\;\textup{output "}(r,b)\textup{"}
\li \;\;\;\;\Return
\li \Else
\li \;\;\;\;r=\;\textup{a randomly chosen jug in}\;R
\li \;\;\;\;B_{<}=\{b | b \in B, \;\textup{and}\;b < r\}
\li \;\;\;\;B_{>}=\{b | b \in B, \;\textup{and}\;b > r\}
\li \;\;\;\;b = \;\textup{the one jug in}\;B\;\textup{with the same size as}\;r
\li \;\;\;\;R_{<} = \{r | r \in R, \;\textup{and}\;r < b\}
\li \;\;\;\;R_{>} = \{r | r \in R, \;\textup{and}\;r > b\}
\li \;\;\;\;\textup{output "}(r,b)\textup{"}
\li \;\;\;\;\proc{Match-Jugs}(R_{<}, B_{<})
\li \;\;\;\;\proc{Match-Jugs}(R_{>}, B_{>})
}@
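The procedure translates directly into Python; here volumes stand in for jugs, a "comparison" is the numeric comparison of two volumes, and the names are illustrative.

```python
import random

def match_jugs(reds, blues):
    """Quicksort-style randomized matching: pick a random red jug as pivot,
    find its blue partner while splitting the blues, split the reds around
    the partner, and recurse on the two halves. Returns (red, blue) pairs."""
    if not reds:
        return []
    if len(reds) == 1:
        return [(reds[0], blues[0])]
    r = random.choice(reds)               # random red pivot
    b_less = [b for b in blues if b < r]
    b_greater = [b for b in blues if b > r]
    b = next(x for x in blues if x == r)  # the unique blue partner of r
    r_less = [x for x in reds if x < b]
    r_greater = [x for x in reds if x > b]
    return ([(r, b)]
            + match_jugs(r_less, b_less)
            + match_jugs(r_greater, b_greater))
```

Whatever pivots the random choices pick, the output is always the unique correct matching.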
What about the running time? Let us order the jugs as $r_1, r_2, \ldots, r_n$ and $b_1, b_2, \ldots, b_n$, where $r_i < r_{i+1}$ and $b_i < b_{i+1}$ for $i = 1, 2, \ldots, n-1$, and $r_i = b_i$ for all $i$. Our analysis uses indicator random variables
$X_{ij} = \mbox{I}\{\mbox{red jug } r_i \mbox{ is compared to blue jug } b_j\}$.
As in quicksort, a given pair $r_i$ and $b_j$ is compared at most once: once $r_i$ is chosen as the pivot and compared to every jug in $B$, jug $r_i$ is placed into neither $R_<$ nor $R_>$; likewise, once its partner $b_i$ is compared to every jug in $R - \{r_i\}$, jug $b_i$ is placed into neither $B_<$ nor $B_>$. The total number of comparisons is
$$X = \sum_{i = 1}^{n-1}\sum_{j = i + 1}^{n}X_{ij}.$$
To calculate the expected value of X, we follow the quicksort analysis to arrive at
$$E[X] = \sum_{i = 1}^{n-1}\sum_{j = i + 1}^{n} \mbox{Pr}\{r_i \mbox{ is compared to } b_j\}.$$
As in the quicksort analysis, once we choose a pivot jug $r_k$ with $r_i < r_k < b_j$, jug $r_i$ goes into $R_<$ and jug $b_j$ into $B_>$, and so $r_i$ and $b_j$ are never compared afterwards. Let us denote $R_{ij} = \{r_i, \ldots, r_j\}$. Then jugs $r_i$ and $b_j$ are compared if and only if the first jug chosen from $R_{ij}$ is either $r_i$ or $r_j$.
Still following the quicksort analysis, until a jug from $R_{ij}$ is chosen, the entire set $R_{ij}$ is together. Any jug in $R_{ij}$ is equally likely to be first one chosen. Since $|R_{ij}| = j - i + 1$, the probability of any given jug being the first one chosen in $R_{ij}$ is $1 / (j - i + 1)$. The remainder of the analysis is the same as the quicksort analysis, and we arrive at the solution of $O(n\lg n)$ comparisons.
Just like in quicksort, in the worst case we always choose the largest (or smallest) jug to partition the sets, which reduces the set sizes by only 1. The running time then obeys the recurrence $T(n) = T(n - 1) + \Theta(n)$, and the number of comparisons we make in the worst case is $T(n) = \Theta(n^2)$.
Problem 8-5 Average sorting
Suppose that, instead of sorting an array, we just require that the elements increase on average. More precisely, we call an n-element array A k-sorted if, for all $i = 1, 2, \ldots, n-k$, the following holds:
$$\frac{\sum_{j = i}^{i + k - 1}A[j]}{k} \leq \frac{\sum_{j = i + 1}^{i + k}A[j]}{k}.$$
a. What does it mean for an array to be 1-sorted?
b. Give a permutation of the numbers 1, 2, ..., 10 that is 2-sorted, but not sorted.
c. Prove that an n-element array is k-sorted if and only if $A[i] \leq A[i + k]$ for all $i = 1, 2, \ldots, n-k$.
d. Give an algorithm that k-sorts an n-element array in $O(n\lg(n/k))$ time.
We can also show a lower bound on the time to produce a k-sorted array, when k is a constant.
e. Show that a k-sorted array of length n can be sorted in $O(n\lg k)$ time. (Hint: Use the solution to Exercise 6.5-8.)
f. Show that when k is a constant, it requires $\Omega(n\lg n)$ time to k-sort an n-element array. (Hint: Use the solution to the previous part along with the lower bound on comparison sorts.)
solution:
a. Taking $k = 1$, the condition becomes $A[i] \leq A[i+1]$ for $i = 1, 2, \ldots, n-1$; that is, a 1-sorted array is simply a sorted array.
b. [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]
c. Multiplying both sides of the defining inequality by $k$ gives, for all $i = 1, 2, \ldots, n-k$, $\sum_{j=i}^{i+k-1} A[j] \leq \sum_{j=i+1}^{i+k} A[j]$. The two sums share the terms $A[i+1], \ldots, A[i+k-1]$; cancelling them leaves $A[i] \leq A[i+k]$. Each step is reversible, so the array is k-sorted if and only if $A[i] \leq A[i+k]$ for all $i = 1, 2, \ldots, n-k$.
d. We can modify quicksort to k-sort an array: change line 1 of QUICKSORT from "if $p < r$" to "if $p + (k-1) < r$", so the recursion stops once a subarray has length at most $k$. Partitioning guarantees that every element in a left piece is at most every element in the right piece, and two positions $i$ and $i+k$ can never lie inside the same unsorted block of length at most $k$; hence $A[i] \leq A[i+k]$ holds for all valid $i$, and the array is k-sorted by part c. The recursion tree has expected height $O(\lg(n/k))$ with $O(n)$ work per level, so the expected total running time is $O(n\lg(n/k))$.
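A sketch of this truncated quicksort, using a random pivot and a three-way partition (names illustrative):

```python
import random

def k_sort(A, k):
    """Quicksort that stops recursing on subarrays of length <= k,
    leaving A k-sorted: A[i] <= A[i + k] for every valid i."""
    def rec(lo, hi):                       # work on the half-open range A[lo:hi]
        if hi - lo <= k:                   # short enough: leave unsorted
            return
        p = A[random.randrange(lo, hi)]
        lt, i, gt = lo, lo, hi             # three-way partition around p
        while i < gt:
            if A[i] < p:
                A[lt], A[i] = A[i], A[lt]; lt += 1; i += 1
            elif A[i] > p:
                gt -= 1; A[i], A[gt] = A[gt], A[i]
            else:
                i += 1
        rec(lo, lt)                        # elements < p
        rec(gt, hi)                        # elements > p
    rec(0, len(A))
    return A
```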
e. By part c., in a k-sorted array the subsequence $\langle A[i], A[i+k], A[i+2k], \ldots \rangle$ is sorted for each $i = 1, 2, \ldots, k$. A k-sorted array of length $n$ therefore consists of $k$ sorted lists, each of length about $n/k$. By Exercise 6.5-8, we can merge $k$ sorted lists of total length $n$ into one sorted list in $O(n\lg k)$ time using a min-heap of size $k$.
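In Python this merge is available directly as `heapq.merge`, which maintains exactly such a size-k heap; the wrapper name is illustrative.

```python
import heapq

def sort_k_sorted(A, k):
    """Sort a k-sorted array in O(n lg k) time by merging its k sorted
    subsequences A[i::k] with a min-heap of size k (heapq.merge)."""
    return list(heapq.merge(*(A[i::k] for i in range(k))))
```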
f. Consider a decision-tree model of k-sorting an array. Its internal nodes are comparison nodes as before, but each leaf is now labeled with one of the possible k-sorted rearrangements. Let $h$ be the height of this decision tree. The number of distinct outcomes the tree must distinguish is the number of ways to choose which $k$ elements fill each successive block of $k$ positions:
$$C_k^n\times C_k^{n-k}\times\cdots\times C_k^k = \frac{n!}{k!(n-k)!}\times\frac{(n-k)!}{k!(n-2k)!}\times\cdots\times\frac{k!}{0!\,k!} = \frac{n!}{(k!)^{n/k}}.$$
So we have
$$2^h \geq \frac{n!}{(k!)^{n/k}} \geq \frac{(n/e)^n}{(k!)^{n/k}} = (\frac{n}{e(k!^{1/k})})^n.$$
Since $k$ is a constant, $c = e(k!)^{1/k}$ is also a constant, and we obtain
$$h \geq n\lg\frac{n}{c} = \Omega(n\lg n).$$
Therefore, it requires $\Omega(n\lg n)$ time to k-sort an n-element array.
Problem 8-6 Lower bound on merging sorted lists
The problem of merging two sorted lists arises frequently. It is used as a subroutine of MERGE-SORT, and the procedure to merge two sorted lists is given as MERGE in section 2.3.1. In this problem, we will show that there is a lower bound of $2n-1$ on the worst-case number of comparisons required to merge two sorted lists, each containing n items.
First we will show a lower bound of $2n - o(n)$ comparisons by using a decision tree.
a. Show that, given 2n numbers, there are $C^{2n}_n$ possible ways to divide them into two sorted lists, each with n numbers.
b. Using a decision tree, show that any algorithm that correctly merges two sorted lists uses at least $2n - o(n)$ comparisons.
Now we will show a slightly tighter $2n - 1$ bound.
c. Show that if two elements are consecutive in the sorted order and from opposite lists, then they must be compared.
d. Use your answer to the previous part to show a lower bound of $2n - 1$ comparisons for merging two sorted lists.
solution:
a. The merged output of the $2n$ numbers determines the division completely: choosing which $n$ of the $2n$ positions in the sorted order came from the first list fixes both lists, since each list must hold its elements in sorted order. Hence there are exactly $C^{2n}_n$ possible ways to divide the numbers into two sorted lists of $n$ each.
b. Let the height of the decision tree be $h$. Any correct merging algorithm must be able to produce a different merged order for each of the $C^{2n}_n$ divisions, so the decision tree has at least $C^{2n}_n$ leaves and $2^h \geq C^{2n}_n$.
By Stirling's formula, we know that
$$n! = \sqrt{2\pi n}\left(\frac{n}{e}\right)^n e^{\alpha_n},$$
where
$$\frac{1}{12n+1} < \alpha_n < \frac{1}{12n}.$$
Then,
$$C^{2n}_n = \frac{(2n)!}{(n!)^2} = \frac{\sqrt{4\pi n}\,(2n/e)^{2n}e^{\alpha_{2n}}}{2\pi n\,(n/e)^{2n}e^{2\alpha_n}} = \frac{2^{2n}}{\sqrt{\pi n}}\,e^{\alpha_{2n}-2\alpha_n},$$
so
$$h \geq \lg C^{2n}_n = 2n - \frac{1}{2}\lg(\pi n) + (\alpha_{2n}-2\alpha_n)\lg e = 2n - O(\lg n) = 2n - o(n).$$
c. Let the two sorted lists be $A = \langle a_1, a_2, \ldots, a_n\rangle$ and $B = \langle b_1, b_2, \ldots, b_n\rangle$, and suppose that $a_i$ and $b_j$ are consecutive in the final sorted order.
$\cdots, a_{i-1}, a_i, a_{i+1}, \cdots$
$\cdots, b_{j-1}, b_j, b_{j+1}, \cdots$
If $b_j$ is compared only to $a_k$ with $k = 1, \ldots, i-1$, each such comparison yields $a_k \leq b_j$, which does not determine whether $b_j$ is greater than $a_i$ or not. Similarly, comparing $b_j$ to $a_k$ with $k = i+1, \ldots, n$ cannot determine the order of $b_j$ and $a_i$ either. Therefore, the only way to establish the relative order of $b_j$ and $a_i$ in the final sorted list is to compare them directly.
d. We consider a worst case to establish the lower bound. Let $a_1, a_2, \ldots, a_{2n}$ be the sorted order of the $2n$ elements, with $A = \langle a_1, a_3, \ldots, a_{2n-1}\rangle$ and $B = \langle a_2, a_4, \ldots, a_{2n}\rangle$. Then for each $i = 1, \ldots, 2n-1$, elements $a_i$ and $a_{i+1}$ are consecutive in sorted order and come from opposite lists, so by part c. they must be compared. Hence at least $2n-1$ comparisons are required.