Wednesday, December 29, 2010

[Algo] Exercise 9.3

Exercise 9.3-1
In the algorithm SELECT, the input elements are divided into groups of 5. Will the algorithm work in linear time if they are divided into groups of 7? Argue that SELECT does not run in linear time if groups of 3 are used.
solution:
(1) groups of 7 are used
The number of elements greater than the median of medians is at least
$$4(\lceil\frac{1}{2}\lceil\frac{n}{7}\rceil\rceil - 2) \geq \frac{2n}{7} - 8$$.
The recurrence can be rewritten as
$$T(n) \leq T(\lceil\frac{n}{7}\rceil) + T(\frac{5n}{7} + 8) + O(n)$$.
The proof proceeds just as in the textbook's analysis for groups of 5, and in the end we can show that $T(n) = O(n)$.
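For completeness, here is the substitution step spelled out (assuming, as in the textbook's argument, that $T(n) \leq cn$ for a suitable constant c and that the $O(n)$ term is bounded by $an$):
$$T(n) \leq c\lceil n/7 \rceil + c(\frac{5n}{7} + 8) + an \leq \frac{cn}{7} + c + \frac{5cn}{7} + 8c + an = cn + (-\frac{cn}{7} + 9c + an),$$
which is at most $cn$ whenever $-cn/7 + 9c + an \leq 0$; for instance, any $n \geq 126$ together with $c \geq 14a$ works. Hence $T(n) = O(n)$.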

(2) groups of 3 are used
The number of elements greater than the median of medians is at least
$$2(\lceil\frac{1}{2}\lceil\frac{n}{3}\rceil\rceil - 2) \geq \frac{n}{3} - 4$$.
The recurrence can be rewritten as
$$T(n) \leq T(\lceil\frac{n}{3}\rceil) + T(\frac{2n}{3} + 4) + O(n)$$.
Before deriving further, glance at the recurrence: the two recursive arguments on the right-hand side already add up to more than the whole problem, $n/3 + 2n/3 + 4 \geq n$. Suppose that $T(n) \leq cn$ for some positive constant c. By substitution, we have
$$T(n) \leq c\lceil n/3 \rceil + c(\frac{2n}{3} + 4) + an \leq \frac{cn}{3} + c + \frac{2cn}{3} + 4c + an = cn + 5c + an.$$

Since $5c + an \geq 0$, the leftover term never cancels, so we cannot derive $T(n) \leq cn$ for any choice of c. Hence, SELECT does not run in linear time if groups of 3 are used.


Exercise 9.3-2
Analyze SELECT to show that if $n \geq 140$, then at least $\lceil n/4 \rceil$ elements are greater than the median-of-medians x and at least $\lceil n/4 \rceil$ elements are less than x.
solution:
Suppose that fewer than $\lceil n/4 \rceil$ elements are greater than x. Since at least $\frac{3n}{10} - 6$ elements are greater than the median-of-medians, we obtain
$$\frac{3n}{10} - 6 < \lceil n/4 \rceil < \frac{n}{4} + 1 \Rightarrow \frac{n}{20} < 7 \Rightarrow n < 140$$.
This contradicts $n \geq 140$! Hence, if $n \geq 140$, then at least $\lceil n/4 \rceil$ elements are greater than the median-of-medians x, and by a symmetric argument at least $\lceil n/4 \rceil$ elements are less than x.


Exercise 9.3-3
Show how quicksort can be made to run in $O(n\lg n)$ time in the worst case, assuming that all elements are distinct.
solution:
Before PARTITION is performed, we call SELECT to find the median of the input array and use it as the pivot. Pivoting on the median guarantees that the two sides of the partition have essentially equal length: they differ by at most one element. Let $T(n)$ be the running time the modified quicksort needs on an input array of n elements. Since SELECT and PARTITION both take linear time, we can derive the recurrence $T(n) \leq 2T(n/2) + \Theta(n)$. By the master theorem, $T(n) = \Theta(n\lg n)$.

MODIFIED-PARTITION(A, p, r)
1  n = r - p + 1
2  find an index k by calling SELECT(A, p, r, $\lfloor(n+1)/2\rfloor$) such that A[k] = x, where x is the median
3  exchange A[r] and A[k]
4  return PARTITION(A, p, r)
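For illustration, here is a small Python sketch of this modified quicksort (distinct elements assumed, as in the exercise; auxiliary lists are used instead of in-place PARTITION for clarity). The sorted()-based select is only a stand-in; swapping in the median-of-medians SELECT yields the claimed $O(n\lg n)$ worst case.

def quicksort_median_pivot(a):
    # Worst-case O(n lg n) quicksort: pivot on the true median each time.
    def select(arr, i):
        # Stand-in for linear-time SELECT: i-th smallest, 1-based.
        return sorted(arr)[i - 1]
    if len(a) <= 1:
        return a
    x = select(a, (len(a) + 1) // 2)   # the (lower) median, as in MODIFIED-PARTITION
    low = [v for v in a if v < x]      # elements below the pivot
    high = [v for v in a if v > x]     # elements above the pivot (distinct input assumed)
    return quicksort_median_pivot(low) + [x] + quicksort_median_pivot(high)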



Exercise 9.3-4
Suppose that an algorithm uses only comparisons to find the ith smallest element in a set of n elements. Show that it can also find the i - 1 smaller elements and the n - i larger elements without performing any additional comparisons.
solution:
Suppose that the algorithm uses m comparisons to find the ith smallest element x in a set of n elements. Replaying these m comparisons from a log, we can easily identify the i - 1 smaller elements by transitivity: if x > a and a > b, then x > b can be deduced without actually performing a comparison between x and b. The n - i larger elements are found the same way. Could there exist a number, say p, whose relation to x cannot be decided from the comparison log? That is impossible: both outcomes would then be consistent with the comparisons made, and the algorithm could not have correctly determined that x is the ith smallest.


Exercise 9.3-5
Suppose that you have a "black-box" worst-case linear-time median subroutine. Give a simple, linear-time algorithm that solves the selection problem for an arbitrary order statistic.
solution:
MODIFIED-SELECT(A, p, r, i)
1  if p == r
2    return A[p]
3  x = MEDIAN(A, p, r)
4  q = PARTITION'(A, p, r, x)
5  k = q - p + 1
6  if i == k
7    return A[q]
8  else if i < k
9    return MODIFIED-SELECT(A, p, q-1, i)
10 else
11   return MODIFIED-SELECT(A, q+1, r, i-k)

PARTITION' is a deterministic variant of the PARTITION subroutine: it uses the parameter x as the pivot to split the array. Since the median is what splits the array, the two subarrays have essentially the same size (within one element), so $$T(n) \leq T(n/2) + O(n) \Rightarrow T(n) = O(n)$$.
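A Python sketch of MODIFIED-SELECT (names mine; elements assumed distinct). Python's statistics.median_low plays the role of the black-box MEDIAN; note that it sorts internally, so this sketch is not truly linear, but with a genuine worst-case linear-time median subroutine the recurrence above applies.

import statistics

def modified_select(a, i):
    # i-th smallest element of a, 1-based; one-sided recursion as above.
    if len(a) == 1:
        return a[0]
    x = statistics.median_low(a)   # "black-box" median (lower median)
    low = [v for v in a if v < x]  # PARTITION' around the pivot x
    high = [v for v in a if v > x]
    k = len(low) + 1               # rank of x
    if i == k:
        return x
    return modified_select(low, i) if i < k else modified_select(high, i - k)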


Exercise 9.3-6
The kth quantiles of an n-element set are the k-1 order statistics that divide the sorted set into k equal-sized sets (to within 1). Give an $O(n\lg k)$-time algorithm to list the kth quantiles of a set.
solution:
The kth quantiles of an n-element array A are the elements with ranks
$\lceil 1\cdot n/k \rceil, \lceil 2\cdot n/k \rceil,\cdots,\lceil (k-1)\cdot n/k \rceil$,
i.e., in sorted order they would be $A[\lceil 1\cdot n/k \rceil], A[\lceil 2\cdot n/k \rceil],\cdots,A[\lceil (k-1)\cdot n/k \rceil]$. The following algorithm finds the kth quantiles of an array A, and leaves each of these (k-1) elements in exactly the position it would occupy if the entire range were sorted.

QUANTILE-BASE(A, p, r, Q, k)
1  if k <= 0
2    return
3  else if k == 1
4    x = SELECT(A, p, r, Q[1])
5    PARTITION(A, p, r, x)
6  else
7    i = $\lfloor (k + 1)/2 \rfloor$
8    x = SELECT(A, p, r, Q[i])
9    q = PARTITION(A, p, r, x)
10   QUANTILE-BASE(A, p, q - 1, Q[1..(i-1)], i - 1)
11   for j = (i + 1) to length[Q]
12     Q[j] = Q[j] - Q[i]
13   QUANTILE-BASE(A, q + 1, r, Q[(i+1)..length[Q]], k - i)

QUANTILE(A, p, r, k)
1  if k <= 1
2    return
3  n = r - p + 1
4  QUANTILE-BASE(A, p, r, $[\lceil 1\cdot n/k \rceil, \lceil 2\cdot n/k \rceil,\cdots,\lceil (k-1)\cdot n/k \rceil]$, k - 1)
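The same algorithm in Python, as a sketch (function names are mine; the sorted()-based select is only a stand-in for the linear-time SELECT, with which each recursion level costs O(n) while the index list halves, giving O(n lg k)):

def select(arr, i):
    # Stand-in for SELECT: i-th smallest, 1-based. Sorting costs O(n lg n);
    # the median-of-medians SELECT would make this O(n).
    return sorted(arr)[i - 1]

def quantiles(a, k):
    # Return the k-th quantiles of a (the k-1 order statistics above).
    def rec(arr, idxs):
        # idxs: sorted 1-based ranks (within arr) still to be located.
        if not idxs:
            return []
        m = len(idxs) // 2                   # middle quantile index
        x = select(arr, idxs[m])             # SELECT, then PARTITION around x
        low = [v for v in arr if v < x]
        high = [v for v in arr if v > x]
        return (rec(low, idxs[:m]) + [x]
                + rec(high, [j - idxs[m] for j in idxs[m + 1:]]))
    n = len(a)
    return rec(a, [-(-i * n // k) for i in range(1, k)])  # ranks ceil(i*n/k)

assert quantiles([7, 1, 5, 3, 9, 8, 2, 6, 10, 4], 5) == [2, 4, 6, 8]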

Next, we prove that the running time is $O(n\lg k)$. Let $T(n, k)$ be the running time the algorithm needs to find k quantile indices in an n-element array. Each invocation of QUANTILE-BASE does $O(n)$ work (one call to SELECT plus one PARTITION) and splits its list of quantile indices roughly in half, so
$$T(n, k) \leq T(n_1, \lceil (k-1)/2 \rceil) + T(n_2, \lfloor (k-1)/2 \rfloor) + O(n), \quad\mbox{where } n_1 + n_2 < n.$$
In the recursion tree, the number of quantile indices halves from one level to the next, so there are $O(\lg k)$ levels; and the subarrays appearing on any one level are disjoint, so each level costs $O(n)$ in total. Hence $T(n, k) = O(n\lg k)$.


Exercise 9.3-7
Describe an $O(n)$-time algorithm that, given a set S of n distinct numbers and a positive integer $k \leq n$, determines the k numbers in S that are closest to the median of S.
solution:
The median x can be found by the procedure SELECT in $O(n)$ time. After finding x, we call a modified procedure SELECT' in which two elements $a, b$ are compared by their distance to the median, i.e., by comparing $|a-x|$ with $|b-x|$. Such a comparison still takes $\Theta(1)$ time, so the modified SELECT' still runs in $O(n)$ time. We use SELECT' to find the element whose distance to x has rank k, use it as the pivot to partition the set S by distance, and then S[1..k] holds the k numbers that are closest to the median.
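A Python sketch of the whole procedure (distinct numbers assumed; select is again a sorted()-based stand-in for the linear-time SELECT):

def select(arr, i):
    # Stand-in for linear-time SELECT: i-th smallest, 1-based.
    return sorted(arr)[i - 1]

def k_closest_to_median(s, k):
    # The k numbers of s closest to the median; two selections plus scans,
    # so with a true linear-time SELECT the total is O(n).
    x = select(s, (len(s) + 1) // 2)          # the (lower) median
    d = select([abs(a - x) for a in s], k)    # k-th smallest distance to x
    out = [a for a in s if abs(a - x) < d]    # strictly closer than d
    out += [a for a in s if abs(a - x) == d][:k - len(out)]  # fill ties
    return out

assert sorted(k_closest_to_median([1, 2, 3, 4, 5, 6, 7, 8, 9], 3)) == [4, 5, 6]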


Exercise 9.3-8
Let $X[1..n]$ and $Y[1..n]$ be two arrays, each containing n numbers already in sorted order. Give an $O(\lg n)$-time algorithm to find the median of all 2n elements in array X and Y.
solution:
Let Z be the sorted union of array X and array Y. The median m of Z is the nth order statistic of the 2n elements; that is, m is greater than exactly n-1 of them. Suppose that m is in the array X. Then we claim there must exist an index p such that $X[p] = m$ and $Y[n-p] \leq X[p] \leq Y[n-p+1]$, where we treat $Y[0]$ as $-\infty$ and $Y[n+1]$ as $\infty$. Why? X[p] is greater than exactly p-1 numbers in X, and $Y[n-p] \leq X[p] \leq Y[n-p+1]$ means X[p] is greater than exactly n-p numbers in Y. Thus X[p] is greater than exactly n-1 numbers in total, so X[p] must be the median m.
The following algorithm finds the median m. It binary-searches for the median in X first; if that fails, the median must be in Y, and the procedure restarts on Y. Since the median is in either X or Y, the procedure restarts at most once and cannot recurse forever.
FIND-MEDIAN(X, Y, start_index, end_index, n)
1  if start_index > end_index
2    return FIND-MEDIAN(Y, X, 1, n, n)
3  p = ⌊(start_index + end_index)/2⌋
4  a = X[p]
5  if a >= Y[n - p] and a <= Y[n - p + 1]
6    return X[p]
7  else if a < Y[n - p]
8    return FIND-MEDIAN(X, Y, p+1, end_index, n)
9  else
10   return FIND-MEDIAN(X, Y, start_index, p-1, n)
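A Python version of FIND-MEDIAN (0-based indices, a $-\infty$ sentinel at the left end, distinct elements assumed):

import math

def find_median(X, Y):
    # Lower median of the union of two sorted n-element arrays, O(lg n).
    n = len(X)

    def search(A, B):
        lo, hi = 0, n - 1
        while lo <= hi:
            i = (lo + hi) // 2
            # A[i] beats exactly i elements of A; it is the overall n-th
            # smallest iff B[n-i-2] <= A[i] <= B[n-i-1] (0-based).
            lower = B[n - i - 2] if n - i - 2 >= 0 else -math.inf
            if lower <= A[i] <= B[n - i - 1]:
                return A[i]
            if A[i] < lower:
                lo = i + 1   # A[i] too small: look right in A
            else:
                hi = i - 1   # A[i] too large: look left in A
        return None          # the median is not in A

    m = search(X, Y)
    return m if m is not None else search(Y, X)

assert find_median([1, 3, 5], [2, 4, 6]) == 3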

Now we show that the algorithm needs only $O(\lg n)$ time. Each time the procedure is called, the distance between start_index and end_index is halved, so if the median is in X the search takes at most $O(\lg n)$ time. Otherwise the procedure restarts once to search Y, and the total running time is $O(\lg n) + O(\lg n) = O(\lg n)$.


Exercise 9.3-9
Professor Olay is consulting for an oil company, which is planning a large pipeline running east to west through an oil field of n wells. From each well, a spur pipeline is to be connected directly to the main pipeline along a shortest path (either north or south), as shown in Figure 9.2. Given x- and y-coordinates of the wells, how should the professor pick the optimal location of the main pipeline (the one that minimizes the total length of the spurs)? Show that the optimal location can be determined in linear time.
solution:
Let $(x_1, y_1), (x_2, y_2), \cdots, (x_n, y_n)$ be the coordinates of the n wells. The main pipeline is a horizontal line $y = y^*$, and we want to choose $y^*$ so that $$d(y^*) = \sum_{i = 1}^n |y_i - y^*|$$ achieves its minimum; note that $|y_i - y^*|$ is the vertical distance from well i to the pipeline. A median of $y_1, y_2, \ldots, y_n$ minimizes $d(\cdot)$: moving $y^*$ away from a median increases its distance to at least half of the points by at least as much as it decreases the distance to the rest, so $d$ can never be smaller than at the median. Hence the optimal location is the median of the y-coordinates, which SELECT finds in $O(n)$ time.

[Algo] 9.3 Selection in worst-case linear time

How SELECT works
This section presents another algorithm, called SELECT, whose running time is also $O(n)$; the biggest difference from RANDOMIZED-SELECT is that it guarantees every partition is a good split.
Assume the input array has length n (n > 1); if the length is 1, SELECT simply returns its only element, which is the base case of the recursion.
1. Divide the array into groups of 5 elements, giving $\lfloor n/5 \rfloor$ groups of length 5, with the remaining fewer-than-5 elements forming one extra group of their own.
2. Sort each group separately with insertion sort, then collect the medians of the groups into a new array.
3. Call SELECT once on this array of medians to find its median. We call this "median of medians" x.
4. Using x as the pivot, run PARTITION over the whole input array, splitting it into two sides; if the left side has k - 1 elements and the pivot x sits between, then the right side has n - k elements.
5. If the ith order statistic we want is exactly the pivot x, return x; otherwise, call SELECT recursively on whichever side the ith order statistic falls in (of course, on the right side we look for the (i - k)th order statistic). A Python sketch of the whole procedure follows.
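Here is a compact Python sketch of the five steps (distinct elements assumed; auxiliary lists are used in place of the in-place PARTITION, for clarity):

def select(a, i):
    # i-th smallest element of a (1-based), worst-case O(n).
    if len(a) == 1:
        return a[0]
    # Steps 1-2: groups of 5, sort each, collect the group medians.
    groups = [sorted(a[j:j + 5]) for j in range(0, len(a), 5)]
    medians = [g[(len(g) - 1) // 2] for g in groups]
    # Step 3: the median of medians, found recursively.
    x = select(medians, (len(medians) + 1) // 2)
    # Step 4: partition around x.
    low = [v for v in a if v < x]
    high = [v for v in a if v > x]
    k = len(low) + 1
    # Step 5: done, or recurse into one side.
    if i == k:
        return x
    return select(low, i) if i < k else select(high, i - k)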


Running time of SELECT
(Figure omitted: each element drawn as a point, one column of five per group; the white points are the group medians, and the point labeled x is the median of medians.)
As in the (omitted) figure, each element is represented as a point and each column is one group; the white points are the group medians, and the one labeled x is the median of medians. Because we assume the elements are distinct, at least half of the group medians are greater than or equal to x. Apart from x's own group and the last (possibly short) group, every group contributes 3 elements greater than x, so we obtain: at least
$$3(\lceil\frac{1}{2}\lceil\frac{n}{5}\rceil\rceil - 2) \geq \frac{3n}{10} - 6$$
elements are greater than x; by symmetry, at least $$\frac{3n}{10} - 6$$ elements are less than x. Therefore, the recursive call in step 5 of SELECT runs on at most $$\frac{7n}{10} + 6$$ elements.
Let $T(n)$ be the time SELECT needs on an array of length n, and assume that $T(n)$ is monotonically increasing.
Step 1 takes $O(n)$ time;
step 2 takes $O(n)$ time (no doubt about it: insertion sort on 5 elements counts as $O(1)$);
step 3 takes $T(\lceil n/5 \rceil)$ time;
step 4 takes $O(n)$ time;
step 5 takes at most $T(7n/10 + 6)$ time.
Putting these together, we obtain the recurrence
$$T(n) \leq \begin{cases} O(1) & \mbox{if } n < 140,\\ T(\lceil n/5 \rceil) + T(7n/10 + 6) + O(n) & \mbox{if } n \geq 140.\end{cases}$$

See that 140 up there? Where does this magic number come from? One step at a time.
Again we use the substitution method: assume $T(n) \leq cn$ and bound the $O(n)$ term by $an$, where a and c are constants. Substituting into the recurrence:
$$T(n) \leq c\lceil n/5 \rceil + c(7n/10 + 6) + an \leq cn/5 + c + 7cn/10 + 6c + an = 9cn/10 + 7c + an = cn + (-cn/10 + 7c + an)$$
As long as $-cn/10 + 7c + an \leq 0$, $T(n)$ is at most $cn$. Rearranged, this requires $c \geq 10a(n/(n-70))$, which can be satisfied whenever $n > 70$; moreover, since we assume $n \geq 140$, we get $n/(n-70) \leq 2$, so any $c \geq 20a$ makes $-cn/10 + 7c + an \leq 0$. So 140 is really not that magical: we could fix any threshold above 70 and then pick an appropriate c for it. This proves that SELECT is an $O(n)$ algorithm.

Tuesday, December 28, 2010

[Algo] Exercise 9.2

Exercise 9.2-1
Show that in RANDOMIZED-SELECT, no recursive call is ever made to a 0-length array.
solution:
A 0-length subarray arises from the partition only when the pivot is the smallest or the largest element of A[p..r], i.e., when k = 1 or k = r - p + 1. A recursive call on the empty side would then amount to asking for a 0th or an (n+1)th order statistic, which never happens: if k = 1 we never have i < k, and if k = r - p + 1 we never have i > k.


Exercise 9.2-2
Argue that the indicator random variable $X_k$ and the value $T(\max(k-1, n-k))$ are independent.
solution:
$X_k$ depends only on which element is chosen as the pivot, while $T(\max(k-1, n-k))$ is the time of a recursive call on a subarray of a fixed size; whether or not $X_k$ occurs makes that time neither longer nor shorter. Hence the two are independent.


Exercise 9.2-3
Write an iterative version of RANDOMIZED-SELECT.
solution:
ITERATIVE-RANDOMIZED-SELECT(A, p, r, i)
1  while true
2    q = RANDOMIZED-PARTITION(A, p, r)
3    k = q - p + 1
4    if i == k
5      then return A[q]
6    else if i < k
7      then r = q - 1
8    else
9      p = q + 1
10     i = i - k


Exercise 9.2-4
Suppose we use RANDOMIZED-SELECT to select the minimum element of the array $A = \langle 3, 2, 9, 0, 7, 5, 4, 8, 6, 1\rangle$. Describe a sequence of partitions that results in a worst-case performance of RANDOMIZED-SELECT.
solution:
Suppose that RANDOMIZED-PARTITION happens to pick the largest remaining element as the pivot every time. Each partition then splits off only the pivot, and the recursion continues on an array just one element shorter, which yields the worst-case $\Theta(n^2)$ behavior:
[3, 2, 9, 0, 7, 5, 4, 8, 6, 1]  --> [3, 2, 0, 7, 5, 4, 8, 6, 1] + [9]
[3, 2, 0, 7, 5, 4, 8, 6, 1] --> [3, 2, 0, 7, 5, 4, 6, 1] + [8]
[3, 2, 0, 7, 5, 4, 6, 1] --> [3, 2, 0, 5, 4, 6, 1] + [7]
[3, 2, 0, 5, 4, 6, 1] --> [3, 2, 0, 5, 4, 1] + [6]
[3, 2, 0, 5, 4, 1] --> [3, 2, 0, 4, 1] + [5]
[3, 2, 0, 4, 1] --> [3, 2, 0, 1] + [4]
[3, 2, 0, 1] --> [2, 0, 1] + [3]
[2, 0, 1] --> [0, 1] + [2]
[0, 1] --> [0] + [1]
[0]

[Algo] 9.2 Selection in expected linear time

RANDOMIZED-SELECT(A, p, r, i)
1  if p == r
2    then return A[p]
3  q = RANDOMIZED-PARTITION(A, p, r)
4  k = q - p + 1
5  if i == k
6    then return A[q]
7  else if i < k
8    then return RANDOMIZED-SELECT(A, p, q-1, i)
9  else
10   return RANDOMIZED-SELECT(A, q+1, r, i-k)

We saw RANDOMIZED-PARTITION back in [Algo] 7.3. Let us first walk through RANDOMIZED-SELECT. A[q] is chosen as the pivot; since every item of A[q+1 .. r] is larger than A[q], if the ith order statistic we want is smaller than A[q] we go looking in A[p..q-1], and otherwise in A[q+1..r]. Unlike quicksort, which keeps recursing on both sides after partitioning, here only one side is recursed into.
The average case of RANDOMIZED-SELECT is $\Theta(n)$. The worst case occurs when, with terrible luck, every partition splits A[p..r] into one side of n-1 elements and the other of 1; the complexity is then $\Theta(n^2)$, which is a bad deal even if all we want is the minimum or the maximum. But because the algorithm is randomized, no particular input can force this worst case on us, unless our luck is truly awful.


Running time of RANDOMIZED-SELECT
$T(n)$: the running time of RANDOMIZED-SELECT on an array A[p..r] of length n.
$X_k$: an indicator random variable, defined as $X_k$ = I{ the subarray A[p..q] has exactly k elements }; since every element of A[p..r] is equally likely to be chosen as the pivot, $E[X_k] = 1/n$.
We now want to compute $E[T(n)]$. From the algorithm, we know:
$$T(n) \leq \sum_{k = 1}^n X_k\cdot(T(\max(k-1, n-k)) + O(n))$$
$$E[T(n)] \leq E[\sum_{k = 1}^n X_k\cdot(T(\max(k-1, n-k)) + O(n))]$$
$$= \sum_{k = 1}^n E[X_k\cdot T(\max(k-1, n-k))] + O(n)$$
$$= \sum_{k = 1}^n E[X_k]\cdot E[T(\max(k-1, n-k))] + O(n)$$ (since $X_k$ and $T(\max(k-1, n-k))$ are independent of each other)
$$= \sum_{k = 1}^n \frac{1}{n}\cdot E[T(\max(k-1, n-k))] + O(n)$$
When $k > \lceil n/2 \rceil$ we have $\max(k-1, n-k) = k-1$; when $k \leq \lceil n/2 \rceil$ we have $\max(k-1, n-k) = n-k$. If n is even, each of $T(\lceil n/2 \rceil)$ through $T(n-1)$ appears exactly twice in the sum; if n is odd, each appears twice except $T(\lfloor n/2 \rfloor)$, which appears once. Therefore we know:
$$E[T(n)] \leq \frac{2}{n}\sum_{k=\lfloor n/2 \rfloor}^{n-1}E[T(k)] + O(n)$$

Next, we solve this recurrence with the substitution method. Assume $T(n) \leq cn$ for a constant c, and assume the $O(n)$ term is bounded by $an$ for a constant a.
$$E[T(n)] \leq \frac{2}{n}\sum_{k=\lfloor n/2 \rfloor}^{n-1}ck + an = \frac{2c}{n}(\sum_{k = 1}^{n-1}k - \sum_{k = 1}^{\lfloor n/2 \rfloor - 1}k) + an$$
$$= \frac{2c}{n}(\frac{(n-1)n}{2} - \frac{(\lfloor n/2 \rfloor -1)\lfloor n/2 \rfloor}{2}) + an \leq \frac{2c}{n}(\frac{(n-1)n}{2} - \frac{(n/2 -2)(n/2 - 1)}{2}) + an$$
$$=\frac{2c}{n}(\frac{n^2 - n}{2} - \frac{n^2/4 - 3n/2 + 2}{2}) + an = \frac{c}{n}(\frac{3n^2}{4} + \frac{n}{2} - 2) + an$$
$$=c(\frac{3n}{4} + \frac{1}{2} - \frac{2}{n}) + an \leq \frac{3cn}{4} + \frac{c}{2} + an = cn - (\frac{cn}{4} - \frac{c}{2} - an)$$
To complete the proof, we must guarantee that $cn/4 - c/2 - an \geq 0$ when n is sufficiently large:
$$\frac{cn}{4} - \frac{c}{2} - an \geq 0 \Rightarrow n(c/4 - a) \geq c/2$$
To make the coefficient positive we need $c > 4a$, in which case
$$n \geq \frac{c/2}{c/4-a} = \frac{2c}{c-4a}$$
So as long as we assume $T(n) = O(1)$ for $n < 2c/(c-4a)$, we obtain $T(n) = O(n)$.

[Algo] Exercise 9.1

Exercise 9.1-1
Show that the second smallest of n elements can be found with $n + \lceil\lg n\rceil - 2$ comparisons in the worst case. (Hint: Also find the smallest element.)
solution:
Imagine that each comparison is a match; the minimum is the final champion. As we already know, it takes at least $n - 1$ comparisons to determine the champion. The second smallest must have been compared with the minimum and defeated by it, so to find the second smallest we can trace every match the minimum played: gather all the items the minimum defeated and pick the smallest among them; that one is the very second smallest. If the matches are arranged as a balanced knockout tournament, the minimum plays $\lceil \lg n\rceil$ matches, so there are at most $\lceil \lg n\rceil$ candidates, and finding the smallest of them requires another $\lceil \lg n\rceil - 1$ comparisons. Therefore, there are $n - 1 + \lceil \lg n\rceil - 1 = n + \lceil \lg n\rceil - 2$ comparisons.


Exercise 9.1-2
Show that $\lceil 3n/2 \rceil - 2$ comparisons are necessary in the worst case to find both the maximum and minimum of n numbers. (Hint: Consider how many numbers are potentially either the maximum or minimum, and investigate how a comparison affects these counts.)
solution:
Initially, all n elements are candidates for the maximum and all n are candidates for the minimum. If $a_i$ and $a_j$ are both still in the candidate sets of the maximum and the minimum simultaneously, then after $a_i$ is compared with $a_j$, the larger leaves the minimum candidates and the smaller leaves the maximum candidates, so the cardinality of each candidate set decreases by 1. If $a_i$ and $a_j$ both belong only to the candidate set of the maximum (or only to that of the minimum), the comparison decreases the cardinality of that one set by 1.
(1) when n is even:
After $n/2$ comparisons that pair up the elements, the candidate sets for the maximum and the minimum each retain $n/2$ candidates. Next, $n/2 - 1$ comparisons determine the maximum and another $n/2 - 1$ comparisons determine the minimum. The total number of comparisons needed is $n/2 + 2(n/2 - 1) = 3n/2 - 2 = \lceil 3n/2 \rceil - 2$.
(2) when n is odd:
After $(n-1)/2$ comparisons, the candidate sets for the maximum and the minimum each retain $(n+1)/2$ candidates. Next, at most $(n+1)/2 - 1$ comparisons determine the maximum and at most another $(n+1)/2 - 1$ determine the minimum. The total number of comparisons needed is $3(n-1)/2 = 3n/2 - 3/2 = \lceil 3n/2 \rceil - 2$.

[Algo] 9.1 Minimum and maximum

Definitions
order statistic:
The ith order statistic of a set of n elements is the ith smallest element.
minimum:
The minimum of a set of elements is the first order statistic (i = 1).
maximum:
The maximum is the nth order statistic (i = n).
lower median:
$i = \lfloor (n + 1)/2 \rfloor$
For simplicity, we consistently use the phrase "the median" to refer to the lower median.
upper median:
$i = \lceil (n + 1)/2 \rceil$


How many comparisons must be made before we know which element is the minimum?
@{
\proc{Minimum}(A)
\li min = A[1]
\li \For i = 2 \;\textbf{to}\;\textit{length}[A]
\li \;\;\;\;\If min > A[i]
\li \;\;\;\;\;\;\;\;min = A[i]
\li \Return min
}@


Lower bound: n-1 comparisons
For any algorithm, determining the minimum is like running a tournament. Imagine each comparison as a match in which the smaller of the two wins; every match produces exactly one loser. Everyone except the minimum must have lost at least one match, so in total there are at least n-1 matches (every other element has to lose one), that is, at least n-1 comparisons.


Simultaneous minimum and maximum
Finding both the minimum and the maximum of an array with a $\Theta(n)$ algorithm is not hard. In fact, at most $3\lfloor n/2 \rfloor$ comparisons suffice.

@{
\proc{Simultaneous-Min-Max}(A)
\li \If \;\textit{length}[A]\;\textup{is odd}
\li \;\;\;\;max\;\textup{and}\;min\;\textup{are initialized to}\;A[1]
\li \Else
\li \;\;\;\;max\;\textup{and}\;min\;\textup{are initialized by comparing}\;A[1]\;\textup{and}\;A[2]
\li i = 2
\li \While i + 1 \leq \textit{length}[A]
\li \;\;\;\;M = \textup{the larger of}\;A[i]\;\textup{and}\;A[i+1]
\li \;\;\;\;m = \textup{the smaller of}\;A[i]\;\textup{and}\;A[i+1]
\li \;\;\;\;\If M > max
\li \;\;\;\;\;\;\;\;max = M
\li \;\;\;\;\If m < min
\li \;\;\;\;\;\;\;\;min = m
\li \;\;\;\;i = i + 2
\li \Return max \;\textup{and}\;min
}@



Let us analyze the comparison count of this algorithm.
(1) If n is odd, min and max are both initialized to A[1]; after that, every two elements are first compared against each other, the larger then compared with max and the smaller with min, so each pair A[i] and A[i+1] costs 3 comparisons, giving $3 \times (n-1)/2 = 3\lfloor n/2 \rfloor$ comparisons.
(2) If n is even, A[1] and A[2] are first compared once, and min and max are initialized from the result; each following pair is handled just as above, so each pair A[i] and A[i+1] again costs 3 comparisons, giving $3 \times (n-2)/2 + 1 = 3(n/2) - 2 \leq 3\lfloor n/2 \rfloor$ comparisons. (A Python transcription follows.)
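A Python transcription of the procedure above, as a sketch (each pair after the initialization costs 3 comparisons, so at most $3\lfloor n/2 \rfloor$ in total):

def min_and_max(a):
    n = len(a)
    if n % 2:                         # odd: both start at a[0]
        lo = hi = a[0]
        i = 1
    else:                             # even: one comparison initializes both
        lo, hi = (a[0], a[1]) if a[0] < a[1] else (a[1], a[0])
        i = 2
    while i < n - 1:                  # process the rest in pairs
        small, big = (a[i], a[i + 1]) if a[i] < a[i + 1] else (a[i + 1], a[i])
        if big > hi:
            hi = big
        if small < lo:
            lo = small
        i += 2
    return lo, hi

assert min_and_max([2, 9, 4, 7, 5, 1, 8]) == (1, 9)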

Friday, December 24, 2010

[Algo] Problems 8

Problem 8-1 Average-case lower bounds on comparison sorting
In this problem, we prove an $\Omega(n\lg n)$ lower bound on the expected running time of any deterministic or randomized comparison sort on $n$ distinct input elements. We begin by examining a deterministic comparison sort $A$ with decision tree $T_A$. We assume that every permutation of $A$'s inputs is equally likely.
a. Suppose that each leaf of $T_A$ is labeled with the probability that it is reached given a random input. Prove that exactly $n!$ leaves are labeled $1/n!$ and that the rest are labeled 0.
b. Let $D(T)$ denote the external path length of a decision tree $T$; that is, $D(T)$ is the sum of the depths of all the leaves of $T$. Let $T$ be a decision tree with $k > 1$ leaves, and let $LT$ and $RT$ be the left and right subtrees of $T$. Show that $D(T) = D(LT) + D(RT) + k$.
c. Let $d(k)$ be the minimum value of $D(T)$ over all decision trees $T$ with $k > 1$ leaves. Show that $d(k) = \min_{1 \leq i \leq k-1}\{d(i) + d(k-i) + k\}$. (Hint: Consider a decision tree $T$ with $k$ leaves that achieves the minimum. Let $i_0$ be the number of leaves in $LT$ and $k - i_0$ the number of leaves in $RT$.)
d. Prove that for a given value of $k > 1$ and $i$ in the range $1 \leq i \leq k-1$, the function $i\lg i + (k-i)\lg(k-i)$ is minimized at $i = k/2$. Conclude that $d(k) = \Omega(k\lg k)$.
e. Prove that $D(T_A) = \Omega(n!\lg(n!))$, and conclude that the expected time to sort $n$ elements is $\Omega(n\lg n)$.

Now consider a randomized comparison sort $B$. We can extend the decision-tree model to handle randomization by incorporating two kinds of nodes: ordinary comparison nodes and "randomization" nodes. A randomization node models a random choice of the form RANDOM(1, $r$) made by algorithm $B$; the node has $r$ children, each of which is equally likely to be chosen during an execution of the algorithm.
f. Show that for any randomized comparison sort $B$, there exists a deterministic comparison sort $A$ that makes no more comparisons on the average than $B$ does.
solution:
a. Since all elements are distinct, there are $n!$ permutations of an input of $n$ elements, each equally likely. $T_A$ must therefore have at least $n!$ reachable leaves, one per permutation, and each of these $n!$ leaves is reached with probability $1/n!$. The remaining leaves are labeled 0, since $1 - n! \times 1/n! = 0$ probability is left over.

b. We define depth($l$, $T$) to be the depth of leaf $l$ in tree $T$. For any leaf $l$ of $T$, if $l$ belongs to $RT$ then depth($l$, $T$) = depth($l$, $RT$) + 1, and the same holds for $l$ belonging to $LT$. Summing over all $k$ leaves, we can show
$$D(T) = \sum_{l \in LT}(\mbox{depth}(l, LT) + 1) + \sum_{l \in RT}(\mbox{depth}(l, RT) + 1) = D(LT) + D(RT) + k.$$

c. Consider a decision tree $T$ with $k$ leaves that achieves the minimum, so $d(k) = D(T)$. By part b., $D(T) = D(LT) + D(RT) + k$, where $LT$ and $RT$ are the left and right subtrees of $T$. We claim that if $T$ achieves the minimum, $LT$ has $i_0$ leaves, and $RT$ has $k - i_0$ leaves, then $D(LT) = d(i_0)$ and $D(RT) = d(k - i_0)$: otherwise we could replace $LT$ by a tree $LT'$ with $i_0$ leaves and $D(LT') = d(i_0)$ (and likewise replace $RT$), producing a decision tree $T'$ with $D(T') < D(T)$ and contradicting the minimality of $T$. Hence the minimum over all trees with $k$ leaves is found by minimizing over the number of leaves $i_0 = 1, 2, \cdots, k-1$ given to the left subtree, i.e., $d(k) = \min_{1 \leq i \leq k-1}\{d(i) + d(k - i) + k\}$.

d. First, we find the minimum. The function $f(i) = i\lg i + (k-i)\lg(k-i)$ has derivative
$$f'(i) = \lg i + \frac{1}{\ln 2} - \lg(k-i) - \frac{1}{\ln 2} = \lg\frac{i}{k-i},$$
which is negative for $i < k/2$, zero at $i = k/2$, and positive for $i > k/2$. Hence $f$ is minimized at $i = k/2$, where the minimum value is $f(k/2) = k\lg(k/2) = k\lg k - k$.
Next, we are going to prove that $d(k) = \Theta(k\lg k)$, which in particular gives $d(k) = \Omega(k\lg k)$.
Suppose that $d(k) \leq c_1k\lg k$ for some constant $c_1$. Taking the split $i = k/2$ in the recurrence from part c.,
$$d(k) \leq 2d(k/2) + k \leq 2c_1(k/2)\lg(k/2) + k = c_1k\lg k - (c_1 - 1)k.$$
So we can choose a $c_1 \geq 1$ such that $d(k) \leq c_1k\lg k$.
Suppose that $d(k) \geq c_2k\lg k$ for some constant $c_2$. By the recurrence, induction, and the minimization in part d.,
$$d(k) = \min_{1 \leq i \leq k-1}\{d(i) + d(k-i) + k\} \geq c_2\min_{1 \leq i \leq k-1}\{i\lg i + (k-i)\lg(k-i)\} + k = c_2k\lg k + (1 - c_2)k.$$
So we can choose a $c_2 \leq 1$ such that $d(k) \geq c_2k\lg k$. We conclude that $d(k) = \Theta(k \lg k)$.

e. Suppose that $T_A$ has $k$ leaves, where $k \geq n!$. By the definition of $d$ and part d., $D(T_A) \geq d(k) = \Omega(k\lg k) = \Omega(n!\lg(n!))$. The expected running time is the average depth of the leaf reached over the $n!$ equally likely inputs, so $E[T_A(n)] \geq D(T_A)/n! = \Omega(n!\lg(n!))/n! = \Omega(\lg(n!)) = \Omega(n\lg n)$.

f. We modify the randomized decision tree to define a deterministic decision tree. At each randomized node, pick the child with the smallest subtree. Delete all the other children and splice out the randomized node itself. The deterministic algorithm corresponding to this modified tree still works, because the randomized algorithm worked no matter which path was taken from each randomized node. The average number of comparisons for the modified algorithm is no larger than the average number for the original randomized tree, since we discarded the high-average subtrees in each case.


Problem 8-2 Sorting in place in linear time
Suppose that we have an array of $n$ data records to sort and that the key of each record has the  value 0 or 1. An algorithm for sorting such a set of records might possess some subset of the following three desirable characteristics:
1. The algorithm runs in $O(n)$ time.
2. The algorithm is stable.
3. The algorithm sorts in place, using no more than a constant amount of storage space in addition to the original array.
a. Give an algorithm that satisfies criteria 1 and 2 above.
b. Give an algorithm that satisfies criteria 1 and 3 above.
c. Give an algorithm that satisfies criteria 2 and 3 above.
d. Can any of your sorting algorithms from parts (a)-(c) be used to sort $n$ records with b-bit keys using radix sort in $O(bn)$ time? Explain how or why not.
e. Suppose that the $n$ records have keys in the range from 1 to $k$. Show how to modify counting sort so that the records can be sorted in place in $O(n+k)$ time. You may use $O(k)$ storage outside the input array. Is your algorithm stable? (Hint: How would you do it for $k = 3$?)
solution:
a. counting sort
b. partition
c. insertion sort
d. Part (a), counting sort, can: it runs in $O(n)$ time, and using it as the stable intermediate sort of radix sort on b-bit keys, one bit per pass (so each digit takes only the values 0 and 1), sorts in $O(b(n + 2)) = O(bn)$ time. Parts (b) and (c) cannot: radix sort needs its intermediate sort to be stable, which rules out the partition-based sort of (b), and insertion sort in (c) would cost $O(n^2)$ per pass.
e. We propose a modified counting sort algorithm that sorts in place in $O(n+k)$ time. The algorithm requires two auxiliary arrays of length k, within the allowed $O(k)$ extra storage. It is not stable, however: the cycle-chasing moves below can reorder records with equal keys.
@{
\proc{Counting-Sort}^{'}(A, k)
\li \textup{initialize}\;C1[0..k] = 0
\li \For j = 1 \;\textbf{to}\;\textit{length}[A]
\li \;\;\;\;C1[A[j]] = C1[A[j]] + 1
\li \For i = 1 \;\textbf{to}\;k
\li \;\;\;\;C1[i] = C1[i] + C1[i - 1]
\li \For i = 0\;\textbf{to}\;k
\li \;\;\;\;C2[i] = C1[i]
\li \For i = \textit{length}[A]\;\textbf{downto}\; 1
\li \;\;\;\;\If i > C2[A[i]]
\li \;\;\;\;\;\;\;\;a = A[i]
\li \;\;\;\;\;\;\;\;\textbf{do}
\li \;\;\;\;\;\;\;\;\;\;\;\;j = C1[a]
\li \;\;\;\;\;\;\;\;\;\;\;\;b = A[j]
\li \;\;\;\;\;\;\;\;\;\;\;\;A[j] = a
\li \;\;\;\;\;\;\;\;\;\;\;\;C1[a] = C1[a] - 1
\li \;\;\;\;\;\;\;\;\;\;\;\;a = b
\li \;\;\;\;\;\;\;\;\While C1[a] \not= i
\li \;\;\;\;\;\;\;\;A[i] = a
}@

C1 acts just like the array C in the original counting sort algorithm. C2 is the additional array that keeps the important information: C2[i] is the largest index that a record with key i may occupy. As we scan A from back to front, if we find that $i > C2[A[i]]$, then A[i] sits beyond the last slot allowed for its key and must be in a wrong position. We then move it to the correct position given by C1[A[i]], following the displaced elements around the cycle until the hole at index i is filled with an element that belongs there. Once an element has been put into a correct position, it is never touched again. Thus, the total running time is $O(n + k)$, where the $O(k)$ term is the time to set up the arrays C1 and C2.


Problem 8-3 Sorting variable-length items
a. You are given an array of integers, where different integers may have different numbers of digits, but the total number of digits over all the integers in the array is $n$. Show how to sort the array in $O(n)$ time.
b. You are given an array of strings, where different strings may have different numbers of characters, but the total number of characters over all the string is $n$. Show how to sort the strings in $O(n)$ time. (Note that the desired order here is the standard alphabetical order; for example, a < ab < b.)
solution:
a. Without loss of generality, we assume that all integers are positive. A number with more digits is always greater than a number with fewer digits, so we can group the numbers by their digit count. Let $m_i$ be the number of integers with i digits; then $\sum_{i} i \cdot m_i = n$. Scanning every number to find its digit count takes $O(n)$ time. Next, sort each group with radix sort: the group of i-digit numbers costs $O(i \cdot m_i)$, so all groups together cost $O(\sum_i i\cdot m_i) = O(n)$. Concatenating the groups in order of length gives the sorted array, for $O(n)$ total time.

b. When a string x is compared with a string y, if the first character of x is greater than the first character of y, then x is lexicographically greater than y. So we sort the strings on their first character with counting sort and gather the strings sharing a first character into groups; strings that are empty (or have run out of characters) are taken out and placed first. Sorting then proceeds recursively within each group after the first character is removed. When the recursion finishes, all strings are sorted.
Now for the running time of this sorting process. A string of $l$ characters takes part in at most $l$ counting-sort passes, and each pass is linear in the number of strings it touches: if $t_i$ strings participate in the pass at depth i, the total cost of all the counting-sort calls is $\sum_i O(t_i) = O(\sum_i t_i)$, which is bounded by the total number of characters over all strings. Thus, the overall running time is $O(n)$.


Problem 8-4 Water jugs
Suppose that you are given n red and n blue water jugs, all of different shapes and sizes. All red jugs hold different amounts of water, as do the blue ones. Moreover, for every red jug, there is a blue jug that holds the same amount of water, and vice versa.
It is your task to find a grouping of the jugs into pair of red and blue jugs that hold the same amount of water. To do so, you may perform the following operation: pick a pair of jugs in which one is red and one is blue, fill the red jug with water, and then pour the water into the blue jug. This operation will tell you whether the red or the blue jug can hold more water, or if they are of the same volume.  Assume that such a comparison takes one time unit. Your goal is to find an algorithm that makes a minimum number of comparisons to determine the grouping. Remember that you may not directly compare two red jugs or two blue jugs.
a. Describe a deterministic algorithm that uses $\Theta(n^2)$ comparisons to group the jugs into pairs.
b. Prove a lower bound of $\Omega(n\lg n)$ for the number of comparisons an algorithm solving this problem must take.
c. Give a randomized algorithm whose expected number of comparisons is $O(n\lg n)$, and prove that this bound is correct. What is the worst-case number of comparisons for your algorithm?
solution:
a. For each red jug, compare it with blue jugs one by one until its partner is found. This is a doubly nested loop, so the algorithm uses $\Theta(n^2)$ comparisons in the worst case.

b. We can view this problem in a decision tree model. Each node is labeled with two jugs (one red, one blue) and has three children, corresponding to the red jug being smaller, of equal size, or larger; each leaf is labeled with a unique matching. Since there are $n!$ possible matchings, if h is the height of the decision tree we obtain $3^h \geq n! \Rightarrow h \lg 3 \geq \lg(n!) \Rightarrow h = \Omega(n\lg n)$.

c. We propose a randomized algorithm whose expected number of comparison is $O(n\lg n)$.
@{
\proc{Match-Jugs}(R,B)
\li \If |R|=0
\li \;\;\;\;\Return
\li \If |R|=1
\li \;\;\;\;\textup{let}\;R=\{r\},B=\{b\}
\li \;\;\;\;\textup{output "}(r,b)\textup{"}
\li \;\;\;\;\Return
\li \Else
\li \;\;\;\;r=\;\textup{a randomly chosen jug in}\;R
\li \;\;\;\;B_{<}=\{b | b \in B, \;\textup{and}\;b < r\}
\li \;\;\;\;B_{>}=\{b | b \in B, \;\textup{and}\;b > r\}
\li \;\;\;\;b = \;\textup{the one jug in}\;B\;\textup{with the same size as}\;r
\li \;\;\;\;R_{<} = \{r | r \in R, \;\textup{and}\;r < b\}
\li \;\;\;\;R_{>} = \{r | r \in R, \;\textup{and}\;r > b\}
\li \;\;\;\;\textup{output "}(r,b)\textup{"}
\li \;\;\;\;\proc{Match-Jugs}(R_{<}, B_{<})
\li \;\;\;\;\proc{Match-Jugs}(R_{>}, B_{>})
}@


What about the running time? Let us order the jugs as $r_1, r_2, \ldots, r_n$ and $b_1, b_2, \ldots, b_n$, where $r_i < r_{i+1}$ and $b_i < b_{i+1}$ for $i = 1, 2, \ldots, n - 1$, and $r_i = b_i$. Our analysis uses indicator random variables
$X_{ij} = \mbox{I}\{\mbox{red jug } r_i \mbox{ is compared to blue jug } b_j\}$.
As in quicksort, a given pair $r_i$ and $b_j$ is compared at most once. When we compare $r_i$ to every jug in B, jug $r_i$ will not be put in either $R_<$ or $R_>$. When we compare $b_i$ to every jug in $R-\{r_i\}$, jug $b_i$ is not put into either $B_<$ or $B_>$. The total number of comparison is
$$X = \sum_{i = 1}^{n-1}\sum_{j = i + 1}^{n}X_{ij}.$$
To calculate the expected value of X, we follow the quicksort analysis to arrive at
$$E[X] = \sum_{i = 1}^{n-1}\sum_{j = i + 1}^{n} \mbox{Pr}\{r_i \mbox{ is compared to } b_j\}.$$
As in the quicksort analysis, once we choose a jug $r_k$ such that $r_i < r_k < b_j$, we will put $r_i$ in $R_<$ and $b_j$ in $B_>$, and so $r_i$ and $b_j$ will never be compared again. Let us denote $R_{ij} = \{r_i, \ldots, r_j\}$. Then jugs $r_i$ and $b_j$ will be compared if and only if the first jug in $R_{ij}$ to be chosen is either $r_i$ or $r_j$.
Still following the quicksort analysis, until a jug from $R_{ij}$ is chosen, the entire set $R_{ij}$ is together. Any jug in $R_{ij}$ is equally likely to be first one chosen. Since $|R_{ij}| = j - i + 1$, the probability of any given jug being the first one chosen in $R_{ij}$ is $1 / (j - i + 1)$. The remainder of the analysis is the same as the quicksort analysis, and we arrive at the solution of $O(n\lg n)$ comparisons.
Just like in quicksort, in the worst case we always choose the largest (or smallest) jug to partition the sets, which reduces the set sizes by only 1. The running time then obeys the recurrence $T(n) = T(n - 1) + \Theta(n)$, and the number of comparisons we make in the worst case is $T(n) = \Theta(n^2)$.


Problem 8-5 Average sorting
Suppose that, instead of sorting an array, we just require that the elements increase on average. More precisely, we call an n-element array A k-sorted if, for all $i = 1, 2, \ldots, n-k$, the following holds:
$$\frac{\sum_{j = i}^{i + k - 1}A[j]}{k} \leq \frac{\sum_{j = i + 1}^{i + k}A[j]}{k}.$$
a. What does it mean for an array to be 1-sorted?
b. Give a permutation of the numbers 1, 2, ..., 10 that is 2-sorted, but not sorted.
c. Prove that an n-element array is k-sorted if and only if $A[i] \leq A[i + k]$ for all $i = 1, 2, \ldots, n-k$.
d. Give an algorithm that k-sorts an n-element array in $O(n\lg(n/k))$ time.

We can also show a lower bound on the time to produce a k-sorted array, when k is a constant.
e. Show that a k-sorted array of length n can be sorted in $O(n\lg k)$ time. (Hint: Use the solution to Exercise 6.5-8.)
f. Show that when k is a constant, it requires $\Omega(n\lg n)$ time to k-sort an n-element array. (Hint: Use the solution to the previous part along with the lower bound on comparison sorts.)
solution:
a. By definition (part c. with $k = 1$), $A[i] \leq A[i+1]$ for $i = 1, 2, \ldots, n-1$; that is, a 1-sorted array is simply a sorted array.

b. [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]

c. By definition, an array A is k-sorted if, for all $i = 1, 2, \ldots, n-k$,
$$\frac{\sum_{j = i}^{i + k - 1}A[j]}{k} \leq \frac{\sum_{j = i + 1}^{i + k}A[j]}{k} \Leftrightarrow \sum_{j = i}^{i + k - 1}A[j] \leq \sum_{j = i + 1}^{i + k}A[j].$$
The two sums share the terms $A[i+1], \ldots, A[i+k-1]$; cancelling them leaves exactly $A[i] \leq A[i+k]$. Hence an n-element array is k-sorted if and only if $A[i] \leq A[i+k]$ for all $i = 1, 2, \ldots, n-k$.

d. We can modify quicksort to k-sort an array: change line 1 of QUICKSORT to "if p + (k - 1) < r", so that the recursion stops splitting once a subarray has length at most k, and choose each pivot as the median of its subarray via SELECT (as in Exercise 9.3-3) so that every split is balanced. Because PARTITION puts every element of the left-hand side below every element of the right-hand side, once all unsorted runs have length at most k, the condition $A[i] \leq A[i+k]$ from part c. holds for every i. The recursion tree has height $\Theta(\lg(n/k))$ with $O(n)$ work per level, so the total running time is $O(n\lg(n/k))$.

e. Without loss of generality, we assume that $n = km$. By part c., the elements in positions $i, i+k, i+2k, \ldots$ form a sorted list for each $i = 1, 2, \ldots, k$; that is, a k-sorted array consists of k interleaved sorted lists, each of length $n/k$. By Exercise 6.5-8, we can merge these k sorted lists into one sorted list in $O(n\lg k)$ time.

f. Consider a decision-tree model of an algorithm that k-sorts an array. Its internal nodes are comparisons as usual, but each leaf applies some fixed rearrangement to the input. A rearranged output is k-sorted exactly when, for each residue class of positions mod k, the values placed there appear in increasing order; so the number of k-sorted arrangements of n distinct values is the number of ways to deal the values into k chains of $n/k$ positions each,
$$C_{n/k}^n\times C_{n/k}^{n - n/k}\times\cdots\times C_{n/k}^{n/k} = \frac{n!}{((n/k)!)^k}.$$
A fixed rearrangement therefore k-sorts at most $n!/((n/k)!)^k$ of the $n!$ equally likely input permutations, so a correct decision tree of height h must satisfy
$$2^h \geq \frac{n!}{n!/((n/k)!)^k} = ((n/k)!)^k \geq \left(\frac{n}{ek}\right)^n,$$
using $m! \geq (m/e)^m$. Since k is a constant, we can obtain that
$$h \geq n\lg\frac{n}{c} = \Omega(n\lg n),$$
where $$c = ek$$ is a constant.
Therefore, it requires $\Omega(n\lg n)$ time to k-sort an n-element array.
Therefore,  it requires $\Omega(n\lg n)$ time to k-sort an n-element array.


Problem 8-6  Lower bound on merging sorted lists
The problem of merging two sorted lists arises frequently. It is used as a subroutine of MERGE-SORT, and the procedure to merge two sorted lists is given as MERGE in section 2.3.1. In this problem, we will show that there is a lower bound of $2n-1$ on the worst-case number of comparisons required to merge two sorted lists, each containing n items.
First we will show a lower bound of $2n - o(n)$ comparisons by using a decision tree.
a. Show that, given 2n numbers, there are $C^{2n}_n$ possible ways to divide them into two sorted lists, each with n numbers.
b. Using a decision tree, show that any algorithm that correctly merges two sorted lists uses at least $2n - o(n)$ comparisons.
Now we will show a slightly tighter $2n - 1$ bound.
c. Show that if two elements are consecutive in the sorted order and from opposite lists, then they must be compared.
d. Use your answer to the previous part to show a lower bound of $2n - 1$ comparisons for merging two sorted lists.
solution:
a. Choosing which n of the 2n numbers make up the first list determines both lists, since each list must then appear in its sorted order; hence there are $C^{2n}_n$ possible ways.
b. Let the height of the decision tree be h. The decision tree contains at least $C^{2n}_n$ reachable leaves, so $2^h \geq C^{2n}_n$.
By Stirling's formula, we know that
$$n! = \sqrt{2\pi n}(\frac{n}{e})^ne^{\alpha_n},$$
where
$$\frac{1}{12n+1} < \alpha_n < \frac{1}{12n}.$$
Then,
$$C^{2n}_n = \frac{(2n)!}{(n!)^2} = \frac{\sqrt{4\pi n}(2n/e)^{2n}e^{\alpha_{2n}}}{2\pi n(n/e)^{2n}e^{2\alpha_n}} = \frac{4^n}{\sqrt{\pi n}}e^{\alpha_{2n} - 2\alpha_n},$$
and therefore
$$h \geq \lg C^{2n}_n = 2n - \frac{1}{2}\lg(\pi n) + (\alpha_{2n} - 2\alpha_n)\lg e = 2n - o(n).$$

c. Let the two sorted arrays be A and B, with elements $a_1, a_2, \ldots, a_n$ and $b_1, b_2, \ldots, b_n$ respectively. Suppose that $a_i$ and $b_j$ are adjacent in the final sorted list:
$\cdots, a_{i-1}, a_i, a_{i+1}, \cdots$
$\cdots, b_{j-1}, b_j, b_{j+1}, \cdots$
If $b_j$ is compared with $a_k$ for some $k = 1, \ldots, i-1$, the result must be $a_k \leq b_j$, but this cannot determine whether $b_j$ is greater than $a_i$ or not. Similarly, if $b_j$ is compared with $a_k$ for $k = i+1, \ldots, n$, we still cannot tell whether $b_j$ is greater than $a_i$. Therefore, the only way to figure out the order of $b_j$ and $a_i$ in the final sorted list is to compare $b_j$ with $a_i$ directly.

d. We exhibit a worst case that meets the bound. Let $a_1, a_2, \ldots, a_{2n}$ be a sorted list, and let $A = \langle a_1, a_3, \ldots, a_{2n-1}\rangle$, $B = \langle a_2, a_4, \ldots, a_{2n}\rangle$. Every pair of consecutive elements $a_i, a_{i+1}$ then comes from opposite lists, so by part c. each such pair must be compared directly during merging. There are $2n-1$ such pairs, so at least $2n-1$ comparisons are necessary.

Monday, December 6, 2010

[Algo] Exercise 8.4

Exercise 8.4-2
What is the worst-case running time for the bucket-sort algorithm? What simple change to the algorithm preserves its linear expected running time and makes its worst-case running time $O(n \lg n)$?
solution:
The worst case is that all elements fall into the same bucket; the insertion sort on that bucket then requires $O(n^2)$ running time. A simple change is to replace insertion sort with a sorting algorithm whose worst case is $O(n \lg n)$, such as merge sort or heapsort (quicksort would not do, since its worst case is also $O(n^2)$); this preserves the linear expected running time and improves the worst-case running time from $O(n^2)$ to $O(n \lg n)$.


Exercise 8.4-3
Let $X$ be a random variable that is equal to the number of heads in two flips of a fair coin. What is $E[X^2]$? What is $E^2[X]$?
solution:
$E[X^2] = 0^2 \times 1/4 + 1^2 \times 1/2 + 2^2 \times 1/4 = 1/2 + 1 = 3/2$
$E^2[X] = (0 \times 1/4 + 1 \times 1/2 + 2 \times 1/4)^2 = (1/2 + 1/2)^2 = 1$


Exercise 8.4-4
We are given n points in the unit circle, $p_i = (x_i, y_i)$, such that $0 < x_i^2 + y_i^2 \leq 1$ for $i = 1, 2, \ldots, n$. Suppose that the points are uniformly distributed; that is, the probability of finding a point in any region of the circle is proportional to the area of that region. Design a $\Theta(n)$ expected-time algorithm to sort the n points by their distances $d = \sqrt{x_i^2 + y_i^2}$ from the origin. (Hint: Design the bucket sizes in BUCKET-SORT to reflect the uniform distribution of the points in the unit circle.)
solution:
Note that the expected number of points residing in a region is proportional to the area of that region. To satisfy the uniformity assumption of BUCKET-SORT, we must split the range of distances $(0, 1]$ in a proper way. The points whose distance d lies between a and b occupy an annulus of area $\pi(b^2 - a^2)$, so dividing $(0, 1]$ into n equal-length intervals does not help: equal-length distance intervals do not divide the unit circle into n regions of equal area. Instead, divide $(0, 1]$ at the points $\sqrt{1/n}, \sqrt{2/n}, \ldots, \sqrt{(n-1)/n}$, so that bucket i receives the points with distance in $(\sqrt{(i-1)/n}, \sqrt{i/n}]$. Under this division every bucket covers the same area, $\pi\cdot\frac{i}{n} - \pi\cdot\frac{i-1}{n} = \pi/n$, so we expect each bucket to contain roughly the same number of points, and BUCKET-SORT runs in $\Theta(n)$ expected time.
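A Python sketch of this bucket design (the index formula follows from $d \in (\sqrt{i/n}, \sqrt{(i+1)/n}]$ $\Leftrightarrow$ $i = \lceil d^2 n\rceil - 1$; boundary handling for points exactly on a bucket edge is approximate):

import math

def sort_points_by_distance(points):
    # Bucket sort points in the unit disk by distance from the origin;
    # equal-area buckets keep the expected bucket sizes equal.
    n = len(points)
    buckets = [[] for _ in range(n)]
    for (x, y) in points:
        d = math.hypot(x, y)               # distance from the origin
        i = min(n - 1, int(d * d * n))     # equal-area bucket index
        buckets[i].append((d, (x, y)))
    out = []
    for b in buckets:
        b.sort()                           # expected O(1) points per bucket
        out.extend(p for (_, p) in b)
    return out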


Exercise 8.4-5
A probability distribution function $P(x)$ for a random variable $X$ is defined by $P(x) = \mbox{Pr}\{X \leq x\}$. Suppose that a list of $n$ random variables $X_1, X_2, \ldots, X_n$ is drawn from a continuous probability function $P$ that is computable in $O(1)$ time. Show how to sort these numbers in linear expected time.
solution:
$P(x)$ is defined in the manner of a cumulative distribution function. By the probability integral transform, each $P(X_i)$, $i = 1, 2, \ldots, n$, is uniformly distributed on $[0, 1]$. Since $P$ is computable in $O(1)$ time, we can transform all n numbers in $O(n)$ time and sort the transformed values with BUCKET-SORT in linear expected time; because $P$ is nondecreasing, the sorted order of the $P(X_i)$ is also the sorted order of the $X_i$.

Monday, September 27, 2010

[Algo] 8.4 Bucket sort

Basic idea
Suppose that what we are sorting are numbers in $[0,1)$, and assume these numbers are uniformly distributed. Bucket sort divides $[0, 1)$ into n intervals of equal length, which can be viewed as n equal-sized buckets; each input number is dropped into its corresponding bucket, and finally the numbers inside each bucket are sorted, completing the whole algorithm. Because we assume the input is uniformly distributed, we expect every bucket to hold about the same number of elements.

Algorithm
@{
\proc{Bucket-Sort}(A)
\li n = \textit{length}[A]
\li \For i = 1 \;\textbf{to}\; n
\li \;\;\;\;\textup{insert A[i] into list}\;B[\lfloor nA[i] \rfloor]
\li \For i = 0 \;\textbf{to}\; n - 1
\li \;\;\;\;\textup{sort list}\;B[i]\;\textup{with insertion sort}
\li \textup{concatenate the lists}\;B[0], B[1], \ldots, B[n - 1]\;\textup{together in order}
}@
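A direct Python transcription, as a sketch (Python's built-in sort stands in for the insertion sort on each bucket):

import random

def bucket_sort(a):
    # Sort numbers in [0, 1), assumed roughly uniform; expected O(n).
    n = len(a)
    buckets = [[] for _ in range(n)]
    for x in a:
        buckets[int(n * x)].append(x)   # bucket i covers [i/n, (i+1)/n)
    for b in buckets:
        b.sort()                        # expected O(1) elements per bucket
    return [x for b in buckets for x in b]

data = [random.random() for _ in range(1000)]
assert bucket_sort(data) == sorted(data)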


Running time
The total running time is $$T(n) = \Theta(n) + \sum_{i = 0}^{n-1}O({n_i}^2),$$ where $n_i$ is the number of elements that land in bucket i. Taking expectations on both sides, we get: $$\mbox{E}[T(n)] = \mbox{E}[\Theta(n) + \sum_{i = 0}^{n - 1}O({n_i}^2)] = \Theta(n) + \sum_{i = 0}^{n - 1}\mbox{E}[O({n_i}^2)] = \Theta(n) + \sum_{i = 0}^{n - 1}O(\mbox{E}[{n_i}^2])$$
We now concentrate on computing $$\mbox{E}[{n_i}^2]$$. First define an indicator random variable $$X_{ij} = \mbox{I}\{A[j]~\mbox{falls in bucket}~i\}$$, for $i = 0, 1, \ldots , n - 1$ and $j = 1, 2, \ldots , n$, so that clearly $$n_i = \sum_{j = 1}^{n}X_{ij}$$. Expanding the square of the sum:
$$\mbox{E}[{n_i}^2] = \mbox{E}\Big[\Big(\sum_{j = 1}^{n}X_{ij}\Big)^2\Big] = \mbox{E}\Big[\sum_{j = 1}^{n}\sum_{k = 1}^{n}X_{ij}X_{ik}\Big] = \sum_{j = 1}^{n}\mbox{E}[{X_{ij}}^2] + \sum_{1 \leq j \leq n}\sum_{1 \leq k \leq n, k\neq j}\mbox{E}[X_{ij}X_{ik}]$$
where $$\mbox{E}[{X_{ij}}^2] = 1 \cdot \frac{1}{n} + 0 \cdot (1 - \frac{1}{n}) = \frac{1}{n}$$
and, when $k \neq j$, since $X_{ij}$ and $X_{ik}$ are independent, $$ \mbox{E}[X_{ij}X_{ik}] = \mbox{E}[X_{ij}]\mbox{E}[X_{ik}] = \frac{1}{n}\cdot\frac{1}{n} = \frac{1}{n^2}$$
Now we can piece the answer for $\mbox{E}[{n_i}^2]$ together.
$$\mbox{E}[{n_i}^2] = \sum_{j = 1}^n \frac{1}{n} +\sum_{1 \leq j \leq n}\sum_{1 \leq k \leq n, k\neq j} \frac{1}{n^2} = n \cdot \frac{1}{n} + n(n - 1) \cdot \frac{1}{n^2} = 2 - \frac{1}{n}$$
Finally, we conclude that
$$\mbox{E}[T(n)] = \Theta(n) + \sum_{i = 0}^{n - 1}O(2 - \frac{1}{n}) = \Theta(n) + O(n) = \Theta(n),$$
so the expected running time of the whole bucket sort is linear.

Sunday, September 19, 2010

[Algo] Exercise 8.3

Exercise 8.3-2
Which of the following sorting algorithm are stable: insertion sort, merge sort, heapsort, and quicksort? Give a simple scheme that makes any sorting algorithm stable. How much additional time and space does your scheme entail?
solution:
(1) insertion sort, merge sort are stable.
(2) We store with each element its original index, and compare the indices whenever there is a tie on the keys. The asymptotic running time is unaffected, since only constant extra work is added per comparison. The additional space requirement is $\lceil\lg n\rceil$ bits per element to store the index, i.e., $O(n\lg n)$ bits of additional space in total.

Exercise 8.3-3
Use induction to prove that radix sort works. Where does your proof need the assumption that the intermediate sort is stable?
solution:
Basis: If $d = 1$, sorting on that single digit sorts the array correctly.
Inductive step: Assume that RADIX-SORT sorts correctly on the low-order $d-1$ digits. Consider two elements $a$ and $b$, with dth (most significant) digits $a_d$ and $b_d$ respectively.
(1) $a_d > b_d$ or $a_d < b_d$ : the pass on digit d puts a and b in the correct order, because the most significant digit dominates regardless of the lower $d-1$ digits.
(2) $a_d = b_d$ : the pass on digit d leaves $a$ and $b$ in the same order because the intermediate sort is stable, and that order is correct since, by the inductive hypothesis, the lower $d-1$ digits are already sorted. This is exactly where the proof needs the assumption that the intermediate sort is stable.

Exercise 8.3-4
Show how to sort n integers in the range 0 to $n^2 - 1$ in $O(n)$ time.
solution:
Each key lies in the range 0 to $n^2 - 1$, so in the terms of Lemma 8.4 we have $k = n^2$ and $b = \lceil \lg n^2 \rceil = \lceil 2\lg n\rceil$ bits per key. Choosing $r = \lfloor \lg n\rfloor$, RADIX-SORT sorts these numbers in $O((2\lg n / \lg n)(n + 2^{\lg n})) = O(2(n + n)) = O(n)$ running time.

Exercise 8.3-5
In the first card-sorting algorithm in this section, exactly how many sorting passes are needed to sort d-digit decimal numbers in the worst case? How many piles of cards would an operator need to keep track of in the worst case?
solution:
In the worst case there are $$1 + 10 + 10^2 + \cdots +10^{d-1} = \frac{10^d - 1}{10 - 1} = \frac{10^d - 1}{9}$$ sorting passes: one for the whole deck, then one for each nonempty pile produced at each level of the recursion.
Each time we recursively sort the numbers in one bin, the operator must set aside the other 9 bins of the current digit; since the recursion runs d digits deep, the operator has to keep track of $9d$ piles of cards in the worst case.

[Algo] 8.3 Radix sort

How radix sort works
Assume every element of the input array has d digits. The whole spirit of radix sort is to sort digit by digit from the least significant toward the most significant, using a sorting algorithm with the stable property (counting sort) on each digit; after d passes, the algorithm is done. Note that it does not start from the most significant digit!

Radix sort algorithm
@{
\proc{Radix-Sort}(A, d)
\li \For i = 1 \;\textbf{to}\; d
\li \;\;\;\;\textup{use a stable sort to sort array}\; A \;\textup{on digit}\; i
}@
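A runnable Python version, as a sketch, with one stable counting-sort pass per digit (digit count and base are parameters):

def radix_sort(a, d, base=10):
    # Sort non-negative integers having at most d base-`base` digits.
    for i in range(d):                      # least significant digit first
        digit = lambda x: (x // base ** i) % base
        count = [0] * base
        for x in a:                         # histogram of digit i
            count[digit(x)] += 1
        for v in range(1, base):            # prefix sums: final positions
            count[v] += count[v - 1]
        out = [0] * len(a)
        for x in reversed(a):               # backwards scan keeps it stable
            count[digit(x)] -= 1
            out[count[digit(x)]] = x
        a = out
    return a

assert radix_sort([329, 457, 657, 839, 436, 720, 355], 3) == [329, 355, 436, 457, 657, 720, 839]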


Lemma 8.3
Given $n$ $d$-digit numbers in which each digit can take on up to $k$ possible values, RADIX-SORT correctly sorts these numbers in $\Theta(d(n + k))$ time.
proof: When the range k of each digit's values is not too large, COUNTING-SORT is a good choice. In each pass we spend $\Theta(n + k)$ time sorting the $n$ numbers on one digit. There are $d$ passes in total, so the total running time needed is $\Theta(d(n + k))$. $\blacksquare$

When $d$ is a constant and $k = O(n)$, RADIX-SORT is a linear-time sorting algorithm.

Lemma 8.4
Given $n$ $b$-bit numbers and any positive integer $r \leq b$, RADIX-SORT correctly sorts these numbers in $\Theta((b/r)(n + 2^r))$ time.
proof: In RADIX-SORT, each stable-sorting pass targets a single digit, whose value range is just 0 and 1 for bits. But we can also treat $r$ bits as one digit per pass, so that the value range of each pass becomes $0$ to $2^r - 1$; then only $\lceil b/r \rceil$ passes are needed.
The whole running time is thus $\Theta(d(n + k)) = \Theta(\lceil b/r \rceil(n + 2^r - 1)) = \Theta((b/r)(n + 2^r))$. $\blacksquare$

How to choose an appropriate r
(1) $b < \lfloor \lg n \rfloor$: here $n + 2^r = \Theta(n)$ for any $r \leq b$, so setting $r = b$ gives a running time of $(b/b)(n + 2^b) = \Theta(n)$, which is asymptotically optimal.
(2) $b \geq \lfloor \lg n \rfloor$: choose $r = \lfloor \lg n \rfloor$; the whole running time is then $\Theta((b/\lg n)(n + n)) = \Theta(bn/ \lg n)$.

[Algo] Exercise 8.2

Exercise 8.2-2
Prove that COUNTING-SORT is stable.
solution: see solution 8.2-3.

Exercise 8.2-3
Suppose that the for loop header in line 9 of the COUNTING-SORT procedure is rewritten as
9  for j = 1 to length[A]
Show that the algorithm still works properly. Is the modified algorithm stable?
solution:
The modified algorithm still works properly, but it is not stable. Correctness is not affected whether j runs up from 1 or down from length[A], because C[0..k] is already well established. But when the index j runs from 1 to length[A], of two elements with equal keys, the one taken from A earlier is put into B at the later index (since C[A[j]] counts downward); this reverses the order of equal keys and violates the stable property.

Exercise 8.2-4
Describe an algorithm that, given $n$ integers in the range 0 to $k$, preprocesses its input and then answers any query about how many of the n integers fall into a range [a..b] in $O(1)$ time. Your algorithm should use $\Theta(n + k)$ preprocessing time.
solution:
We run the preprocessing part of the COUNTING-SORT algorithm (its lines 1-7) to build the array C[0..k], in which C[i] is the number of input integers less than or equal to i; this takes $\Theta(n + k)$ preprocessing time. The answer to a query for the range [a..b] is then determined by $C[b] - C[a-1]$ in $O(1)$ running time. (We set $C[-1] = 0$.)
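A Python sketch of the preprocessing and query (function names are mine):

def make_range_counter(a, k):
    # Theta(n + k) preprocessing for O(1) queries "how many fall in [lo..hi]?"
    c = [0] * (k + 1)
    for x in a:
        c[x] += 1
    for i in range(1, k + 1):
        c[i] += c[i - 1]                  # c[i] = number of elements <= i
    def query(lo, hi):
        return c[hi] - (c[lo - 1] if lo > 0 else 0)
    return query

q = make_range_counter([2, 5, 3, 0, 2, 3, 0, 3], k=5)
assert q(2, 3) == 5                       # the elements 2, 2, 3, 3, 3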

[Algo] 8.2 Counting sort

Basic assumption of counting sort
Each input element is an integer in the range 0 to k.

How counting sort works
For each input element $x$, we determine how many elements are smaller than $x$; once that is known, we know the rank of $x$, and we can place it directly at its final position.

Algorithm
@{
\proc{Counting-Sort}(A, B, k)
\li \For i = 0 \;\textbf{to}\; k
\li \;\;\;\;C[i] = 0
\li \For j = 1 \;\textbf{to}\; \textit{length}[A]
\li \;\;\;\;C[A[j]] = C[A[j]] + 1
\li \textup{// C[i] now contains the number of elements equal to i}
\li \For i = 1 \;\textbf{to}\; k
\li \;\;\;\;C[i] = C[i] + C[i - 1]
\li \textup{// C[i] now contains the number of elements less than or equal to i}
\li \For j = \textit{length}[A] \;\textbf{downto}\; 1
\li \;\;\;\;B[C[A[j]]] = A[j]
\li \;\;\;\;C[A[j]] = C[A[j]] - 1
}@




A[1..n] is the input array, where $A[j] \in \{0, 1, \ldots, k\}$ for $j = 1, 2, \ldots, n$.
B[1..n] is the output array.
C[0..k] is an auxiliary array providing temporary storage.


Lines 1-2: initialize C[0..k] to zero.
Lines 3-4: afterwards, C[j] stores how many elements of A[1..n] are equal to $j$.
Lines 6-7: afterwards, C[j] stores how many elements of A[1..n] are less than or equal to $j$.
Lines 9-11: scanning A from back to front, each element A[j] is placed directly into the output array B at position C[A[j]], since we know exactly that many elements are less than or equal to it; C[A[j]] is then decremented so that duplicates land just in front. (A Python transcription follows below.)
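A 0-based Python transcription of COUNTING-SORT:

def counting_sort(a, k):
    # Stable counting sort of integers in 0..k; returns a new sorted list.
    c = [0] * (k + 1)
    for x in a:
        c[x] += 1                  # c[x] = number of elements equal to x
    for i in range(1, k + 1):
        c[i] += c[i - 1]           # c[x] = number of elements <= x
    b = [None] * len(a)
    for x in reversed(a):          # back to front: equal keys keep their order
        c[x] -= 1
        b[c[x]] = x                # c[x] is now x's 0-based final position
    return b

assert counting_sort([2, 5, 3, 0, 2, 3, 0, 3], 5) == [0, 0, 2, 2, 3, 3, 3, 5]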

Counting sort's running time
The overall running time is $\Theta(n + k)$, which is $\Theta(n)$ if $k = O(n)$.

stable property
stable: keys with the same value appear in the same order in the output as they did in the input.
Because the index j in lines 9-11 of counting sort runs from the back of the array down to 1, counting sort retains the stable property. This property matters especially when satellite data has to be carried along with the keys being sorted.

Saturday, September 18, 2010

[Algo] Exercise 8.1

Exercise 8.1-1
What is the smallest possible depth of a leaf in a decision tree for a comparison sort?
solution:
At least each element in $\langle a_1, a_2, \ldots , a_n\rangle$ has to be compared. Hence the smallest depth is n-1.

Exercise 8.1-2
Obtain asymptotically tight bounds on $\lg(n!)$ without using Stirling's approximation. Instead, evaluate the summation $\sum_{k=1}^n \lg k$ using technique from Section A.2.
solution:
Since $\langle \lg k \rangle$ is an increasing sequence, we can evaluate the summation by integrals:
$$\int_{m-1}^n f(x)dx \leq \sum_{k = m}^n f(k) \leq \int_{m}^{n+1}f(x)dx.$$
Applying this with $f(x) = \lg x$ and $m = 1$:
$$\lg(n!) = \sum_{k=1}^n \lg k \geq \int_{0}^n \lg x\,dx = n\lg n - n\lg e = \Omega(n\lg n),$$
$$\lg(n!) = \sum_{k=1}^n \lg k \leq \int_{1}^{n+1} \lg x\,dx = (n+1)\lg(n+1) - n\lg e = O(n\lg n).$$
Hence $\lg(n!) = \Theta(n\lg n)$.

Exercise 8.1-3
Show that there is no comparison sort whose running time is linear for at least half of the $n!$ inputs of length n. What about a fraction of $1/n$ of the inputs of length n? What about a fraction $1/2^n$?
solution:
Suppose that there exists a comparison sort which sorts $m$ inputs of $n!$ permutations in linear time. Suppose that the height of the portion of the decision tree consisting of $m$ corresponding leaves is $h$. We have $m \leq 2^h$, which gives us $h \geq \lg m$. By Stirling's approximation:
$$n! = \sqrt{2\pi n}(\frac{n}{e})^n(1 + \Theta(\frac{1}{n})),$$
we have
$$h \geq \lg m = \lg(n!) - \lg\frac{n!}{m} \geq n\lg n - n\lg e - \lg\frac{n!}{m}.$$
For $m = n!/2$ this gives $h \geq \lg(n!) - 1$; for $m = n!/n$, $h \geq \lg(n!) - \lg n$; and for $m = n!/2^n$, $h \geq \lg(n!) - n$. In every case $h = \Omega(n\lg n)$, so it is impossible for a comparison sort to run in linear time for even $m = n!/2$, $n!/n$, or $n!/2^n$ of the $n!$ inputs of length n.

Exercise 8.1-4
You are given a sequence of $n$ elements to sort. The input sequence consists of $n/k$ subsequences, each containing $k$ elements. The elements in a given subsequence are all smaller than the elements in the succeeding subsequence and larger than the elements in the preceding subsequence. Thus, all that is needed to sort the whole sequence of length of n is to sort the $k$ elements in each of the $n/k$ subsequences. Show an $\Omega(n \lg k)$ lower bound on the number of comparisons needed to solve this variant of the sorting problem. (Hint: It is not rigorous to simply combine the lower bounds for the individual subsequences.)
solution:
There are only $(k!)^{n/k}$ reachable leaves in such decision tree. Suppose that the height of the decision tree is $h$. We have $2^h \geq (k!)^{n/k}$, which gives us $h \geq (n/k)\lg(k!) \geq (n/k)(k\lg k - k\lg e) = n\lg k - n\lg e = \Omega(n\lg k)$.

Friday, September 17, 2010

[Algo] 8.1 Lower bounds for sorting

Comparison sorts
merge sort, heapsort, and many of the sorting algorithms we have seen so far share a common feature: the sorted order they determine is based only on comparisons between the input elements. Such algorithms are therefore called comparison sorts. Below we will prove that any comparison sort must spend $\Omega(n \lg n)$ time in the worst case.

In this section we will assume that the input elements are all distinct; no two elements are equal. Given this assumption, we also assume that every comparison takes the form $a_i \leq a_j$: this is the only operation available for gaining information about the relative order of two elements.

The decision-tree model
A decision tree is a full binary tree that represents the comparisons between elements that are performed by a particular sorting algorithm operating on an input of a given size.
The decision-tree model is an abstraction of every comparison sort. Once a particular sorting algorithm and an input length are given, the decision tree is determined as well.
The figure referred to here (not reproduced) is the decision tree of insertion sort on an input array of 3 elements. Each non-leaf node is labeled $i:j$, meaning that $a_i$ and $a_j$ are being compared, where i and j refer to their original positions; we ignore the effect of data movement. Each leaf is labeled $\langle \pi(1), \pi(2), \ldots , \pi(n) \rangle$, representing the sorted outcome $a_{\pi(1)} \leq a_{\pi(2)} \leq \ldots \leq a_{\pi(n)}$. Executing the sorting algorithm on an array of length n corresponds to tracing one path from the root down to some leaf; different inputs correspond to different paths.
Since an input array of length n has $n!$ permutations, and every permutation must correspond to some leaf, we know that this decision tree must have at least $n!$ leaves.

A lower bound for the worst case
First, we know that any binary tree of height h has at most $2^h$ leaves. Next, we must recognize that the longest root-to-leaf path corresponds to the worst case of the sorting algorithm. We now introduce a theorem:

Theorem 8.1
Any comparison sort algorithm requires $\Omega(n \lg n)$ comparisons in the worst case.
Proof: Suppose a decision tree of height h has l reachable leaves. Since all $n!$ permutations must each correspond to a leaf, we get $n! \leq l$; and since any binary tree of height h has at most $2^h$ leaves, we further get $n! \leq l \leq 2^h$. Therefore $h \geq \lg(n!) = \Omega(n \lg n)$. $\blacksquare$

Corollary 8.2
Heapsort and merge sort are asymptotically optimal comparison sorts.
Proof: We have already seen that heapsort and merge sort have $O(n \lg n)$ upper bounds; by Theorem 8.1 they also have $\Omega(n \lg n)$ lower bounds, which proves the claim. $\blacksquare$

Saturday, August 28, 2010

[Algo] Problem 7

Problem 7-1 Hoare partition correctness
The version of PARTITION given in this chapter is not the original partitioning algorithm. Here is the original partition algorithm, which is due to C.A.R. Hoare:
@{
\proc{Hoare-Partition}(A, p, r)
\li x = A[p]
\li i = p - 1
\li j = r + 1
\li \While \textbf{true}
\li \;\;\;\;\textbf{repeat}\;j = j - 1
\li \;\;\;\;\textbf{until}\;A[j] \leq x
\li \;\;\;\;\textbf{repeat}\;i = i + 1
\li \;\;\;\;\textbf{until}\;A[i] \geq x
\li \;\;\;\;\If i < j
\li \;\;\;\;\;\;\;\;\textup{exchange}\;A[i]\;\textup{and}\;A[j]
\li \;\;\;\;\Else
\li \;\;\;\;\;\;\;\;\Return j
}@

a. Demonstrate the operation of HOARE-PARTITION on the array A = {13, 19, 9, 5, 12, 8, 7, 4, 11, 2, 6, 21}, showing the values of the array and auxiliary values after each iteration of the while loop in lines 4-11.

The next three questions ask you to give a careful argument that the procedure HOARE-PARTITION is correct. Prove the following:

b. The indices i and j are such that we never access an element of A outside the subarray $A[p .. r]$.
c. When HOARE-PARTITION terminates, it returns a value j such that $p \leq j < r$.
d. Every element of A[p .. j] is less than or equal to every element of A[j+1 .. r] when HOARE-PARTITION terminates.

The PARTITION procedure in section 7.1 separates the pivot value (originally in A[r]) from the two partitions it forms. The HOARE-PARTITION procedure, on the other hand, always places the pivot value (originally in A[p]) into one of the two partitions A[p .. j] and A[j+1 .. r]. Since $p \leq j < r$, this split is always nontrivial.

e. Rewrite the QUICKSORT procedure to use HOARE-PARTITION.
solution:
b., c., d. After the first iteration of the while loop we have $p = i \leq j \leq r$: the scan for i stops immediately at $i = p$ because $A[p] = x \geq x$. If the procedure does not terminate then, j must strictly decrease, so $j < r$ from the second iteration on. At the start of each later iteration, $p \leq i \leq j < r$ with $A[p .. i] \leq x$ and $A[j .. r] \geq x$. The index j cannot run below p, because its scan stops at an element $\leq x$ and $A[p .. i] \leq x$ guarantees one; symmetrically, the index i cannot run past r because $A[j .. r] \geq x$. Hence no element outside $A[p .. r]$ is ever accessed (part b), the returned value satisfies $p \leq j < r$ (part c), and on termination every element of $A[p .. j]$ is at most x while every element of $A[j+1 .. r]$ is at least x (part d).
e. The algorithm is straightforward.
@{
\proc{Quicksort}(A, p, r)
\li \If p < r
\li \;\;\;\;q = \proc{Hoare-Partition}(A, p, r)
\li \;\;\;\;\proc{Quicksort}(A, p, q)
\li \;\;\;\;\proc{Quicksort}(A, q + 1, r)
}@
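A runnable Python counterpart of this pseudocode, reusing hoare_partition from the sketch above (again my own 0-based transcription):

def quicksort(A, p, r):
    # The left recursion goes to q, not q - 1: Hoare partition only
    # guarantees A[p..q] <= x <= A[q+1..r], and the pivot may end up
    # anywhere inside the left part.
    if p < r:
        q = hoare_partition(A, p, r)
        quicksort(A, p, q)
        quicksort(A, q + 1, r)

A = [13, 19, 9, 5, 12, 8, 7, 4, 11, 2, 6, 21]
quicksort(A, 0, len(A) - 1)
assert A == sorted(A)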


Problem 7-2 Alternative quicksort analysis
An alternative analysis of the running time of randomized quicksort focuses on the expected running time of each individual recursive call to QUICKSORT, rather than on the number of comparisons performed.
a. Argue that, given an array of size n, the probability that any particular element is chosen as the pivot is $1/n$. Use this to define indicator random variables $X_i$ = I{ith smallest element is chosen as the pivot}. What is $E[X_i]$?
b. Let $T(n)$ be a random variable denoting the running time of quicksort on an array of size n. Argue that
$$E[T(n)] = E[\sum_{q = 1}^n X_q (T(q - 1) + T(n - q) + \Theta(n))]\quad (7.5).$$
c. Show that equation (7.5) can be rewritten as
$$E[T(n)] = \frac{2}{n}\sum_{q = 2}^{n-1}E[T(q)] + \Theta(n) \quad (7.6).$$
d. Show that
$$\sum_{k = 2}^{n-1} k \lg k \leq \frac{1}{2}n^2 \lg n - \frac{1}{8}n^2 \quad (7.7).$$
(Hint: Split the summation into two parts, one for $k = 2, 3, \ldots , \lceil n/2 \rceil - 1$ and one for $k = \lceil n/2\rceil , \ldots , n - 1$.)
e. Using the bound from equation (7.7), show that the recurrence in equation (7.6) has the solution $E[T(n)] = \Theta(n \lg n)$. (Hint: Show, by substituting, that $E[T(n)] \leq an \lg n$ for sufficiently large n and for some positive constant a.)
solution:
a. RANDOMIZED-QUICKSORT chooses the pivot uniformly at random from the n elements of the array, so any particular element is chosen with probability $1/n$. Hence $E[X_i] = \mbox{Pr}\{\mbox{ith smallest element is chosen as the pivot}\} = 1/n$.
b. According to equation (7.1), we have $T(n) = T(q) + T(n - q - 1) + \Theta(n)$, where $0 \leq q \leq n - 1$ is the size of the left subarray. Letting $q^\prime = q + 1$ be the rank of the pivot, we obtain $T(n) = T(q^\prime - 1) + T(n - q^\prime) + \Theta(n)$, where $1 \leq q^\prime \leq n$. Since exactly one of the indicator random variables $X_{q^\prime}$ equals 1, $$T(n) = \sum_{q^\prime = 1}^{n}X_{q^\prime}(T(q^\prime - 1) + T(n - q^\prime) + \Theta(n)).$$ Taking expectations of both sides yields equation (7.5).
c. Note that $X_q$ depends only on the pivot choice at the top level, while $T(q - 1)$ and $T(n - q)$ depend only on the random choices made in the recursive calls; hence $X_q$ is independent of $T(q - 1)$ and $T(n - q)$, and the expectation of each product factors:
$$E[T(n)] = \sum_{q = 1}^{n} E[X_q]\,E[T(q - 1) + T(n - q) + \Theta(n)] = \frac{1}{n}\sum_{q = 1}^{n}\left(E[T(q - 1)] + E[T(n - q)]\right) + \Theta(n) = \frac{2}{n}\sum_{q = 0}^{n - 1} E[T(q)] + \Theta(n).$$
The terms $E[T(0)]$ and $E[T(1)]$ are $\Theta(1)$, so their contribution $\frac{2}{n}(E[T(0)] + E[T(1)]) = O(1)$ can be absorbed into the $\Theta(n)$ term, and we can rewrite the recurrence as $$E[T(n)] = \frac{2}{n}\sum_{q = 2}^{n - 1} E[T(q)] + \Theta(n).$$
d. Split the summation into two parts as the hint suggests. For $2 \leq k \leq \lceil n/2 \rceil - 1$ we have $\lg k \leq \lg(n/2) = \lg n - 1$, and for $\lceil n/2 \rceil \leq k \leq n - 1$ we have $\lg k \leq \lg n$. Therefore
$$\sum_{k = 2}^{n-1} k \lg k \leq (\lg n - 1)\sum_{k = 2}^{\lceil n/2 \rceil - 1} k + \lg n \sum_{k = \lceil n/2 \rceil}^{n-1} k = \lg n \sum_{k = 2}^{n-1} k - \sum_{k = 2}^{\lceil n/2 \rceil - 1} k.$$
The first term is at most $\frac{1}{2}n(n-1)\lg n = \frac{1}{2}n^2\lg n - \frac{1}{2}n\lg n$, and the subtracted sum is at least $\frac{1}{2} \cdot \frac{n}{2}(\frac{n}{2} - 1) - 1 = \frac{1}{8}n^2 - \frac{1}{4}n - 1$. Since $\frac{1}{2}n\lg n \geq \frac{1}{4}n + 1$ for $n \geq 3$ (and for $n = 2$ the left-hand sum is empty), we obtain
$$\sum_{k = 2}^{n-1} k \lg k \leq \frac{1}{2}n^2 \lg n - \frac{1}{8}n^2.$$
e. Assume that $E[T(n)] \leq an \lg n$ for some positive constant a, and substitute into equation (7.6):
$$E[T(n)] \leq \frac{2}{n}\sum_{q = 2}^{n-1} aq \lg q + \Theta(n) \leq \frac{2a}{n}\left(\frac{1}{2}n^2 \lg n - \frac{1}{8}n^2\right) + \Theta(n) = an \lg n - \frac{a}{4}n + \Theta(n).$$
Since we can always choose a large enough that the $\frac{a}{4}n$ term dominates the $\Theta(n)$ term, we get $E[T(n)] \leq an \lg n$, i.e., $E[T(n)] = O(n \lg n)$. By exercise 7.4-2, $E[T(n)] = \Omega(n \lg n)$, so we can conclude that $E[T(n)] = \Theta(n \lg n)$.
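As a quick numerical sanity check of inequality (7.7) from part d (my own sketch, not part of the original solution):

import math

# Check sum_{k=2}^{n-1} k lg k <= (1/2) n^2 lg n - (1/8) n^2 for a range of n.
for n in range(3, 2000):
    lhs = sum(k * math.log2(k) for k in range(2, n))
    rhs = 0.5 * n * n * math.log2(n) - n * n / 8
    assert lhs <= rhs, n
print("inequality (7.7) holds for 3 <= n < 2000")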

Problem 7-3 Stooge sort
Professors Howard, Fine, and Howard have proposed the following "elegant" sorting algorithm:

@{
\proc{Stooge-Sort}(A, i, j)
\li \If A[i] > A[j]
\li \;\;\;\;\textup{exchange}\;A[i]\;\textup{and}\;A[j]
\li \If i + 1 \geq j
\li \;\;\;\;\Return
\li k = \lfloor(j - i + 1) / 3\rfloor
\li \proc{Stooge-Sort}(A, i, j - k)
\li \proc{Stooge-Sort}(A, i + k, j)
\li \proc{Stooge-Sort}(A, i, j - k)
}@

a. Argue that, if $n = length[A]$, then STOOGE-SORT(A, 1, $length[A]$) correctly sorts the input array $A[1 .. n]$.
b. Give a recurrence for the worst-case running time of STOOGE-SORT and a tight asymptotic ($\Theta$-notation) bound on the worst-case running time.
c. Compare the worst-case running time of STOOGE-SORT with that of insertion sort, merge sort, heapsort, and quicksort. Do the professors deserve tenure?
solution:
b. By the algorithm proposed, we have the recurrence:
$$T(n) = 3T(\frac{2n}{3}) + \Theta(1).$$
Using the master theorem, we get
$$T(n) = \Theta(n^{\log_{\frac{3}{2}}3}) \approx \Theta(n^{2.71}).$$
c. Its worst-case running time $\Theta(n^{2.71})$ is worse than the $\Theta(n^2)$ of insertion sort, the $\Theta(n \lg n)$ of merge sort and heapsort, and even the $\Theta(n^2)$ worst case of quicksort. The professors do not deserve tenure.
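A Python transcription of the pseudocode (with the recursion bounds as written above), plus a brute-force check against sorted() on small random inputs; this sketch is my own, not part of the original solution:

import random

def stooge_sort(A, i, j):
    if A[i] > A[j]:
        A[i], A[j] = A[j], A[i]       # exchange A[i] and A[j]
    if i + 1 >= j:
        return                        # at most two elements: already handled
    k = (j - i + 1) // 3              # one third of the range, rounded down
    stooge_sort(A, i, j - k)          # sort the first two-thirds
    stooge_sort(A, i + k, j)          # sort the last two-thirds
    stooge_sort(A, i, j - k)          # sort the first two-thirds again

random.seed(0)
for _ in range(200):
    A = [random.randrange(50) for _ in range(random.randrange(1, 12))]
    expected = sorted(A)
    stooge_sort(A, 0, len(A) - 1)
    assert A == expected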

Problem 7-4 Stack depth for quicksort
The QUICKSORT algorithm of Section 7.1 contains two recursive calls to itself. After the call to PARTITION, the left subarray is recursively sorted and then the right subarray is recursively sorted. The second recursive call in QUICKSORT is not really necessary; it can be avoided by using an iterative control structure. This technique, called tail recursion, is provided automatically by good compilers. Consider the following version of quicksort, which simulates tail recursion.
@{
\proc{Quicksort}^{'}(A, p, r)
\li \While p < r
\li \;\;\;\;\textup{// partition and sort left subarray}
\li \;\;\;\;q = \proc{Partition}(A, p, r)
\li \;\;\;\;\proc{Quicksort}^{'}(A, p, q - 1)
\li \;\;\;\;p = q + 1
}@

a. Argue that QUICKSORT'(A, 1, $length[A]$) correctly sorts the array A.

Compilers usually execute recursive procedures by using a stack that contains pertinent information, including the parameter values, for each recursive call. The information for the most recent call is at the top of the stack, and the information for the initial call is at the bottom. When a procedure is invoked, its information is pushed onto the stack; when it terminates, its information is popped. Since we assume that array parameters are represented by pointers, the information for each procedure call on the stack requires $O(1)$ stack space. The stack depth is the maximum amount of stack space used at any time during a computation.

b. Describe a scenario in which the stack depth of QUICKSORT' is $\Theta(n)$ on an n-element input array.
c. Modify the code for QUICKSORT' so that the worst-case stack depth is $\Theta(\lg n)$. Maintain the $O(n \lg n)$ expected running time of the algorithm.
solution:
b. When the input array is already sorted, PARTITION splits each n-element subarray into one subarray of size n - 1 and one of size 0, because the pivot A[r] is the largest element. Each invocation of QUICKSORT' therefore immediately recurses on a subarray only one element smaller, so the stack depth is $\Theta(n)$:
QUICKSORT'(A, p, r)
QUICKSORT'(A, p, r-1)
    $\vdots$
QUICKSORT'(A, p, p+1)
c. The problem demonstrated by the above scenario is that each invocation of QUICKSORT' recursively calls QUICKSORT' on a subarray of almost the same size. If we instead recurse only on the smaller subarray returned by PARTITION and iterate on the larger one, we avoid this problem. Since the smaller subarray contains at most half of the elements, the stack depth is $\Theta(\lg n)$ in the worst case, while the same partitioning work is still performed, preserving the $O(n \lg n)$ expected running time.

@{
\proc{Quicksort}^{''}(A, p, r)
\li \While p < r
\li \;\;\;\;q = \proc{Partition}(A, p, r)
\li \;\;\;\;\If q - p < r - q
\li \;\;\;\;\;\;\;\;\proc{Quicksort}^{''}(A, p, q - 1)
\li \;\;\;\;\;\;\;\;p =  q + 1
\li \;\;\;\;\Else
\li \;\;\;\;\;\;\;\;\proc{Quicksort}^{''}(A, q + 1, r)
\li \;\;\;\;\;\;\;\;r = q - 1
}@
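A Python sketch of the same idea (my own transcription; partition here is the Lomuto PARTITION of Section 7.1, with A[r] as the pivot):

def partition(A, p, r):
    # Lomuto partition around pivot A[r]; returns the pivot's final index.
    x = A[r]
    i = p - 1
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1

def quicksort_small_first(A, p, r):
    # Recurse only on the smaller side; loop on the larger side.
    # Each recursive call handles at most half the elements, so the
    # stack depth stays O(lg n) even on adversarial (e.g. sorted) inputs.
    while p < r:
        q = partition(A, p, r)
        if q - p < r - q:
            quicksort_small_first(A, p, q - 1)
            p = q + 1
        else:
            quicksort_small_first(A, q + 1, r)
            r = q - 1

A = list(range(1000))   # already sorted: the Theta(n) stack-depth case for QUICKSORT'
quicksort_small_first(A, 0, len(A) - 1)
assert A == sorted(A)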


Problem 7-5 Median-of-3 partition
One way to improve the RANDOMIZED-QUICKSORT procedure is to partition around a pivot that is chosen more carefully than by picking a random element from the subarray. One common approach is the median-of-3 method: choose the pivot as the median (middle element) of a set of 3 elements randomly selected from the subarray. (See exercise 7.4-6.) For this problem, let us assume that the elements in the input array $A[1 .. n]$ are distinct and that $n \geq 3$. We denote the sorted output array by $A^{\prime}[1 .. n]$. Using the median-of-3 method to choose the pivot element $x$, define $p_i = \mbox{Pr}\{x = A^{\prime}[i]\}$.

a. Give an exact formula for $p_i$ as a function of n and i for $i = 2, 3, \ldots , n - 1$. (Note that $p_1 = p_n = 0$.)
b. By what amount have we increased the likelihood of choosing the pivot as $x = A^{\prime}[\lfloor (n+1)/2 \rfloor]$, the median of $A[1 .. n]$, compared to the ordinary implementation? Assume that $n \rightarrow \infty$, and give the limiting ratio of these probabilities.
c. If we define a "good" split to mean choosing the pivot as $x = A^{\prime}[i]$, where $n/3 \leq i \leq 2n/3$, by what amount have we increased the likelihood of getting a good split compared to the ordinary implementation? (Hint: Approximate the sum by an integral.)
d. Argue that in the $\Omega(n \lg n)$ running time of quicksort, the median-of-3 method affects only the constant factor.

solution:

a. The pivot x equals $A^{\prime}[i]$ exactly when the randomly selected 3-element set consists of $A^{\prime}[i]$, one of the $i - 1$ smaller elements, and one of the $n - i$ larger elements. Out of the $C_3^n$ equally likely sets, $(i-1)(n-i)$ are favorable, so $p_i = \frac{(i-1)(n-i)}{C_3^n}$.
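A brute-force verification of this formula for a small n (my own sketch, not part of the original solution):

from itertools import combinations
from math import comb

# Enumerate all 3-element sets of ranks and count how often rank i
# is the median of the set; compare against (i-1)(n-i).
n = 10
counts = [0] * (n + 1)
for triple in combinations(range(1, n + 1), 3):  # triples come out sorted
    counts[triple[1]] += 1                       # middle element = median

assert sum(counts) == comb(n, 3)
for i in range(1, n + 1):
    assert counts[i] == (i - 1) * (n - i)        # matches the formula above
print("p_i formula verified for n =", n)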

b. When $i = \lfloor (n + 1)/2 \rfloor$, we get $p_{\lfloor (n + 1)/2 \rfloor} = \frac{(\lfloor (n + 1)/2 \rfloor - 1)(n - \lfloor (n + 1)/2 \rfloor)}{C_3^n}$.

If n is odd, then $\lfloor (n+1)/2 \rfloor = (n+1)/2$, so
$$p_i = \frac{(\frac{n-1}{2})^2}{\frac{n(n-1)(n-2)}{6}} = \frac{3(n-1)}{2n(n - 2)},$$
and the ratio to the ordinary implementation's $1/n$ is 
$$\frac{\frac{3(n-1)}{2n(n-2)}}{\frac{1}{n}} = \frac{3(n-1)}{2(n-2)}.$$
If n is even, then $\lfloor (n+1)/2 \rfloor = n/2$, so
$$p_i = \frac{(\frac{n}{2} - 1)(n - \frac{n}{2})}{\frac{n(n-1)(n-2)}{6}} = \frac{3}{2(n - 1)},$$
and the ratio is
$$\frac{\frac{3}{2(n-1)}}{\frac{1}{n}} = \frac{3n}{2(n-1)}.$$
As n approaches infinity, both ratios approach $3/2$: the median-of-3 method is about 1.5 times more likely than the ordinary implementation to choose the median of $A[1 .. n]$ as the pivot.

c. In the ordinary implementation, the pivot's rank is uniform, so the probability of a good split is $1/3$. In the median-of-3 implementation, using the symmetry $p_i = p_{n+1-i}$, the probability of a good split is approximately
$$\sum_{i = \lceil n/3 \rceil}^{\lfloor 2n/3 \rfloor} p_i = 1 - 2\sum_{i = 2}^{\lceil n/3 \rceil - 1} p_i = 1 - \frac{12}{n(n-1)(n-2)}\sum_{i = 2}^{\lceil n/3 \rceil - 1}(i-1)(n-i).$$

Approximate the summation by an integral; substituting $t = i-1$, $\int_{1}^{n/3-1}t(n-t-1)\,dt = \frac{1}{2}nt^2 - \frac{1}{3}t^3 - \frac{1}{2}t^2\Big|^{n/3-1}_{1} = \frac{7}{162}n^3 -\frac{5}{18}n^2 + \frac{2}{3}$.

So the probability of a good split is approximately $1-\frac{12}{n(n-1)(n-2)}(\frac{7}{162}n^3 -\frac{5}{18}n^2 + \frac{2}{3}) = 1 + \frac{-\frac{14}{27}n^3 + \frac{10}{3}n^2 - 8}{n(n-1)(n-2)}$. As n approaches infinity, this approaches $1 - \frac{14}{27} = \frac{13}{27}$, so the likelihood of a good split increases by a factor of roughly $\frac{13/27}{1/3} = \frac{13}{9} \approx 1.44$.

d. Even if we could always pick the exact median of the subarray as the pivot in every recursive call, the recurrence would be $T(n) = 2T(n/2) + \Theta(n)$, whose solution is still $\Theta(n \lg n)$; the running time of quicksort is $\Omega(n \lg n)$ in any case. So the median-of-3 method cannot change the asymptotic running time; it only affects the constant factor.
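A rough empirical illustration of the constant-factor effect (my own sketch, not part of the original solution): count the comparisons a PARTITION-style quicksort makes with a uniformly random pivot versus a median-of-3 pivot on the same distinct keys.

import random

def comparisons(A, median_of_3):
    # Count element comparisons; partitioning n elements costs n - 1.
    if len(A) <= 1:
        return 0
    if median_of_3 and len(A) >= 3:
        pivot = sorted(random.sample(A, 3))[1]   # median of 3 random elements
    else:
        pivot = random.choice(A)
    left  = [x for x in A if x < pivot]
    right = [x for x in A if x > pivot]
    return len(A) - 1 + comparisons(left, median_of_3) \
                      + comparisons(right, median_of_3)

random.seed(7)
keys = random.sample(range(10**7), 20000)
print(comparisons(keys, False))   # uniformly random pivot
print(comparisons(keys, True))    # median-of-3: fewer comparisons, still Theta(n lg n)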

Problem 7-6 Fuzzy sorting of intervals
Consider a sorting problem in which the numbers are not known exactly. Instead, for each number, we know an interval on the real line to which it belongs. That is, we are given n closed intervals of the form $[a_i, b_i]$, where $a_i \leq b_i$. The goal is to fuzzy-sort these intervals, i.e., produce a permutation $\langle i_1, i_2, \ldots , i_n \rangle$ of the intervals such that there exist $c_j \in [a_{i_j}, b_{i_j}]$, satisfying $c_1 \leq c_2 \leq \cdots \leq c_n$.

a. Design an algorithm for fuzzy-sorting n intervals. Your algorithm should have the general structure of an algorithm that quicksorts the left endpoints (the $a_i$s), but it should take advantage of overlapping intervals to improve the running time. (As the intervals overlap more and more, the problem of fuzzy-sorting the intervals gets easier and easier. Your algorithm should take advantage of such overlapping, to the extent that it exists.)
b. Argue that your algorithm runs in expected time $\Theta(n \lg n)$ in general, but runs in expected time $\Theta(n)$ when all of the intervals overlap (i.e., when there exists a value x such that $x \in [a_i, b_i]$ for all i.) Your algorithm should not be checking for this case explicitly; rather, its performance should naturally improve as the amount of overlap increases.

solution:
a. & b. See reference1 and reference2.

Note that overlapping intervals do not need to be ordered among themselves: for two overlapping intervals $I_i$ and $I_j$, we can always find values $c_i \in I_i$ and $c_j \in I_j$ with $c_i \leq c_j$ or with $c_j \leq c_i$, for instance both equal to a point in the intersection, so either relative order is acceptable. In particular, if all n intervals share a common point, any of the $n!$ permutations is a valid fuzzy-sorting result. That is why the overall running time can be sped up as more and more intervals overlap.

Suppose that we have n closed intervals denoted by $I = \{[a_1, b_1], [a_2, b_2], \cdots, [a_n, b_n]\}$. We use the following algorithm, FUZZY-SORT with a subroutine FUZZY-PARTITION, to solve the fuzzy-sorting problem:


We adopt a strategy similar to quicksort. We pick an interval as the pivot and partition the intervals into three groups: a left group, a middle group, and a right group. The middle group consists of the intervals that intersect the pivot interval. The left group consists of the intervals that are smaller than the pivot interval and do not intersect it, and the right group consists of the intervals that are larger than the pivot interval and do not intersect it.

The first phase of FUZZY-PARTITION is similar to PARTITION: after it, we have the property that $a_k \leq a_{i+1}$ for $k = p, \ldots , i$ and $a_k > a_{i+1}$ for $k = i+2, \ldots , r$, where $I[i+1]$ holds the pivot interval. The second phase picks out those intervals that are smaller than the pivot interval by left endpoint but overlap it; afterwards, $I[p], \ldots , I[q]$ overlap the pivot interval while $I[q+1], \ldots , I[i]$ are smaller than the pivot interval and disjoint from it. The third phase moves $I[p, \ldots , q]$ next to the pivot interval by swapping $I[p]$ with $I[i]$, $I[p+1]$ with $I[i-1]$, and so on. A symmetric pass then separates the right group. Finally, the left group is $I[p, \ldots , q_l - 1]$, the right group is $I[q_r + 1, \ldots , r]$, and FUZZY-SORT recurses only on these two groups.
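The pseudocode itself is in the references above; as a substitute, here is a minimal Python sketch of the same quicksort-like strategy (function and variable names are my own). It is a slight variant: rather than reproducing the in-place three-phase partition, it builds the three groups directly, partitioning around the left endpoint x of a randomly chosen interval, so that every interval in the middle group provably contains the common point x and can be output in any order.

import random

def fuzzy_sort(intervals):
    # intervals: list of pairs (a, b) with a <= b.
    # Returns a fuzzy-sorted permutation: taking c = x for each middle
    # group and any c inside the outer intervals gives c_1 <= ... <= c_n.
    if len(intervals) <= 1:
        return list(intervals)
    x = random.choice(intervals)[0]   # left endpoint of a random pivot interval
    left   = [iv for iv in intervals if iv[1] < x]            # entirely below x
    middle = [iv for iv in intervals if iv[0] <= x <= iv[1]]  # all contain x
    right  = [iv for iv in intervals if iv[0] > x]            # entirely above x
    # middle is nonempty (it contains the pivot), so the recursion terminates;
    # intervals in middle share the point x and need no further sorting.
    return fuzzy_sort(left) + middle + fuzzy_sort(right)

print(fuzzy_sort([(3, 7), (1, 2), (5, 9), (8, 8), (0, 10)]))

Under the all-overlap condition of part b, the left group of this variant is always empty and the right group roughly halves in expectation, so its expected running time is also $\Theta(n)$ in that case.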

The running time of the subroutine FUZZY-PARTITION is clearly $\Theta(n)$. Hence, by the same analysis as for randomized quicksort, the expected total running time of FUZZY-SORT is $\Theta(n \lg n)$. As more and more intervals overlap, the middle group absorbs more intervals and the left and right groups shrink, so less work remains for the recursive calls and the running time improves. In the extreme case where all intervals overlap, the first partition puts every interval into the middle group and the running time is $\Theta(n)$.