Saturday, August 28, 2010

[Algo] Problem 7

Problem 7-1 Hoare partition correctness
The version of PARTITION given in this chapter is not the original partitioning algorithm. Here is the original partition algorithm, which is due to C.A.R. Hoare:
@{
\proc{Hoare-Partition}(A, p, r)
\li x = A[p]
\li i = p - 1
\li j = r + 1
\li \While \textbf{true}
\li \;\;\;\;\textbf{repeat}\;j = j - 1
\li \;\;\;\;\textbf{until}\;A[j] \leq x
\li \;\;\;\;\textbf{repeat}\;i = i + 1
\li \;\;\;\;\textbf{until}\;A[i] \geq x
\li \;\;\;\;\If i < j
\li \;\;\;\;\;\;\;\;\textup{exchange}\;A[i]\;\textup{and}\;A[j]
\li \;\;\;\;\Else
\li \;\;\;\;\;\;\;\;\Return j
}@

a. Demonstrate the operation of HOARE-PARTITION on the array A = {13, 19, 9, 5, 12, 8, 7, 4, 11, 2, 6, 21}, showing the values of the array and auxiliary values after each iteration of the while loop in lines 4-11.

The next three questions ask you to give a careful argument that the procedure HOARE-PARTITION is correct. Prove the following:

b. The indices i and j are such that we never access an element of A outside the subarray $A[p .. r]$.
c. When HOARE-PARTITION terminates, it returns a value j such that $p \leq j < r$.
d. Every element of A[p .. j] is less than or equal to every element of A[j+1 .. r] when HOARE-PARTITION terminates.

The PARTITION procedure in section 7.1 separates the pivot value (originally in A[r]) from the two partitions it forms. The HOARE-PARTITION procedure, on the other hand, always places the pivot value (originally in A[p]) into one of the two partitions A[p .. j] and A[j+1 .. r]. Since $p \leq j < r$, this split is always nontrivial.

e. Rewrite the QUICKSORT procedure to use HOARE-PARTITION.
solution:
b., c., d. In the first iteration, i stops at p and j stops at some index with $p \leq j \leq r$, because $A[p] = x$ terminates both scans inside the subarray; hence $p = i \leq j \leq r$. If the procedure does not terminate at that point, j keeps decreasing, so at the beginning of every later iteration we have $p \leq i \leq j < r$ with $A[p .. i] \leq x$ and $A[j .. r] \geq x$. The scan of j cannot run past p because $A[p .. i] \leq x$ stops it, and the scan of i cannot run past r because $A[j .. r] \geq x$ stops it. Therefore no element outside $A[p .. r]$ is ever accessed (b), the returned value satisfies $p \leq j < r$ (c), and at termination every element of $A[p .. j]$ is at most x while every element of $A[j+1 .. r]$ is at least x, which proves (d).
e. The algorithm is straightforward.
@{
\proc{Quicksort}(A, p, r)
\li \If p < r
\li \;\;\;\;q = \proc{Hoare-Partition}(A, p, r)
\li \;\;\;\;\proc{Quicksort}(A, p, q)
\li \;\;\;\;\proc{Quicksort}(A, q + 1, r)
}@
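As a sanity check, here is a direct Python transcription of the two procedures above (0-indexed; function names are mine, and this is a sketch for experimentation rather than part of the original problem). The repeat/until scans become decrement-then-test loops:

```python
def hoare_partition(a, p, r):
    # direct transcription of Hoare-Partition (0-indexed)
    x = a[p]
    i = p - 1
    j = r + 1
    while True:
        j -= 1
        while a[j] > x:        # repeat j = j - 1 until A[j] <= x
            j -= 1
        i += 1
        while a[i] < x:        # repeat i = i + 1 until A[i] >= x
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j

def quicksort(a, p, r):
    # note the recursion is on (p, q) and (q + 1, r), not (p, q - 1)
    if p < r:
        q = hoare_partition(a, p, r)
        quicksort(a, p, q)
        quicksort(a, q + 1, r)
```

On the array from part a, `hoare_partition` returns j = 8, and every element of the left part is at most every element of the right part, as part d claims.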


Problem 7-2 Alternative quicksort analysis
An alternative analysis of the running time of randomized quicksort focuses on the expected running time of each individual recursive call to QUICKSORT, rather than on the number of comparisons performed.
a. Argue that, given an array of size n, the probability that any particular element is chosen as the pivot is $1/n$. Use this to define indicator random variables $X_i$ = I{ith smallest element is chosen as the pivot}. What is $E[X_i]$?
b. Let $T(n)$ be a random variable denoting the running time of quicksort on an array of size n. Argue that
$$E[T(n)] = E[\sum_{q = 1}^n X_q (T(q - 1) + T(n - q) + \Theta(n))]\quad (7.5).$$
c. Show that equation (7.5) can be rewritten as
$$E[T(n)] = \frac{2}{n}\sum_{q = 2}^{n-1}E[T(q)] + \Theta(n) \quad (7.6).$$
d. Show that
$$\sum_{k = 2}^{n-1} k \lg k \leq \frac{1}{2}n^2 \lg n - \frac{1}{8}n^2 \quad (7.7).$$
(Hint: Split the summation into two parts, one for $k = 2, 3, \ldots , \lceil n/2 \rceil - 1$ and one for $k = \lceil n/2\rceil , \ldots , n - 1$.)
e. Using the bound from equation (7.7), show that the recurrence in equation (7.6) has the solution $E[T(n)] = \Theta(n \lg n)$. (Hint: Show, by substituting, that $E[T(n)] \leq an \lg n$ for sufficiently large n and for some positive constant a.)
solution:
a. Since the pivot is chosen uniformly at random from the n elements, each particular element is chosen with probability $1/n$. Hence $E[X_i] = \mbox{Pr}\{X_i = 1\} = 1/n$.
b. According to equation (7.1), we have $T(n) = T(q) + T(n - q - 1) + \Theta(n)$, where $0 \leq q \leq n - 1$. Letting $q^\prime = q + 1$, we obtain $T(n) = T(q^\prime - 1) + T(n - q^\prime) + \Theta(n)$, where $1 \leq q^\prime \leq n$. Exactly one of the indicator random variables $X_1, \ldots, X_n$ equals 1, namely the one for the chosen pivot, so $$T(n) = \sum_{q^\prime = 1}^{n}X_{q^\prime}(T(q^\prime - 1) + T(n - q^\prime) + \Theta(n)).$$ Taking expectations of both sides gives equation (7.5).
c. Note that the indicator random variable $X_q$ is independent of the running times of the recursive calls, so
$$E[T(n)] = \sum_{q = 1}^{n} E[X_q]\,E[T(q - 1) + T(n - q) + \Theta(n)] = \frac{1}{n}\sum_{q = 1}^{n}\left(E[T(q - 1)] + E[T(n - q)]\right) + \Theta(n) = \frac{2}{n}\sum_{q = 0}^{n - 1} E[T(q)] + \Theta(n).$$
Note that $T(0) = T(1) = \Theta(1)$, so the $q = 0$ and $q = 1$ terms contribute only $O(1/n)$ and can be absorbed into the $\Theta(n)$ term. We can therefore rewrite this as $$E[T(n)] = \frac{2}{n}\sum_{q = 2}^{n - 1} E[T(q)] + \Theta(n).$$
d. Split the summation at $k = \lceil n/2 \rceil$ as the hint suggests. For $k \leq \lceil n/2 \rceil - 1 < n/2$ we have $\lg k \leq \lg n - 1$, and for all $k \leq n - 1$ we have $\lg k \leq \lg n$, so
$$\sum_{k = 2}^{n-1} k \lg k \leq (\lg n - 1)\sum_{k = 2}^{\lceil n/2 \rceil - 1} k + \lg n \sum_{k = \lceil n/2 \rceil}^{n-1} k = \lg n \sum_{k = 2}^{n-1} k - \sum_{k = 2}^{\lceil n/2 \rceil - 1} k.$$
Now $\sum_{k = 2}^{n-1} k = \frac{n(n-1)}{2} - 1$ and $\sum_{k = 2}^{\lceil n/2 \rceil - 1} k \geq \frac{1}{2}\cdot\frac{n}{2}\left(\frac{n}{2} - 1\right) - 1 = \frac{n^2}{8} - \frac{n}{4} - 1$, so
$$\sum_{k = 2}^{n-1} k \lg k \leq \frac{1}{2}n^2 \lg n - \frac{1}{2}n \lg n - \frac{n^2}{8} + \frac{n}{4} + 1 \leq \frac{1}{2}n^2 \lg n - \frac{1}{8}n^2,$$
where the last step holds because $\frac{1}{2}n \lg n \geq \frac{n}{4} + 1$ for $n \geq 3$ (and for $n = 2$ the claim is immediate).
e. Assume inductively that $E[T(q)] \leq aq \lg q$ for $q < n$, and substitute into equation (7.6):
$$E[T(n)] \leq \frac{2}{n}\sum_{q = 2}^{n-1} aq \lg q + \Theta(n) \leq \frac{2a}{n}\left(\frac{1}{2}n^2 \lg n - \frac{1}{8}n^2\right) + \Theta(n) = an \lg n - \frac{a}{4}n + \Theta(n),$$
using the bound (7.7). Since we can always pick a large enough that the $\frac{a}{4}n$ term dominates the $\Theta(n)$ term, we get $E[T(n)] \leq an \lg n$, i.e., $E[T(n)] = O(n \lg n)$. By exercise 7.4-2, the expected running time is also $\Omega(n \lg n)$, so we can conclude that $E[T(n)] = \Theta(n \lg n)$.

Problem 7-3 Stooge sort
Professors Howard, Fine, and Howard have proposed the following "elegant" sorting algorithm:

@{
\proc{Stooge-Sort}(A, i, j)
\li \If A[i] > A[j]
\li \;\;\;\;\textup{exchange}\;A[i]\;\textup{and}\;A[j]
\li \If i + 1 \geq j
\li \;\;\;\;\Return
\li k = \lfloor(j - i + 1) / 3\rfloor
\li \proc{Stooge-Sort}(A, i, j - k)
\li \proc{Stooge-Sort}(A, i + k, j)
\li \proc{Stooge-Sort}(A, i, j - k)
}@
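As a check, here is a direct Python transcription (0-indexed; the function name is mine), with k rounded down, which the recursion needs so that the two overlapping two-thirds pieces cover the whole range:

```python
def stooge_sort(a, i, j):
    # sort a[i..j] in place, the Three Stooges way
    if a[i] > a[j]:
        a[i], a[j] = a[j], a[i]
    if i + 1 >= j:
        return
    k = (j - i + 1) // 3          # round down
    stooge_sort(a, i, j - k)      # first two-thirds
    stooge_sort(a, i + k, j)      # last two-thirds
    stooge_sort(a, i, j - k)      # first two-thirds again
```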

a. Argue that, if $n = length[A]$, then STOOGE-SORT(A, 1, $length[A]$) correctly sorts the input array $A[1 .. n]$.
b. Give a recurrence for the worst-case running time of STOOGE-SORT and a tight asymptotic ($\Theta$-notation) bound on the worst-case running time.
c. Compare the worst-case running time of STOOGE-SORT with that of insertion sort, merge sort, heapsort, and quicksort. Do the professors deserve tenure?
solution:
b. Each call of STOOGE-SORT does constant work besides its three recursive calls, each on a subarray of about two-thirds the size, so we have the recurrence:
$$T(n) = 3T(\frac{2n}{3}) + \Theta(1).$$
Using the master theorem (case 1, since the driving function is only $\Theta(1)$), we get
$$T(n) = \Theta(n^{\log_{\frac{3}{2}}3}) \approx \Theta(n^{2.71}).$$
c. Its worst-case running time $\Theta(n^{2.71})$ is worse than insertion sort's $\Theta(n^2)$, and far worse than merge sort's and heapsort's $\Theta(n \lg n)$ and quicksort's expected $\Theta(n \lg n)$. The professors do not deserve tenure.

Problem 7-4 Stack depth for quicksort
The QUICKSORT algorithm of Section 7.1 contains two recursive calls to itself. After the call to PARTITION, the left subarray is recursively sorted and then the right subarray is recursively sorted. The second recursive call in QUICKSORT is not really necessary; it can be avoided by using an iterative control structure. This technique, called tail recursion, is provided automatically by good compilers. Consider the following version of quicksort, which simulates tail recursion.
@{
\proc{Quicksort}^{'}(A, p, r)
\li \While p < r
\li \;\;\;\;\textup{// partition and sort left subarray}
\li \;\;\;\;q = \proc{Partition}(A, p, r)
\li \;\;\;\;\proc{Quicksort}^{'}(A, p, q - 1)
\li \;\;\;\;p = q + 1
}@

a. Argue that QUICKSORT'(A, 1, $length[A]$) correctly sorts the array A.

Compilers usually execute recursive procedures by using a stack that contains pertinent information, including the parameter values, for each recursive call. The information for the most recent call is at the top of the stack, and the information for the initial call is at the bottom. When a procedure is invoked, its information is pushed onto the stack; when it terminates, its information is popped. Since we assume that array parameters are represented by pointers, the information for each procedure call on the stack requires $O(1)$ stack space. The stack depth is the maximum amount of stack space used at any time during a computation.

b. Describe a scenario in which the stack depth of QUICKSORT' is $\Theta(n)$ on an n-element input array.
c. Modify the code for QUICKSORT' so that the worst-case stack depth is $\Theta(\lg n)$. Maintain the $O(n \lg n)$ expected running time of the algorithm.
solution:
b. When the input array is already sorted, PARTITION always returns q = r, splitting off a subarray of size n - 1 and an empty one. The recursive call in line 4 is then made on a subarray only one element smaller, so the calls nest to depth $\Theta(n)$:
QUICKSORT'(A, p, r)
QUICKSORT'(A, p, r-1)
    $\vdots$
QUICKSORT'(A, p, p+1)
c. The problem demonstrated by the scenario above is that each invocation of QUICKSORT' recursively calls QUICKSORT' on almost the same range. If we instead recurse only on the smaller subarray returned by PARTITION and iterate on the larger one, we avoid this problem: the smaller subarray has at most half the elements, so the recursion nests at most $\lg n$ deep, giving worst-case stack depth $\Theta(\lg n)$ while leaving the $O(n \lg n)$ expected running time unchanged.

@{
\proc{Quicksort}^{''}(A, p, r)
\li \While p < r
\li \;\;\;\;q = \proc{Partition}(A, p, r)
\li \;\;\;\;\If q - p < r - q
\li \;\;\;\;\;\;\;\;\proc{Quicksort}^{''}(A, p, q - 1)
\li \;\;\;\;\;\;\;\;p =  q + 1
\li \;\;\;\;\Else
\li \;\;\;\;\;\;\;\;\proc{Quicksort}^{''}(A, q + 1, r)
\li \;\;\;\;\;\;\;\;r = q - 1
}@
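In Python, with the PARTITION of Section 7.1 written out explicitly, the idea looks like this (a sketch; the function names are mine):

```python
def partition(a, p, r):
    # Lomuto partition from Section 7.1: pivot is a[r]
    x = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1

def quicksort_logstack(a, p, r):
    # recurse only into the smaller side; loop on the larger side
    while p < r:
        q = partition(a, p, r)
        if q - p < r - q:
            quicksort_logstack(a, p, q - 1)
            p = q + 1
        else:
            quicksort_logstack(a, q + 1, r)
            r = q - 1
```

Each recursive call is on at most half the current range, so at most $\lg n$ frames are ever on the stack at once.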


Problem 7-5 Median-of-3 partition
One way to improve the RANDOMIZED-QUICKSORT procedure is to partition around a pivot that is chosen more carefully than by picking a random element from the subarray. One common approach is the median-of-3 method: choose the pivot as the median (middle element) of a set of 3 elements randomly selected from the subarray. (See exercise 7.4-6.) For this problem, let us assume that the elements in the input array $A[1 .. n]$ are distinct and that $n \geq 3$. We denote the sorted output array by $A^{\prime}[1 .. n]$. Using the median-of-3 method to choose the pivot element $x$, define $p_i = \mbox{Pr}\{x = A^{\prime}[i]\}$.

a. Give an exact formula for $p_i$ as a function of n and i for $i = 2, 3, \ldots , n - 1$. (Note that $p_1 = p_n = 0$.)
b. By what amount have we increased the likelihood of choosing the pivot as $x = A^{\prime}[\lfloor (n+1)/2 \rfloor]$, the median of $A[1 .. n]$, compared to the ordinary implementation? Assume that $n \rightarrow \infty$, and give the limiting ratio of these probabilities.
c. If we define a "good" split to mean choosing the pivot as $x = A^{\prime}[i]$, where $n/3 \leq i \leq 2n/3$, by what amount have we increased the likelihood of getting a good split compared to the ordinary implementation? (Hint: Approximate the sum by an integral.)
d. Argue that in the $\Omega(n \lg n)$ running time of quicksort, the median-of-3 method affects only the constant factor.

solution:

a. The pivot is $x = A^{\prime}[i]$ exactly when, of the three elements drawn, one is $A^{\prime}[i]$ itself, one comes from the $i - 1$ smaller elements, and one comes from the $n - i$ larger elements. Hence $p_i = \frac{(i-1)(n-i)}{C_3^n}$, where $C_3^n = \frac{n(n-1)(n-2)}{6}$.
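The formula can be checked by brute force: enumerate every 3-element subset of the ranks $1, \ldots, n$ and count how often rank i is the median (a small script of mine, not part of the solution):

```python
from itertools import combinations
from math import comb

def p_brute(n, i):
    # fraction of 3-subsets of {1, ..., n} whose median is i
    hits = sum(1 for t in combinations(range(1, n + 1), 3)
               if sorted(t)[1] == i)
    return hits / comb(n, 3)
```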

b. When $i = \lfloor (n + 1)/2 \rfloor$, we get $p_{\lfloor (n + 1)/2 \rfloor} = \frac{(\lfloor (n + 1)/2 \rfloor - 1)(n - \lfloor (n + 1)/2 \rfloor)}{C_3^n}$.

If n is odd,
$$p_i = \frac{(\frac{n-1}{2})^2}{\frac{n(n-1)(n-2)}{6}} = \frac{3(n-1)}{2n(n - 2)},$$
and the ratio is 
$$\frac{\frac{3(n-1)}{2n(n-2)}}{\frac{1}{n}} = \frac{3(n-1)}{2(n-2)}.$$
If n is even,
$$p_i = \frac{(\frac{n}{2} - 1)(n - \frac{n}{2})}{\frac{n(n-1)(n-2)}{6}} = \frac{3}{2(n - 1)},$$
and the ratio is
$$\frac{\frac{3}{2(n-1)}}{\frac{1}{n}} = \frac{3n}{2(n-1)}.$$
As n approaches infinity, the limiting ratio is $3/2$ in both cases: the median-of-3 method is about 1.5 times as likely as the ordinary implementation to choose the median as the pivot.

c. With the ordinary implementation, the probability of a good split is $\frac{2n/3 - n/3}{n} = 1/3$. With the median-of-3 implementation, the probability of a good split is
$$\sum_{i = \lceil n/3 \rceil}^{\lfloor 2n/3 \rfloor} p_i = \sum_{i = \lceil n/3 \rceil}^{\lfloor 2n/3 \rfloor} \frac{(i-1)(n-i)}{C_3^n} = 1 - \frac{2}{C_3^n}\sum_{i = 2}^{\lceil n/3 \rceil - 1}(i-1)(n-i),$$
where the factor 2 comes from the symmetry $p_i = p_{n+1-i}$.
Approximating the summation by an integral with $t = i-1$: $$\int_{1}^{n/3-1}t(n-t-1)\,dt = \left.\left(\frac{1}{2}nt^2 - \frac{1}{3}t^3 - \frac{1}{2}t^2\right)\right|^{n/3-1}_{1} = \frac{7}{162}n^3 -\frac{5}{18}n^2 + \frac{2}{3}.$$

So, $1-\frac{12}{n(n-1)(n-2)}\left(\frac{7}{162}n^3 -\frac{5}{18}n^2 + \frac{2}{3}\right) = 1 + \frac{-\frac{14}{27}n^3 + \frac{10}{3}n^2 - 8}{n(n-1)(n-2)}$. As n approaches infinity, the probability approaches $1 - \frac{14}{27} = \frac{13}{27}$, compared with $\frac{1}{3} = \frac{9}{27}$ for the ordinary implementation, an increase by a factor of $\frac{13}{9}$.
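The limit 13/27 can also be checked numerically by summing $p_i$ over the middle third directly (a small script of mine, not part of the solution):

```python
from math import comb

def good_split_prob(n):
    # exact Pr{good split}: sum p_i over ceil(n/3) <= i <= floor(2n/3)
    lo = -(-n // 3)           # ceil(n/3)
    hi = (2 * n) // 3         # floor(2n/3)
    return sum((i - 1) * (n - i) for i in range(lo, hi + 1)) / comb(n, 3)
```

For n = 3000 this evaluates to roughly 0.48, close to 13/27 and well above 1/3.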

d. Even if we could always pick the exact median of the subarray as the pivot in every recursive call, quicksort's running time would still be $\Omega(n \lg n)$, since that is its best case (exercise 7.4-2). The median-of-3 method therefore cannot change the asymptotic running time; it affects only the constant factor.

Problem 7-6 Fuzzy sorting of intervals
Consider a sorting problem in which the numbers are not known exactly. Instead, for each number, we know an interval on the real line to which it belongs. That is, we are given n closed intervals of the form $[a_i, b_i]$, where $a_i \leq b_i$. The goal is to fuzzy-sort these intervals, i.e., produce a permutation $\langle i_1, i_2, \ldots , i_n \rangle$ of the intervals such that there exist $c_j \in [a_{i_j}, b_{i_j}]$, satisfying $c_1 \leq c_2 \leq \cdots \leq c_n$.

a. Design an algorithm of fuzzy-sorting n intervals. Your algorithm should have the general structure of an algorithm that quicksorts the left endpoints (the $a_i$s), but it should take advantage of overlapping intervals to improve the running time. (As the intervals overlap more and more, the problem of fuzzy-sorting the intervals gets easier and easier. Your algorithm should take advantage of such overlapping, to the extent that it exists.)
b. Argue that your algorithm runs in expected time $\Theta(n \lg n)$ in general, but runs in expected time $\Theta(n)$ when all of the intervals overlap (i.e., when there exists a value x such that $x \in [a_i, b_i]$ for all i.) Your algorithm should not be checking for this case explicitly; rather, its performance should naturally improve as the amount of overlap increases.

solution:
a. & b.

Note that overlapping intervals do not require sorting: for two overlapping intervals $I_i$ and $I_j$ we can choose $c_i = c_j$ to be any point of their intersection, so either order is valid. In particular, if all n intervals share a common point, every one of the $n!$ permutations is a valid fuzzy sorting. This is why the overall running time can be sped up as more and more intervals overlap.

Suppose that we have n closed intervals denoted by $I = \{[a_1, b_1], [a_2, b_2], \cdots, [a_n, b_n]\}$. We solve the fuzzy-sorting problem as follows.


We adopt a strategy similar to quicksort. We pick an interval as the pivot and partition the intervals into three groups: a left group, a middle group, and a right group. The middle group consists of the intervals that intersect the pivot interval; the left group consists of the intervals that lie entirely below the pivot interval, with no intersection; the right group is defined symmetrically, lying entirely above the pivot interval.
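One way to realize this strategy in Python (my own sketch; all names are mine, and it is not the pseudocode listing the solution describes): the pivot interval is first shrunk to the running intersection of every interval that overlaps it, which guarantees that every interval in the middle group contains the shrunken region and therefore needs no further sorting.

```python
import random

def fuzzy_sort(items, lo=0, hi=None):
    # Reorder closed intervals (a, b) in place so that some choice of
    # c_k in items[k] satisfies c_0 <= c_1 <= ... <= c_{n-1}.
    if hi is None:
        hi = len(items) - 1
    if lo >= hi:
        return
    # Random pivot interval, shrunk to the running intersection of
    # every interval that overlaps it; each such interval then
    # contains the final region [pa, pb].
    pa, pb = items[random.randint(lo, hi)]
    for a, b in items[lo:hi + 1]:
        if a <= pb and pa <= b:
            pa, pb = max(pa, a), min(pb, b)
    # Three-way partition around [pa, pb]; the middle group shares
    # the common point pa, so any order inside it is valid.
    left = [iv for iv in items[lo:hi + 1] if iv[1] < pa]
    mid = [iv for iv in items[lo:hi + 1] if iv[1] >= pa and iv[0] <= pb]
    right = [iv for iv in items[lo:hi + 1] if iv[0] > pb]
    items[lo:hi + 1] = left + mid + right
    fuzzy_sort(items, lo, lo + len(left) - 1)
    fuzzy_sort(items, lo + len(left) + len(mid), hi)
```

When all intervals overlap, everything lands in the middle group after one pass and the recursion stops immediately.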

Lines 1-7 work like PARTITION: at the end of line 7 we have $a_k \leq a_{i+1}$ for $k = p, \ldots , i$ and $a_k > a_{i+1}$ for $k = i+2, \ldots , r$. Lines 8-12 pick out the intervals that are not only smaller than the pivot interval but also overlap it; at the end of line 12, $I[p], \cdots , I[q]$ overlap the pivot interval while $I[q+1], \cdots , I[i]$ are smaller than it and do not intersect it. Lines 13-14 move $I[p \ldots q]$ next to the pivot interval by swapping $I[p]$ with $I[i]$, $I[p+1]$ with $I[i-1]$, and so on. Lines 16-20 separate the right group in the same way. Finally, the left group is $I[p, \ldots , q_l - 1]$ and the right group is $I[q_r + 1, \ldots , r]$.

The running time of the subroutine FUZZY-PARTITION is clearly $\Theta(n)$, so the expected total running time of FUZZY-SORT is $\Theta(n \lg n)$. As more and more intervals overlap, the left and right groups contain fewer and fewer intervals, so the running time improves. If all the intervals overlap, every interval falls into the middle group after a single partition and the running time is $\Theta(n)$.

Saturday, August 14, 2010

[Algo] Exercise 7.4

Exercise 7.4-1
Show that in the recurrence $T(n) = \max_{0 \leq q \leq n-1}(T(q) + T(n - q - 1)) + \Theta(n)$, $T(n) = \Omega(n^2)$.
solution:
Assume that $T(n) \geq cn^2$, where c is a constant. Substituting into the recurrence, we obtain
$$T(n) \geq \max_{0 \leq q \leq n-1} (cq^2 + c(n - q - 1)^2) + \Theta(n) = c \max_{0 \leq q \leq n-1} (q^2 + (n - q - 1)^2) + \Theta(n).$$
By exercise 7.4-3, $q^2 + (n - q - 1)^2$ achieves its maximum value $(n - 1)^2$ at $q = 0$ or $q = n - 1$. Thus, $T(n) \geq c(n-1)^2 + \Theta(n) = cn^2 - c(2n - 1) + \Theta(n)$. Since we can choose c small enough that the $\Theta(n)$ term dominates $c(2n - 1)$, we get $T(n) \geq cn^2$, i.e., $T(n) = \Omega(n^2)$.

Exercise 7.4-2
Show that quicksort's best-case running time is $\Omega(n \lg n)$.
solution:
Let quicksort's best-case running time be $T(n)$, and suppose inductively that $T(m) \geq cm \lg m$ for all $m < n$, where c is a constant. We know that $T(n) = \min_{0 \leq q \leq n-1} (T(q) + T(n - q - 1)) + \Theta(n)$. Substituting the inductive hypothesis into this recurrence, we get the following inequality:
$$T(n) \geq c \times \min_{0 \leq q \leq n-1} (q \lg q + (n - q - 1)\lg(n - q - 1)) + \Theta(n).$$
We now have to find the minimum of $q \lg q + (n - q - 1)\lg(n - q - 1)$ over $0 \leq q \leq n - 1$. Since this expression equals $\lg\left(q^q (n - q - 1)^{(n - q - 1)}\right)$ and $\lg$ is increasing, it suffices to minimize $q^q(n - q - 1)^{(n - q - 1)}$, which we do by computing the derivative and setting it to zero. First, recall how to differentiate such functions:
$$y = f(x)^{g(x)} \Rightarrow \frac{dy}{dx} = f(x)^{g(x)}(g^\prime(x)\ln f(x) + g(x) \cdot \frac{f^\prime(x)}{f(x)})$$
Thus, the derivative of $q^{q}(n - q - 1)^{(n - q- 1)}$ with respect to q is
$$q^{q}(n - q - 1)^{(n - q- 1)}\ln\left(\frac{q}{n - q - 1}\right).$$
Setting it to zero gives $q = n - q - 1$, that is, $q = (n - 1)/2$. Furthermore, the second derivative is
$$q^q(n-q-1)^{(n-q-1)}(\ln^2(\frac{q}{n - q - 1}) + \frac{n - 1}{q(n - q - 1)})\geq 0,$$
we can conclude that $q = (n - 1)/2$ achieves a minimum.
When $q = (n - 1)/2$,
$$T(n) \geq c(2 \times \frac{n-1}{2} \lg \frac{n-1}{2}) + \Theta(n) = c[(n - 1) \lg (n - 1) - (n - 1)] + \Theta(n),$$
where
$$\lg(n - 1) = \lg\left(n\left(1 - \frac{1}{n}\right)\right) = \lg n + \lg\left(1 - \frac{1}{n}\right).$$
We can rewrite it to
$T(n) \geq c[(n - 1)(\lg n + \lg(1 - \frac{1}{n})) - (n - 1)] + \Theta(n)$
$= c[n \lg n + n \lg(1 - \frac{1}{n}) - \lg n - \lg(1 - \frac{1}{n}) - n + 1] + \Theta(n)$
$= cn \lg n + c(n \lg(1 - \frac{1}{n}) - \lg n - \lg(1 - \frac{1}{n}) - n + 1) + \Theta(n),$ and note that
$$n \lg\left(1 - \frac{1}{n}\right) = \lg\left(1 - \frac{1}{n}\right)^n \rightarrow \lg e^{-1},$$ so this term is bounded, and the terms multiplied by c total $-O(n)$. Choosing c small enough that the $\Theta(n)$ term dominates them, we get $T(n) \geq cn \lg n$.

Exercise 7.4-3
Show that $q^2 + (n - q - 1)^2$ achieves a maximum over $q = 0, 1, \ldots, n - 1$ when $q = 0$ or $q = n - 1$.
solution:
$q^2 + (n - q - 1)^2 = 2q^2 -2(n - 1)q + (n - 1)^2$ is an upward-opening parabola in q, so over the interval $0 \leq q \leq n - 1$ its maximum occurs at an endpoint. By symmetry about $q = (n - 1)/2$, both endpoints $q = 0$ and $q = n - 1$ give the maximum value $(n - 1)^2$.

Exercise 7.4-4
Show that RANDOMIZED-QUICKSORT's expected running time is $\Omega(n \lg n)$.
solution:
Following equation (7.4),
$$E[X] = \sum_{i = 1}^{n-1}\sum_{j = i + 1}^n \frac{2}{j-i+1} = \sum_{i = 1}^{n-1}\sum_{k = 1}^{n-i} \frac{2}{k+1}.$$
For every $i \leq n/2$ the inner sum has at least $n/2$ terms, so it is at least $\sum_{k = 1}^{n/2} \frac{2}{k+1} = \Omega(\lg n)$. Hence $E[X] \geq \frac{n}{2} \cdot \Omega(\lg n) = \Omega(n \lg n)$, and since the running time of RANDOMIZED-QUICKSORT is $\Theta(n + X)$, its expected running time is $\Omega(n \lg n)$.


Exercise 7.4-5
The running time of quicksort can be improved in practice by taking advantage of the fast running time of insertion sort when its input is "nearly" sorted. When quicksort is called on a subarray with fewer than k elements, let it simply return without sorting the subarray. After the top-level call to quicksort returns, run insertion sort on the entire array to finish the sorting process. Argue that this sorting algorithm runs in $O(nk + n\lg(n/k))$ expected time. How should k be picked, both in theory and in practice?
solution:
Since quicksort returns without sorting subarrays of fewer than k elements, the quicksort phase recurses to an expected depth of $O(\lg(n/k))$ with $O(n)$ work per level, giving $O(n \lg(n/k))$ expected time. When it returns, the array is a sequence of unsorted blocks of fewer than k elements each, already in the correct relative order, so every element is fewer than k positions away from its final place and the insertion-sort pass takes $O(nk)$ time. The total is therefore $O(nk + n \lg(n/k))$ expected time. In theory, $k = \Theta(\lg n)$ balances the two terms, making both $O(n \lg n)$; in practice, k should be chosen by measurement, since the best value depends on the constant factors of the two sorts.
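A Python sketch of this hybrid (the cutoff value 16 and all names are my choices; the best cutoff is machine-dependent and should be measured):

```python
def hybrid_sort(a, k=16):
    # quicksort that ignores subarrays of fewer than k elements,
    # followed by one insertion-sort pass over the whole array
    def partition(p, r):
        x = a[r]
        i = p - 1
        for j in range(p, r):
            if a[j] <= x:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[i + 1], a[r] = a[r], a[i + 1]
        return i + 1

    def quicksort(p, r):
        if r - p + 1 >= k:            # simply return on small subarrays
            q = partition(p, r)
            quicksort(p, q - 1)
            quicksort(q + 1, r)

    quicksort(0, len(a) - 1)
    # finish with insertion sort: every element is now close to home
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a
```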

Exercise 7.4-6
Consider modifying the PARTITION procedure by randomly picking three elements from array A and partitioning about their median (the middle value of the three elements). Approximate the probability of getting at worst an $\alpha$-to-$(1 - \alpha)$ split, as a function of $\alpha$ in the range $0 < \alpha < 1$.
solution:
The pivot produces at worst an $\alpha$-to-$(1 - \alpha)$ split exactly when it is $x = A^{\prime}[i]$ with $\alpha n \leq i \leq (1 - \alpha)n$ (assume $\alpha \leq 1/2$; otherwise exchange the roles of $\alpha$ and $1 - \alpha$). Using $p_i = \frac{(i-1)(n-i)}{C_3^n}$ from Problem 7-5 and approximating the sum by an integral, the probability is $$\sum_{i = \alpha n}^{(1-\alpha)n} p_i \approx \int_{\alpha}^{1-\alpha} 6x(1-x)\,dx = 1 - 6\alpha^2 + 4\alpha^3.$$

Saturday, August 7, 2010

[Algo] 7.4 Analysis of quicksort

7.4.1 Worst-case analysis
We now prove that, under worst-case splits, the running time of quicksort is $\Theta(n^2)$. The method is the substitution method; we first show that the running time is $O(n^2)$.
Let $T(n)$ be the worst-case running time of quicksort on a sequence of length n. We obtain the recurrence
$$T(n) = \max_{0 \leq q \leq n-1} (T(q) + T(n - 1 - q)) + \Theta(n),$$ where q ranges from 0 to n-1 because quicksort splits the input into two subproblems of sizes q and n - 1 - q.

Using the substitution method, we guess $T(n) \leq cn^2$ for some constant c. Substituting this guess into the recurrence yields
$$T(n) \leq c \max_{0 \leq q \leq n-1} (q^2 + (n - 1 - q)^2) + \Theta(n).$$
The expression $q^2 + (n - 1 - q)^2$ attains its maximum only at $q = 0$ or $q = n-1$, so
$$q^2 + (n - 1 - q)^2 \leq (n - 1)^2 = n^2 - 2n + 1.$$
Therefore,
$$T(n) \leq c(n^2 - 2n + 1) + \Theta(n) = cn^2 -c(2n - 1) + \Theta(n) \leq cn^2,$$
because we can always pick the constant c large enough that the $c(2n - 1)$ term dominates the $\Theta(n)$ term. This proves $T(n) = O(n^2)$.

7.4.2 Expected running time
Quicksort spends most of its running time inside the PARTITION procedure, so the best way to analyze quicksort is to start from PARTITION. Note that every call to PARTITION selects a pivot, and this pivot never participates in any later recursive call; in other words, over an entire run of quicksort, PARTITION is called at most n times.
The running time of PARTITION is governed by the number of iterations of the for loop in its lines 3-6; in other words, if we can count the total number of times line 4 is executed, we can obtain quicksort's running time.

Lemma 7.1
Let X be the number of comparisons performed in line 4 of PARTITION over the entire execution of QUICKSORT on an n-element array. Then the running time of QUICKSORT is $O(n + X)$.

We therefore focus on computing X. Let the elements of array A be $z_1, z_2, \ldots, z_n$, where $z_i$ is the i-th smallest element, and define the set $Z_{ij} = \{z_i, z_{i+1}, \ldots, z_j\}$ of elements between $z_i$ and $z_j$. Also define the indicator random variable $X_{ij} = I\{z_i \mbox{ is compared to } z_j\}$. Then
$$X = \sum_{i = 1}^{n-1}\sum_{j = i+1}^n X_{ij}$$
Taking expectations, we obtain
$$E[X] = E[\sum_{i = 1}^{n-1}\sum_{j = i+1}^n X_{ij}] = \sum_{i = 1}^{n-1}\sum_{j = i+1}^n E[X_{ij}] = \sum_{i = 1}^{n-1}\sum_{j = i+1}^n  \mbox{Pr} \{z_i \mbox{ is compared to } z_j\} $$
It remains to compute this probability. Suppose the input is $A = \{1, 2, \ldots, 10\}$ and the first pivot chosen is 7. PARTITION then splits A into two subproblems, $\{1, 2, 3, 4, 5, 6\}$ and $\{8, 9, 10\}$. The pivot 7 is compared with every number in both sets, but no number in the first set is ever compared with any number in the second set.
In general, once some pivot x with $z_i < x < z_j$ has been chosen, $z_i$ and $z_j$ can never be compared afterwards. So for a set $Z_{ij}$, $z_i$ and $z_j$ are compared if and only if the first element of $Z_{ij}$ chosen as a pivot is either $z_i$ or $z_j$. For example, once 7 is chosen as a pivot, the elements 2 and 9 of $Z_{2,9}$ will never be compared.
Since every element of $Z_{ij}$ is equally likely to be the first one of the set chosen as a pivot, each is chosen first with probability $1/(j - i + 1)$. So Pr{$z_i$ is compared to $z_j$} = Pr{$z_i$ or $z_j$ is the first pivot chosen from $Z_{ij}$} = Pr{$z_i$ is the first pivot chosen from $Z_{ij}$} + Pr{$z_j$ is the first pivot chosen from $Z_{ij}$} = $2/(j - i + 1)$. Hence we obtain:
$$E[X] = \sum_{i = 1}^{n-1}\sum_{j = i + 1}^n \frac{2}{j-i+1} < \sum_{i = 1}^{n-1}\sum_{j = i + 1}^n \frac{2}{j-i} = \sum_{i = 1}^{n-1}O(\lg n) = O(n \lg n)$$
This proves that the expected running time of quicksort with RANDOMIZED-PARTITION is $O(n \lg n)$.
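The double sum for $E[X]$ can also be evaluated numerically to watch the $n \lg n$ growth (a quick check of mine, with bounds I chose):

```python
from math import log, log2

def expected_comparisons(n):
    # E[X] = sum over 1 <= i < j <= n of 2 / (j - i + 1)
    return sum(2.0 / (j - i + 1)
               for i in range(1, n)
               for j in range(i + 1, n + 1))
```

For n = 1000 the value is about $1.1 \times 10^4$, between $n \lg n$ and $2n \ln n$.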

[Algo] Exercise 7.3

Exercise 7.3-1
Why do we analyze the average-case performance of a randomized algorithm and not its worst-case performance?
Solution:
Randomization does not improve the worst case itself; what it does is make it very unlikely that the worst case actually occurs for any particular input. That is why the expected (average-case) performance is the meaningful quantity to analyze.


Exercise 7.3-2
During the running of the procedure RANDOMIZED-QUICKSORT, how many calls are made to the random-number generator RANDOM in the worst case? How about in the best case? Give your answer in terms of $\Theta$-notation.
Solution:
There are $\Theta(n)$ calls to RANDOM in the worst case, and $\Theta(n)$ calls in the best case as well. Each call to PARTITION makes exactly one call to RANDOM to choose a pivot, and that pivot never participates in a later partition, so there are at most n calls. Conversely, the recursion must keep partitioning until every subarray has size at most 1, which requires at least $(n - 1)/2$ calls to PARTITION. Hence the number of calls to RANDOM is $\Theta(n)$ in both cases.