CLRS Chapter 2 problem solutions
2-1 Insertion sort on small arrays in merge sort
Although merge sort runs in \(\Theta(n \lg n)\) worst-case time and insertion sort runs in \(\Theta(n^2)\) worst-case time, the constant factors in insertion sort can make it faster in practice for small problem sizes on many machines. Thus, it makes sense to coarsen the leaves of the recursion by using insertion sort within merge sort when subproblems become sufficiently small. Consider a modification to merge sort in which \(n/k\) sublists of length \(k\) are sorted using insertion sort and then merged using the standard merging mechanism, where \(k\) is a value to be determined.
Show that insertion sort can sort the \(n/k\) sublists, each of length \(k,\) in \(\Theta(nk)\) worst-case time.
Show how to merge the sublists in \(\Theta(n \lg(n/k))\) worst-case time.
Given that the modified algorithm runs in \(\Theta(nk + n\lg(n/k))\) worst-case time, what is the largest value of \(k\) as a function of \(n\) for which the modified algorithm has the same running time as standard merge sort, in terms of \(\Theta\)-notation?
How should we choose \(k\) in practice?
Sol’n
Insertion sort sorts each list of length \(k\) in \(\Theta(k^2)\) worst-case time. There are \(n/k\) of them, so the total time to sort all the sublists is \(\Theta(k^2 \cdot n/k) = \Theta(nk).\)
Each level of merging takes \(\Theta(n)\) time, and since we stop dividing once sublists have length \(k,\) there are \(\lg(n/k)\) levels of merging, so the total time to merge is \(\Theta(n \lg (n/k)).\)
For small values of \(k\), the right term \(n\lg(n/k)\) dominates. Since we want the largest \(k\) for which the total is still \(\Theta(n\lg n),\) we look at when the left term \(nk\) reaches that bound, which gives \(k = \Theta(\lg n);\) at \(k = \lg n\) the right term \(n \lg(n/\lg n)\) is indeed the smaller of the two.
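Spelling out the arithmetic behind this choice: with \(k = \lg n,\)
\[ \begin{align*} nk + n\lg(n/k) & = n\lg n + n\lg\left(\frac{n}{\lg n}\right) \\ & = n\lg n + n\lg n - n\lg\lg n \\ & = \Theta(n\lg n), \end{align*} \]
while for any \(k = \omega(\lg n)\) the first term alone is \(nk = \omega(n\lg n).\) So \(k = \Theta(\lg n)\) is the largest choice that preserves the \(\Theta(n\lg n)\) running time.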
My guess is \(k = \lg \lg n;\) at least this gives terms I believe are strictly smaller than for, say, \(k = 1\) or \(k = \lg n,\) but it didn't immediately occur to me how to optimise more rigorously. In practice, since the whole point is the constant factors, \(k\) would be chosen empirically: roughly, the largest sublist length at which insertion sort still beats merge sort on the target machine.
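As a concrete illustration of the modified algorithm (a sketch, not CLRS's pseudocode): a Python version in which subproblems of length at most \(k\) are handed to insertion sort. The default cutoff of 16 and the function names are arbitrary choices of mine.

```python
def insertion_sort(a, lo, hi):
    # Sort a[lo:hi] in place by insertion sort.
    for i in range(lo + 1, hi):
        key = a[i]
        j = i - 1
        while j >= lo and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key

def merge(a, lo, mid, hi):
    # Standard merge of the sorted runs a[lo:mid] and a[mid:hi].
    left, right = a[lo:mid], a[mid:hi]
    i = j = 0
    k = lo
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1
        k += 1
    a[k:hi] = left[i:] + right[j:]

def merge_sort(a, lo=0, hi=None, k=16):
    # Merge sort with coarsened leaves: recursion stops once a
    # subproblem has length <= k and insertion sort takes over.
    if hi is None:
        hi = len(a)
    if hi - lo <= k:
        insertion_sort(a, lo, hi)
        return
    mid = (lo + hi) // 2
    merge_sort(a, lo, mid, k)
    merge_sort(a, mid, hi, k)
    merge(a, lo, mid, hi)

xs = [5, 2, 4, 7, 1, 3, 2, 6]
merge_sort(xs, k=3)
assert xs == [1, 2, 2, 3, 4, 5, 6, 7]
```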
2-2 Correctness of bubblesort
Bubblesort is a popular, but inefficient, sorting algorithm. It works by repeatedly swapping adjacent elements that are out of order.
BUBBLESORT(A)
1 for i = 1 to A.length - 1
2     for j = A.length downto i + 1
3         if A[j] < A[j - 1]
4             exchange A[j] with A[j - 1]
- Let \(A'\) denote the output of \(BUBBLESORT(A).\) To prove that \(BUBBLESORT\) is correct, we need to prove that it terminates and that
\[ A'[1] \le A'[2] \le \cdots \le A'[n]. \tag{2.3} \]
where \(n = A.length.\) In order to show that \(BUBBLESORT\) actually sorts, what else do we need to prove? The next two parts will prove inequality \((2.3).\)
State precisely a loop invariant for the for loop in lines 2–4, and prove that this loop invariant holds. Your proof should use the structure of the loop invariant proof presented in this chapter.
Using the termination condition of the loop invariant proved in part (b), state a loop invariant for the for loop in lines 1–4 that will allow you to prove inequality \((2.3).\) Your proof should use the structure of the loop invariant proof presented in this chapter.
What is the worst-case running time of bubblesort? How does it compare to the running time of insertion sort?
Sol’n
We also need to prove that \(A'\) is a permutation of \(A,\) i.e. that the output consists of exactly the elements originally in \(A.\)
Loop invariant:
At the beginning of each iteration of the loop, \(A[j] \le \min A[j + 1..n].\)
Initialization:
Before the first iteration, \(j = A.length = n,\) so we trivially have
\[ \begin{align*} A[j] & \le \min A[j + 1..n] \\ & = \min A[n + 1..n] \\ & = \min \emptyset \\ & = \infty. \end{align*} \]
Maintenance:
During the iteration for a given \(j,\) we start with \(A[j] \le \min A[j + 1..n].\) The conditional on lines 3 and 4 places the smaller of \(A[j - 1]\) and \(A[j]\) in position \(j - 1,\) so afterwards \(A[j - 1] \le \min A[j..n],\) which is exactly the invariant for the next iteration, where \(j\) has been decremented by one.
Termination:
The loop terminates when \(j = i,\) at which point the invariant gives \(A[i] \le \min A[i + 1..n];\) that is, \(A[i]\) is the smallest element of \(A[i..n],\) which is exactly what the outer loop's invariant will need.
- Loop invariant:
At the beginning of each iteration of the outer loop, the subarray \(A[1..i - 1]\) is sorted, and no element of \(A[1..i - 1]\) is bigger than any element of \(A[i..n].\)
Initialization:
For \(i = 1,\) the subarray \(A[1..0]\) is empty, so both properties hold vacuously.
Maintenance:
We fix \(i\) and assume that \(A[1..i - 1]\) is sorted and that none of its elements is bigger than any element of \(A[i..n].\) By the termination result of part (b), after the inner loop finishes, \(A[i]\) is no bigger than any element of \(A[i + 1..n].\) Since \(A[1..i - 1]\) is untouched by the inner loop and its elements were already no bigger than anything in \(A[i..n],\) the subarray \(A[1..i]\) is sorted and no element of it is bigger than any element of \(A[i + 1..n],\) which is the invariant for the next iteration.
Termination:
The outer loop terminates when \(i = n.\) The invariant then says that \(A[1..n - 1]\) is sorted and that none of its elements is bigger than \(A[n],\) so \(A = A[1..n]\) is sorted, which is inequality \((2.3).\)
- The inner loop runs \(n - i\) times for each \(i,\) so the total number of comparisons is \((n - 1) + (n - 2) + \cdots + 1 = n(n - 1)/2,\) giving a worst-case running time of \(\Theta(n^2),\) the same asymptotic worst case as insertion sort. Note, though, that this bubblesort performs \(\Theta(n^2)\) comparisons even on already-sorted input, whereas insertion sort runs in \(\Theta(n)\) time in its best case.
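As a quick sanity check on the two invariants (not a substitute for the proofs), here is a 0-indexed Python translation of BUBBLESORT with each invariant written as an assertion; the shift from 1-based to 0-based indexing is mine.

```python
def bubblesort(a):
    # CLRS BUBBLESORT, 0-indexed, with the loop invariants as assertions.
    n = len(a)
    for i in range(n - 1):
        # Outer invariant: a[:i] is sorted and no element of it
        # is bigger than any element of a[i:].
        assert a[:i] == sorted(a[:i])
        assert not a[:i] or max(a[:i]) <= min(a[i:])
        for j in range(n - 1, i, -1):
            # Inner invariant: a[j] is no bigger than anything in a[j+1:].
            assert not a[j + 1:] or a[j] <= min(a[j + 1:])
            if a[j] < a[j - 1]:
                a[j], a[j - 1] = a[j - 1], a[j]
    return a

assert bubblesort([5, 2, 4, 6, 1, 3]) == [1, 2, 3, 4, 5, 6]
```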
2-3 Correctness of Horner’s rule
The following code fragment implements Horner’s rule for evaluating a polynomial
\[ \begin{align*} P(x) & = \sum_{k = 0}^n a_k x^k \\ & = a_0 + x(a_1 + x(a_2 + \cdots + x(a_{n - 1} + xa_n)\cdots)), \end{align*} \]
given the coefficients \(a_0, a_1, \dots, a_n\) and a value for \(x.\)
1 y = 0
2 for i = n downto 0
3     y = a_i + x·y
In terms of \(\Theta\)-notation, what is the running time of this code fragment for Horner’s rule?
Write pseudocode to implement the naive polynomial-evaluation algorithm that computes each term of the polynomial from scratch. What is the running time of this algorithm? How does it compare to Horner’s rule?
Consider the following loop invariant: At the start of each iteration of the for loop of lines 2–3,
\[ y = \sum_{k = 0}^{n - (i + 1)} a_{k + i + 1}x^k. \]
Interpret a summation with no terms as equaling 0. Following the structure of the loop invariant proof presented in this chapter, use this loop invariant to show that, at termination, \(y = \sum_{k = 0}^n a_k x^k.\)
- Conclude by arguing that the given code fragment correctly evaluates a polynomial characterized by the coefficients \(a_0, a_1, \dots, a_n.\)
Sol’n
\(\Theta(n):\) the loop body does a constant amount of work and executes \(n + 1\) times.
P(x)
  result = 0
  for k = 0 to n
      term = a_k
      for i = 1 to k
          term = term * x
      result = result + term
  return result
\(\Theta(n^2):\) computing the term for \(x^k\) from scratch takes \(k\) multiplications, so the total is \(\sum_{k = 0}^{n} k = n(n + 1)/2\) multiplications, compared with the \(n\) multiplications of Horner's rule.
TODO
TODO
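As an empirical check of the invariant in part (c) (not a proof), here is a Python sketch with 0-indexed coefficients, so that `a[k]` holds \(a_k\); the assertion restates the invariant, and `naive_eval` mirrors the pseudocode above.

```python
def horner(a, x):
    # Evaluate sum of a[k] * x**k by Horner's rule.
    n = len(a) - 1
    y = 0
    for i in range(n, -1, -1):
        # Invariant from part (c): y == sum_{k=0}^{n-(i+1)} a[k+i+1] * x**k.
        assert y == sum(a[k + i + 1] * x**k for k in range(n - i))
        y = a[i] + x * y
    return y

def naive_eval(a, x):
    # Naive evaluation from part (b): build each term x**k from scratch.
    result = 0
    for k, coeff in enumerate(a):
        term = coeff
        for _ in range(k):
            term *= x
        result += term
    return result

coeffs = [3, -1, 0, 2]                       # P(x) = 3 - x + 2x^3
assert horner(coeffs, 5) == naive_eval(coeffs, 5) == 248
```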
2-4 Inversions
Let \(A[1..n]\) be an array of \(n\) distinct numbers. If \(i < j\) and \(A[i] > A[j],\) then the pair \((i, j)\) is called an inversion of \(A.\)
List the five inversions of the array \(\langle 2, 3, 8, 6, 1 \rangle.\)
What array with elements from the set \(\\{1, 2, \dots, n\\}\) has the most inversions? How many does it have?
What is the relationship between the running time of insertion sort and the number of inversions in the input array? Justify your answer.
Give an algorithm that determines the number of inversions in any permutation on \(n\) elements in \(\Theta(n \lg n)\) worst-case time. (Hint: Modify merge sort.)
Sol’n
\((1, 5), (2, 5), (3, 4), (3, 5), (4, 5),\) corresponding to the out-of-order value pairs \((2, 1), (3, 1), (8, 6), (8, 1), (6, 1).\)
The array \(\langle n, n - 1, \dots, 1 \rangle\) has the most inversions: every pair of indices \(i < j\) is an inversion, so the number of inversions is \((n - 1) + (n - 2) + \cdots + 1 = n(n - 1)/2.\)
Insertion sort runs in \(\Theta(n + I)\) time, where \(I\) is the number of inversions: each execution of the body of its inner while loop moves one element that is greater than the current key past it, removing exactly one inversion, so the inner loop runs \(I\) times in total on top of the \(\Theta(n)\) outer loop.
Modify merge sort: during each merge, whenever an element of the right half is copied out before remaining elements of the left half, it is smaller than all of them, so add the number of elements still remaining in the left half to a running count. This adds only constant work per copied element, so the algorithm still runs in \(\Theta(n \lg n)\) worst-case time; a sketch follows below.
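A minimal Python sketch of that merge-sort-based count (the function name is my own, not from CLRS); it returns a sorted copy of the input together with the number of inversions.

```python
def count_inversions(a):
    # Return (sorted copy of a, number of inversions in a) in O(n lg n) time.
    if len(a) <= 1:
        return list(a), 0
    mid = len(a) // 2
    left, inv_left = count_inversions(a[:mid])
    right, inv_right = count_inversions(a[mid:])
    merged, inv_split = [], 0
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # right[j] is smaller than every element remaining in left,
            # so it forms an inversion with each of them.
            inv_split += len(left) - i
            merged.append(right[j])
            j += 1
    merged += left[i:] + right[j:]
    return merged, inv_left + inv_right + inv_split

assert count_inversions([2, 3, 8, 6, 1])[1] == 5   # matches part (a)
```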