Question

In: Computer Science

Quick Sort as explained in class uses a Pivot value to arrangeall numbers greater than...

Quick Sort as explained in class uses a Pivot value to arrange all numbers greater than the pivot on one

side and all values smaller than the pivot on the other side. Pivot in the class example used the first

element of the array. “Median of three” rule uses the median of the first last and the middle elements

of the array as the Pivot so that we avoid the chance of picking the smallest or the largest element in the

array.

a) Write the status of the list F= (12,2,16, 30,8,28,4,10,20,6,18) at the end of each phase of

recursive Quick sort using median of three rule to determine the pivot key.

b) Show that recursive QuickSort takes O(n^2) time when input list is already in sorted order

c) Write a non recursive version of Quicksort incorporating the median of three rule to determine

the pivot key. You can use the F array above from a) for your testing.

d) Show that the non recursive version of Quicksort above takes O( n logn) time on already sorted

list.

Solutions

Expert Solution

Worst-case running time

When quicksort always has the most unbalanced partitions possible, then the original call takes cncnc, n time for some constant ccc, the recursive call on n-1n−1n, minus, 1elements takes c(n-1)c(n−1)c, left parenthesis, n, minus, 1, right parenthesis time, the recursive call on n-2n−2n, minus, 2 elements takes c(n-2)c(n−2)c, left parenthesis, n, minus, 2, right parenthesis time, and so on. Here's a tree of the subproblem sizes with their partitioning times:

Diagram of worst case performance for Quick Sort

When we total up the partitioning times for each level, we get

\begin{aligned} cn + c(n-1) + c(n-2) + \cdots + 2c &= c(n + (n-1) + (n-2) + \cdots + 2) \\ &= c((n+1)(n/2) - 1) \ . \end{aligned}cn+c(n−1)+c(n−2)+⋯+2c​=c(n+(n−1)+(n−2)+⋯+2)=c((n+1)(n/2)−1) .​

The last line is because 1 + 2 + 3 + \cdots + n1+2+3+⋯+n is the arithmetic series, as we saw when we analyzed selection sort. (We subtract 1 because for quicksort, the summation starts at 2, not 1.) We have some low-order terms and constant coefficients, but when we use big-Θ notation, we ignore them. In big-Θ notation, quicksort's worst-case running time is Θ(n^2).

Average-case running time

Showing that the average-case running time is also \Theta(n \log_2 n)Θ(nlog2​n) takes some pretty involved mathematics, and so we won't go there. But we can gain some intuition by looking at a couple of other cases to understand why it might be O(n \log_2 n)O(nlog2​n)O, left parenthesis, n, log, start subscript, 2, end subscript, n, right parenthesis. (Once we have O(n \log_2 n)O(nlog2​n)O, left parenthesis, n, log, start subscript, 2, end subscript, n, right parenthesis, the \Theta(n \log_2 n)Θ(nlog2​n) bound follows because the average-case running time cannot be better than the best-case running time.) First, let's imagine that we don't always get evenly balanced partitions, but that we always get at worst a 3-to-1 split. That is, imagine that each time we partition, one side gets 3n/43n/43, n, slash, 4 elements and the other side gets n/4n/4n, slash, 4. (To keep the math clean, let's not worry about the pivot.) Then the tree of subproblem sizes and partitioning times would look like this:

Diagram of average case performance for Quick Sort

The left child of each node represents a subproblem size 1/4 as large, and the right child represents a subproblem size 3/4 as large. Since the smaller subproblems are on the left, by following a path of left children, we get from the root down to a subproblem size of 1 faster than along any other path. As the figure shows, after \log_4 nlog4​nlog, start subscript, 4, end subscript, n levels, we get down to a subproblem size of 1. Why \log_4 nlog4​nlog, start subscript, 4, end subscript, n levels? It might be easiest to think in terms of starting with a subproblem size of 1 and multiplying it by 4 until we reach nnn. In other words, we're asking for what value of xxx is 4^x = n4x=n4, start superscript, x, end superscript, equals, n? The answer is \log_4 nlog4​nlog, start subscript, 4, end subscript, n. How about going down a path of right children? The figure shows that it takes \log_{4/3} nlog4/3​nlog, start subscript, 4, slash, 3, end subscript, nlevels to get down to a subproblem of size 1. Why \log_{4/3} nlog4/3​nlog, start subscript, 4, slash, 3, end subscript, n levels? Since each right child is 3/4 of the size of the node above it (its parent node), each parent is 4/3 times the size of its right child. Let's again think of starting with a subproblem of size 1 and multiplying the size by 4/3 until we reach nnn. For what value of xxx is (4/3)^x = n(4/3)x=nleft parenthesis, 4, slash, 3, right parenthesis, start superscript, x, end superscript, equals, n? The answer is \log_{4/3} nlog4/3​nlog, start subscript, 4, slash, 3, end subscript, n.

In each of the first \log_4 nlog4​nlog, start subscript, 4, end subscript, n levels, there are nnn nodes (again, including pivots that in reality are no longer being partitioned), and so the total partitioning time for each of these levels is cncnc, n. But what about the rest of the levels? Each has fewer than n nodes, and so the partitioning time for every level is at most cncnc, n. Altogether, there are \log_{4/3} nlog4/3​nlog, start subscript, 4, slash, 3, end subscript, n levels, and so the total partitioning time is O(n \log_{4/3} n)O(nlog4/3​n)O, left parenthesis, n, log, start subscript, 4, slash, 3, end subscript, n, right parenthesis. Now, there's a mathematical fact that

\displaystyle \log_a n = \frac{\log_b n}{\log_b a}logan=logbalogbn​log, start subscript, a, end subscript, n, equals, start fraction, log, start subscript, b, end subscript, n, divided by, log, start subscript, b, end subscript, a, end fraction

for all positive numbers aaa, bbb, and nnn. Letting a = 4/3a=4/3a, equals, 4, slash, 3 and b = 2b=2b, equals, 2, we get that

\displaystyle \log_{4/3} n = \dfrac{\log_2 n}{\log_2 (4/3)} \ ,log4/3​n=log2​(4/3)log2​n​ ,log, start subscript, 4, slash, 3, end subscript, n, equals, start fraction, log, start subscript, 2, end subscript, n, divided by, log, start subscript, 2, end subscript, left parenthesis, 4, slash, 3, right parenthesis, end fraction, space, comma

and so \log_{4/3} nlog4/3​nlog, start subscript, 4, slash, 3, end subscript, n and \log_2 nlog2​nlog, start subscript, 2, end subscript, n differ by only a factor of \log_2 (4/3)log2​(4/3)log, start subscript, 2, end subscript, left parenthesis, 4, slash, 3, right parenthesis, which is a constant. Since constant factors don't matter when we use big-O notation, we can say that if all the splits are 3-to-1, then quicksort's running time is O(n \log_2 n)O(nlog2​n)O, left parenthesis, n, log, start subscript, 2, end subscript, n, right parenthesis, albeit with a larger hidden constant factor than the best-case running time.

How often should we expect to see a split that's 3-to-1 or better? It depends on how we choose the pivot. Let's imagine that the pivot is equally likely to end up anywhere in an nnn-element subarray after partitioning. Then to get a split that is 3-to-1 or better, the pivot would have to be somewhere in the "middle half":

So, if the pivot is equally likely to end up anywhere in the subarray after partitioning, there's a 50% chance of getting at worst a 3-to-1 split. In other words, we expect a split of 3-to-1 or better about half the time.

The other case we'll look at to understand why quicksort's average-case running time is O(n \log_2 n)O(nlog2​n)O, left parenthesis, n, log, start subscript, 2, end subscript, n, right parenthesis is what would happen if the half of the time that we don't get a 3-to-1 split, we got the worst-case split. Let's suppose that the 3-to-1 and worst-case splits alternate, and think of a node in the tree with kkkelements in its subarray. Then we'd see a part of the tree that looks like this:

instead of like this:

Therefore, even if we got the worst-case split half the time and a split that's 3-to-1 or better half the time, the running time would be about twice the running time of getting a 3-to-1 split every time. Again, that's just a constant factor, and it gets absorbed into the big-O notation, and so in this case, where we alternate between worst-case and 3-to-1 splits, the running time is O(n \log_2 n)O(nlog2​n)O, left parenthesis, n, log, start subscript, 2, end subscript, n, right parenthesis.

Bear in mind that this analysis is not mathematically rigorous, but it gives you an intuitive idea of why the average-case running time might be O(n \log_2 n)O(nlog2​n)O, left parenthesis, n, log, start subscript, 2, end subscript, n, right parenthesis.


Related Solutions

use quick sort to sort the following array. show each pass and what the pivot is...
use quick sort to sort the following array. show each pass and what the pivot is during the pass. please explain why you are swapping positions. do not use a compiler. 30 5 40 11 20 9 15 2 60 25 80 3 73 35 4 75 20 6
Modify the quicksort algorithm such that it uses the last item as the pivot instead of the 1st. Also, sort in descending order, instead of ascending order.
Programming Language : JavaModify the quicksort algorithm such that it uses the last item as the pivot instead of the 1st. Also, sort in descending order, instead of ascending order.NOTE: Do not move the last element into the first element of the array. You must treat the algorithm as if the pivot is actually sitting in the last location of the array.After it has been sorted in descending order, go through all the items in the array and make sure...
Why is the future value of annuity due always greater than the future value of an...
Why is the future value of annuity due always greater than the future value of an ordinary annuity assuming everything equals 2? Why the present value of an annuity due is always greater that than the future value of an ordinary annuity assuming everything else equals?
. Women experience poverty in greater numbers than men for many reasons. Identify and discuss two...
. Women experience poverty in greater numbers than men for many reasons. Identify and discuss two (2) of the main reasons.
C++ Programa that: // 1) Ask the user to enter two numbers greater than zero. Store...
C++ Programa that: // 1) Ask the user to enter two numbers greater than zero. Store the //    values into two variables named 'n' and 'm'. //    Additionally, implement a while-loop to keep asking the user to enter //    the numbers again if any of the two numbers are equal or smaller than zero. // 2) If the two numbers (n, m) are equal, display the message "The //    numbers are the same". If the two numbers are different, display...
Supply is elastic whenever the value of the elasticity of supply is positive and greater than...
Supply is elastic whenever the value of the elasticity of supply is positive and greater than 1. a) True b) False The price elasticity of supply is calculated as the change in supply divided by the change in price. a) True b) False If the income elasticity for canned food is 0.8, then canned food is an inferior good. a) True b) False The demand for heating oil in the short run is more elastic than the long run demand...
2. As long as the interest rate is greater than zero, the present value of a...
2. As long as the interest rate is greater than zero, the present value of a single sum will always: Select one: a. Equal the future value if the time period is one year. b. Increase as the interest rate increases. c. Be more than the future value. d. Increase as the number of periods decreases. e. Decrease as the number of periods increases. 3. A project costs $525 and has cash flows of $100 for the first three years...
The value of a college degree is greater than it has been in nearly half a...
The value of a college degree is greater than it has been in nearly half a century, a least when compared to the prospect of not getting a degree (www.pewresearch.org, January 28, 2014). Due to this fact, more and more people are obtaining college degrees, despite the soaring costs. The accompanying table shows the proportions of college degrees awarded in 2010 by colleges and universities, categorized by a graduate's race and ethnicity. The race and ethnicity of 500 recent graduates...
The number that represents the value that 95% of all expected outcomes should be greater than...
The number that represents the value that 95% of all expected outcomes should be greater than is the ___________. a. VaR(0.95) b. volatility c. VaR(0.05) d. Sharpe ratio
If the billable hours for that individual time entry is greater than 3, set the value...
If the billable hours for that individual time entry is greater than 3, set the value of the cell to "High Hours". Otherwise, set the cell equal to "Normal Hours. This is an excel worksheet. How do I set the formula for this?
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT