CSCI 2300 Lab 3, Fall 2017, due Oct 18

In this lab you are asked to implement three different algorithms for median finding, or more generally for selection of the k'th smallest element in an unordered array. You will then be asked to compare these algorithms.

The three algorithms have the same input and output:

Algorithm 1: Randomized selection

This is the algorithm we cover in class. It is described in Section 2.4 of the textbook.

Algorithm 2: Deterministic selection (median of medians)

Although the randomized selection algorithm runs in time O(n) in expectation, it is actually possible to design a deterministic algorithm for selection which also runs in time O(n). The main difference between the deterministic selection algorithm and the randomized selection algorithm in Section 2.4 of the textbook is how to choose the pivot. In Algorithm 2, divide A into groups of 5 elements (except the last group), then compute the medians M[1,..,n/5] of these groups using sorting (which takes O(1) per group), then use the selection algorithm recursively to compute the median of M[1,..,n/5] and use this value as the pivot to divide A into Sl, Sv, and Sr as before. When the size of input A is not too large (e.g., no more than 10) you are allowed to use any sorting algorithm (quicksort, mergesort, bubblesort, etc.) In you need more details, feel free to look at the following Lecture Notes.

Algorithm 3: Sorting

Implement Quicksort (page 56 of textbook), then use it to sort A, and return the k'th smallest element. Note that most of the code for Quicksort should be very similar to your Algorithm 1. Quicksort runs in expected O(n log n) time.

  1. Show your code and the outputs for [7, 2, 4, 6, 9, 11, 2, 6, 10, 6, 15, 6, 14, 2, 7, 5, 13, 9, 12, 15], k= 10 to your TA.
  2. Generate a random list of n=10^7 integers, with each random integer being uniformly distributed between 0 and n/100. Use your three algorithms to compute the median of this list (the n/2 smallest element), and time the running time of each algorithm. Depending on your machine and implementation, this may take a few minutes; we suggest starting with smaller n first to make sure everything is working properly. Which algorithm performs the best? What if you make the list size n even larger?