Data Structures and Algorithms -- CSCI 230
Algorithm Analysis

Motivating Examples


Here are three standard algorithms -- two searches and one sort -- which should be analyzed to determine their computational efficiency.



Sequential Search:

//  Sequentially search an array of n elements to determine if a given
//  value is there.  If so, set loc to be the first array location
//  containing it and return true.  Otherwise, return false.

bool
SeqSearch( float arr[], int n, float value, int & loc )
{
  loc=0;
  while ( loc<n && arr[loc] != value ) {
    ++ loc;
  }
  return loc<n;
}



Insertion Sort:

//  Sort an array of n elements using insertion sort.

void
InsertSort( float arr[], int n )
{
  for ( int i=1; i<n; i++ ) {
    float temp = arr[i];
    int j = i-1;
    while ( j>=0 && arr[j] > temp ) {
      arr[j+1] = arr[j];
      j -- ;
    }
    arr[j+1] = temp;
  }
}

Binary Search:

//  Use binary search to determine if a given value is somewhere in an  
//  ordered array of n elements.  If so, set loc to be the first array
//  location containing it and return true.  Otherwise, set loc to be
//  the array location where it should be inserted and return false.
//  The array ordering is assumed to be what's called "non-decreasing"
//  order, which means that 
//     arr[0] <= arr[1] <= ... <= arr[n-1]
//  or, more precisely,
//     for 0 <= i < n-1, arr[i] <= arr[i+1]

bool
BinSearch( float arr[], int n, float value, int & loc )
{
  int low = 0, high = n-1, mid;
    
  //  Before each iteration of the loop, the following
  //  conditions hold:
  //     0 <= low < high < n, 
  //     for each j, 0 <= j < low, arr[j] < value
  //     for each j, high <= j < n, value <= arr[j]
  //
  while ( low < high ) {
    mid = (low + high) / 2;
    if ( value <= arr[mid] )
      high = mid;
    else
      low = mid+1;
  }

  loc = low;
  if (arr[loc] == value)
    return true;
  else {
    if ( loc == n-1 && arr[n-1] < value ) loc = n;
    return false;
  }
}

Exercise


How many operations, as a function of the array size n, are required by SeqSearch. If you finish this, try to answer the same question for InsertSort. What issues arose in your discussion?

Algorithm Analysis Rules


Order Notation


Order notation is a mathematical formalism used to summarize the computation time required by an algorithm, simplifying the function derived to count the number of operations. On the positive side, this avoids quibbling over the number of operations (and cost) involved in simple algorithmic steps. On the negative side, this does result in some loss of imprecision.

Order Notation -- Rules for Manipulation


Order Notation Exerices


1.
Show that 5n2 + 6n = O(n2) using the original definition of ``O'' and then using limits.
2.
For each pair of functions, T(n) and f(n), determine which of the following hold:

\begin{displaymath}
T(n) = O(f(n)) \qquad
T(n) = \theta(f(n))\end{displaymath}

Justify your answer. (Assume k, a and b are unspecified constants greater than 1 and a > b.)
(a)
$T(n) = n^2 \displaystyle \log n + 5 n$,f(n) = n3
(b)
$T(n) = \displaystyle \log (n^k)$,$f(n) = (\displaystyle \log n)^k$

(c)
$T(n) = \displaystyle \log_a n$,$f(n) = \displaystyle \log_b n$

(d)
T(n) = 2n, f(n) = 2(2n).

3.
Give the best possible O estimate for T(n),
(a)
$T(n) = (n^3 + 10 n^2) \cdot ( n^3 \log n + 20 n^4 )$
(b)
$T(n) = n 3^n + n^{10} + 1500 n^3 \displaystyle \log n$.

(c)
$T(n) = \sum_{i=1}^n 5 i(i-1)$

4.
Derive an ``O'' estimate for the worst-case of InsertSort based on the function we derived in class.

Algorithm Analysis Exercises


1.
Count the number of operations in each of the following two code fragments as a function of n, the length of the array. Each should yield a summation. Then, analyze each summation to give the best possible ``O'' estimate for each fragment.
(a)
    //  assume  arr  is an array containing n integers
    int k = 5;
    for ( int i=0; i<=n-k; i++ ) {
      sum = 0;
      for ( int j=i; j<i+k; j++ ) {
        sum += arr[j];
      }
      cout << "Sum of elements " << i << " through "
           << i+k-1 << " is " << sum << "\n";
    }


(b)
    //  assume  arr  is an array containing n integers
    int k = n/2;
    for ( int i=0; i<=n-k; i++ ) {
      sum = 0;
      for ( int j=i; j<i+k; j++ ) {
        sum += arr[j];
      }
      cout << "Sum of elements " << i << " through "
           << i+k-1 << " is " << sum << "\n";
    }

2.
Rewrite the second code fragment to make it as efficient as possible. Start by thinking carefully about what it actually does! What is the complexity of your new code fragment?

3.
In the analysis of InsertSort we assumed that the worst-case would occur at all times. What must be the state of the array for the absolute maximum number of operations to occur? Repeat the analysis of InsertSort to derive average case and best case estimates? What state of the array causes the best-case to occur?

More advanced analysis:


Max Subsequence Sum


Exercises:


1.
Derive a recursive equation to analyze MergeSortand then solve this equation. Assume the array is of size n=2k for integer $k
\geq 0$.
2.
Find an efficient algorithm (along with a running time analysis) to find the minimum subsequence sum.

Review Problems


Here are a few review problems which have appeared on homeworks or tests in previous semesters. Practice writing solutions carefully and then compare to solutions provided on-line. If you can solve these problems and the problems we worked on in class then you are ready for the chapter quiz!

1.
Show that ${\displaystyle \sum_{i=1}^n 2 i^3 = O(n^4)}$.
2.
For each of the following, find f(n) such that t(n) = O( f(n)). Make f(n) as small and simple as possible, i.e. don't write t(n) = O(n4) when t(n) = O(n3). Justify your answers.
(a)
$ \displaystyle t(n) = 13 n^2 + 2^n $
(b)
$ \displaystyle t(n) = 5 (n + 3 \log n) (n \log n + 13) \log n +
13 n^2$
(c)
$ \displaystyle t(n) = \sum_{i=3}^{n} \sum_{j=i}^n i (n - j)$

3.
Exercise 2.6a from the text. Try to derive summations first. Note program fragment (6) is quite difficult.

4.
Derive a summation to count, as a function of n, the number of times Hello is output by each of the following code fragments. Obtain an accurate``O'' estimate from the summation.

(a)
    for ( i=1; i<=n; i++ ) 
      for ( j=1; j<=i; j++ )
        for ( k=j+1; k<=n; k++ )
          cout << "Hello\n";
(b)
For this part, assume n = 2k and assume the notation 2^i means 2i.
    for ( i=0; i<=k; i++ ) 
      for ( j=2^i+1; j<=n; j++ )
        cout << "Hello\n";

5.
Exercise 2.11 of the text.

6.
Write an algorithm that takes an unsorted list (provided as an array) of n floating point values and returns the smallest difference between any two values in the list. For example, for the list
        2.9, 3.5, 1.1, 6.1, 2.3, 1.8, 8.7, 3.0, 2.4,
the algorithm should return 0.1, which is the difference between 3.0 and 2.9. Make your algorithm as efficient as you can and give the worst-case running time of your algorithm as a function of n, briefly justifying your answer. Hint: you may change or reorganize the contents of the array.



 

Charles Stewart
9/3/1998