Lecture 24 — Problem Solving and Design

Overview

This is the last “official” set of lecture notes. Material from these notes will be on the final.

  • Design:
    • Choice of container/data structure; choice of algorithm
    • Implementation
    • Testing
    • Debugging
  • We will discuss these in the context of several variations on one problem:
    • Finding the mode in a sequence of values: the value (or values) occurring most often.
  • Our discussion is loosely based on Chapter 12, but there are many things in this chapter we will skip:
    • Discussion of functions, default parameters, variable numbers of arguments
    • Exceptions
    • Specific design patterns
  • We will start with a completely blank slate so that the whole process unfolds from scratch. This includes looking for other code to adapt.

Problem: Finding the Mode

  • Given a series of values, find the one that occurs most often.
  • Variation 1: is there a limited, indexable range of values?
    • Examples that are consistent with this variation include test scores or letters of the alphabet
    • Examples not consistent include counting words and counting amino acids
  • Variation 2: do we want just the modes or do we want to know how many times each value occurs?
  • Variation 3: do we want a histogram where values are grouped?
    • Examples: ocean temperature measurements, pixel intensities, income values.
    • In each of these cases, the number of occurrences of a specific value, such as an ocean temperature of 2.314C, is not really of interest. More important is the number of values in certain ranges.
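
To make Variation 1 concrete, here is a minimal sketch for the case where the values are integers in a small, known range, so each value can serve directly as a list index. The function name `modes_by_index` and the default range 0..100 are assumptions made for illustration; they are not taken from the lecture.

```python
def modes_by_index(scores, max_score=100):
    """Return (modes, count) for integer scores in the range 0..max_score.

    Assumes every score is an integer in that range. The name and the
    default range are illustrative only.
    """
    counts = [0] * (max_score + 1)   # counts[v] = number of times v occurs
    for s in scores:
        counts[s] += 1
    best = max(counts)
    if best == 0:                    # empty input: there is no mode
        return [], 0
    modes = [v for v, c in enumerate(counts) if c == best]
    return modes, best


if __name__ == "__main__":
    print(modes_by_index([85, 90, 85, 70, 90, 100]))   # ([85, 90], 2)
```

The `counts` list also answers Variation 2, since it records how many times each value occurs. For values that are not indexable, such as words, a dictionary keyed by the value could take the place of the list.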

Our Focus: A Sequence of Numbers

  • Integers, such as test scores
  • Floats, such as temperature measurements

Sequence of Discussion

  • Brainstorm ideas for the basic approach. We’ll come up with at least three.
  • Algorithm / implementation
  • Testing
    • Generate test cases
    • Which test cases we generate will depend on the choice of algorithm. We will combine the cases from all of the approaches into a single set.
  • Debugging:
    • If we find a failed test case, we will need to find the error and fix it.
    • Use a combination of carefully reading the code, working with a debugger, and generating print statements.
  • Evaluation:
    • Theoretical
    • Experimental timing
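
The steps above can be sketched end to end. The notes do not say which approaches the class will settle on, so the two shown here (sort and scan runs, and count in a dictionary) are assumptions, as are all function names and test cases. Theoretically, the sorting version costs O(n log n) and the dictionary version is expected O(n); the timing code at the bottom is one rough way to check this experimentally.

```python
import random
import time


def modes_sorted(values):
    """Sort a copy, then scan runs of equal values. Returns modes in sorted order."""
    ordered = sorted(values)
    modes, best = [], 0
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        run = j - i                      # length of the run starting at i
        if run > best:
            modes, best = [ordered[i]], run
        elif run == best:
            modes.append(ordered[i])
        i = j
    return modes


def modes_dict(values):
    """Count occurrences in a dictionary. Returns modes in sorted order."""
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    if not counts:
        return []
    best = max(counts.values())
    return sorted(v for v, c in counts.items() if c == best)


# Hypothetical test cases: empty, single value, unique mode, tie,
# all values distinct, and floats.
TEST_CASES = [
    ([], []),
    ([7], [7]),
    ([3, 1, 3, 2], [3]),
    ([2, 5, 2, 5, 9], [2, 5]),
    ([4, 6, 8], [4, 6, 8]),
    ([1.5, 2.5, 1.5], [1.5]),
]


def run_tests():
    """Run every test case against every implementation."""
    for values, expected in TEST_CASES:
        for f in (modes_sorted, modes_dict):
            result = f(values)
            assert result == expected, (f.__name__, values, result)
    return True


def time_it(f, values):
    """Return the elapsed seconds for one call of f(values)."""
    start = time.perf_counter()
    f(values)
    return time.perf_counter() - start


if __name__ == "__main__":
    run_tests()
    data = [random.randint(0, 100) for _ in range(100_000)]
    for f in (modes_sorted, modes_dict):
        print(f.__name__, round(time_it(f, data), 4), "seconds")
```

Running the same test cases against both implementations means a case that targets one algorithm (for example, a tie at the end of the sorted list) also checks the other. When a test fails, the assertion message names the function, the input, and the result, which gives a starting point for reading the code or stepping through it in a debugger.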

Discussion of Variations

  • Frequency of occurrence:
    • What are the ten most frequently occurring values? What are the top ten percent most frequent values?
    • Output the occurrences for each value.
  • Clusters / histograms:
    • Test scores in each range of 10
  • Quantiles: bottom 25% of scores, median, top 25%
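
Each of these variations can be sketched in a few lines using the standard library. The function names and the sample scores below are made up for illustration, and `quartile_groups` uses one simple convention (the lowest and highest quarter of the sorted list, rounding the group size down) among several reasonable ones.

```python
from collections import Counter
import statistics


def top_k(values, k):
    """Return the k most frequent values as (value, count) pairs."""
    return Counter(values).most_common(k)


def histogram_by_tens(scores):
    """Map the low end of each range of 10 to the number of scores in it."""
    bins = {}
    for s in scores:
        low = (s // 10) * 10             # 71 -> 70, 88 -> 80, ...
        bins[low] = bins.get(low, 0) + 1
    return dict(sorted(bins.items()))


def quartile_groups(scores):
    """Return (bottom 25% of scores, median, top 25% of scores).

    Assumes scores is non-empty; statistics.median raises an error otherwise.
    """
    ordered = sorted(scores)
    k = len(ordered) // 4                # size of each outer group
    bottom = ordered[:k]
    top = ordered[len(ordered) - k:]
    return bottom, statistics.median(ordered), top


if __name__ == "__main__":
    scores = [71, 88, 75, 91, 62, 75, 83, 55]   # made-up sample data
    print(top_k(scores, 1))              # [(75, 2)]
    print(histogram_by_tens(scores))     # {50: 1, 60: 1, 70: 3, 80: 2, 90: 1}
    print(quartile_groups(scores))       # ([55, 62], 75.0, [88, 91])
```

For the "top ten percent" question, one option is to call `top_k` with `k` set to a tenth of the number of distinct values, though how ties at the cutoff should be handled is a design decision worth discussing.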

Practice Problems

  • These will be generated after class based on how the discussion progresses.

Summary of Problem Solving Steps

  1. Understand the problem: play with examples, ask (yourself) questions, consider variations.
  2. Think about the core approach, algorithm(s) and data structure(s).
    • Several choices are often possible
    • The choice dictates the details of what follows.
  3. Narrow choices based on considerations of efficiency, clarity and ease of implementation.
  4. Gradually progress from a sketch of your ideas, to a detailed algorithm, to an implementation.
  5. Generate test cases
  6. Test and debug