Lecture 24 — Problem Solving and Design
=========================================
Overview
--------
This is the last "official" set of lecture notes. Material from these
notes will be on the final.
- Design:
- Choice of container/data structure; choice of algorithm
- Implementation
- Testing
- Debugging
- We will discuss these in the context of several variations on one
problem:
- Finding the mode in a sequence of values: the value (or values)
occurring most often.
- Our discussion is loosely based on Chapter 12, but there are many
things in this chapter we will skip:
- Discussion of functions, default parameters, variable numbers of
arguments
- Exceptions
- Specific design patterns
- We will start with a completely blank slate so that the whole process
unfolds from scratch. This includes looking for other code to adapt.
Problem: Finding the Mode
-------------------------
- Given a series of values, find the one that occurs most often.
- Variation 1: is there a limited, indexable range of values?
- Examples that are consistent with this variation include test
scores or letters of the alphabet
- Examples not consistent include counting words and counting amino
acids
- Variation 2: do we want just the modes or do we want to know how many
times each value occurs?
- Variation 3: do we want a histogram where values are grouped?
- Examples: ocean temperature measurements, pixel intensities, income
values.
- In each of these cases, the number of occurrences of a specific
value, such as an ocean temperature of 2.314C, is not really of
interest. More important is the number of values that fall in
certain ranges.
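
As a rough illustration of Variations 1 and 3, the sketch below counts
integer scores by using each score as a list index, and groups float
measurements into equal-width ranges. The function names, the 0..100
score range, and the bin parameters are assumptions made for this
example; they are not fixed by the problem statement.

```python
def modes_by_index(scores, max_score=100):
    """Return all modes of integer scores, assumed to lie in 0..max_score."""
    if not scores:
        return []
    counts = [0] * (max_score + 1)    # counts[v] = number of times v occurs
    for s in scores:
        counts[s] += 1                # assumes 0 <= s <= max_score
    best = max(counts)
    return [v for v, c in enumerate(counts) if c == best]


def grouped_counts(values, low, width, num_bins):
    """Count the values in each range [low + i*width, low + (i+1)*width)."""
    bins = [0] * num_bins
    for v in values:
        i = int((v - low) // width)
        if 0 <= i < num_bins:         # values outside all ranges are ignored
            bins[i] += 1
    return bins


print(modes_by_index([90, 85, 90, 70, 85]))
print(grouped_counts([2.314, 2.9, 3.1, 4.75], low=2.0, width=1.0, num_bins=3))
```

The list-index approach only works when the values are small
non-negative integers (or can be mapped to them), which is why words
and similar values call for a different container.
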
Our Focus: A Sequence of Numbers
--------------------------------
- Integers, such as test scores
- Floats, such as temperature measurements
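
One possible starting point for a sequence of numbers is to count
occurrences in a dictionary. This is only a sketch of one approach,
and ``find_modes`` is a name chosen for this example. Note that floats
are compared for exact equality here, so measurements such as 2.314 and
2.3140001 count as different values.

```python
def find_modes(values):
    """Return a sorted list of the value(s) that occur most often."""
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1   # start at 0 the first time v is seen
    if not counts:
        return []                          # an empty sequence has no mode
    best = max(counts.values())
    return sorted(v for v, c in counts.items() if c == best)


print(find_modes([3, 1, 3, 2, 1]))      # integers, two modes
print(find_modes([2.5, 2.5, 1.0]))      # floats, one mode
```
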
Sequence of Discussion
----------------------
- Brainstorm ideas for the basic approach. We'll come up with at least
three.
- Algorithm / implementation
- Testing
- Generate test cases
- Which test cases we generate will depend on the choice of
algorithm. We will combine the test cases from all of the algorithms.
- Debugging:
- If we find a failed test case, we will need to find the error and
fix it.
- Use a combination of carefully reading the code, working with a
debugger, and adding print statements.
- Evaluation:
- Theoretical
- Experimental timing
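
The steps above can be sketched end to end for one candidate approach:
sort the values, then scan for the longest runs of equal values. The
function name ``modes_sorted``, the particular test cases, and the size
of the timing data are all assumptions made for illustration; the
approaches developed in class may differ.

```python
import random
import time


def modes_sorted(values):
    """Find all modes by sorting, then scanning runs of equal values."""
    s = sorted(values)
    modes = []
    best = 0                      # length of the longest run seen so far
    i = 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1                # advance j past the run that starts at i
        run = j - i
        if run > best:
            best = run
            modes = [s[i]]        # a longer run replaces earlier candidates
        elif run == best:
            modes.append(s[i])    # a tie adds another mode
        i = j
    return modes


# Test cases: empty input, one value, a single mode, ties, all values equal.
assert modes_sorted([]) == []
assert modes_sorted([7]) == [7]
assert modes_sorted([1, 2, 2, 3]) == [2]
assert modes_sorted([4, 1, 4, 1, 3]) == [1, 4]
assert modes_sorted([5, 5, 5]) == [5]

# Experimental timing on random test scores.
data = [random.randint(0, 100) for _ in range(100000)]
start = time.perf_counter()
modes_sorted(data)
elapsed = time.perf_counter() - start
print(f"sort-and-scan on {len(data)} values: {elapsed:.4f} seconds")
```

On the theoretical side, the sort dominates the cost of this version,
so it runs in O(n log n) time, while a counting approach makes a single
pass over the data.
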
Discussion of Variations
------------------------
- Frequency of occurrence:
- What are the ten most frequently occurring values? What are the
top ten percent most frequent values?
- Output the occurrences for each value.
- Clusters / histograms:
- Test scores in each range of 10
- Quantiles: bottom 25% of scores, median, top 25%
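
A minimal sketch of these variations using the standard library is
shown below. The helper names and the sample scores are made up for
this example, and ``statistics.quantiles`` requires Python 3.8 or
later.

```python
from collections import Counter
import statistics


def most_frequent(values, k):
    """Return the k most frequent values as (value, count) pairs."""
    return Counter(values).most_common(k)


def counts_by_range(scores, width=10):
    """Count scores per range, keyed by the low end of each range."""
    return Counter((s // width) * width for s in scores)


def quartiles(values):
    """Return the cut points for the bottom 25%, the median, and the top 25%."""
    return statistics.quantiles(values, n=4)


scores = [88, 92, 75, 88, 61, 92, 88, 79, 95, 61, 70, 83]

print(most_frequent(scores, 3))
for low, count in sorted(counts_by_range(scores).items()):
    print(f"{low}-{low + 9}: {count}")
print(quartiles(scores))
```
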
Practice Problems
-----------------
- These will be generated after class based on how the discussion
progresses.
Summary of Problem Solving Steps
--------------------------------
#. Understand the problem: play with examples, ask (yourself) questions,
consider variations.
#. Think about the core approach, algorithm(s) and data structure(s).
- Several choices are often possible
- The choice dictates the details of what follows.
#. Narrow choices based on considerations of efficiency, clarity and
ease of implementation.
#. Gradually progress from a sketch of your ideas, to a detailed
algorithm, to an implementation.
#. Generate test cases
#. Test and debug