Lecture 24 — Problem Solving and Design
=========================================

Overview
--------

This is the last "official" set of lecture notes. Material from these
notes will be on the final.

- Design:

  - Choice of container/data structure; choice of algorithm

- Implementation
- Testing
- Debugging
- We will discuss these in the context of several variations on one
  problem:

  - Finding the mode in a sequence of values: the value (or values)
    occurring most often.

- Our discussion is loosely based on Chapter 12, but there are many
  things in this chapter we will skip:

  - Discussion of functions, default parameters, variable numbers of
    arguments
  - Exceptions
  - Specific design patterns

- We will start with a completely blank slate so that the whole process
  unfolds from scratch. This includes looking for other code to adapt.

Problem: Finding the Mode
-------------------------

- Given a series of values, find the one that occurs most often.
- Variation 1: is there a limited, indexable range of values?

  - Examples that are consistent with this variation include test
    scores or letters of the alphabet
  - Examples not consistent include counting words and counting amino
    acids

- Variation 2: do we want just the modes or do we want to know how many
  times each value occurs?
- Variation 3: do we want a histogram where values are grouped?

  - Example: ocean temperature measurements, pixel intensities, income
    values.
  - In each of these cases, the number of occurrences of a specific
    value, such as an ocean temperature of 2.314C, is not really of
    interest. More important is the number of temperature values in
    certain ranges.

Our Focus: A Sequence of Numbers
--------------------------------

- Integers, such as test scores
- Floats, such as temperature measurements

Sequence of Discussion
----------------------

- Brainstorm ideas for the basic approach. We’ll come up with at least
  three.
- Algorithm / implementation
- Testing

  - Generate test cases
  - Which test cases we generate will depend on the choice of
    algorithm. We will combine them.

- Debugging:

  - If we find a failed test case, we will need to find the error and
    fix it.
  - Use a combination of carefully reading the code, working with a
    debugger, and adding print statements.

- Evaluation:

  - Theoretical
  - Experimental timing

Discussion of Variations
------------------------

- Frequency of occurrence:

  - What are the ten most frequently occurring values? What are the top
    ten percent most frequent values?
  - Output the occurrences for each value.

- Clusters / histograms:

  - Test scores in each range of 10

- Quantiles: bottom 25% of scores, median, top 25%

Practice Problems
-----------------

- These will be generated after class based on how the discussion
  progresses.

Summary of Problem Solving Steps
--------------------------------

#. Understand the problem: play with examples, ask (yourself)
   questions, consider variations.
#. Think about the core approach, algorithm(s) and data structure(s).

   - Several choices are often possible
   - The choice dictates the details of what follows.

#. Narrow choices based on considerations of efficiency, clarity and
   ease of implementation.
#. Gradually progress from a sketch of your ideas, to a detailed
   algorithm, to an implementation.
#. Generate test cases
#. Test and debug
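
Sketch: Possible Approaches to the Mode Problem
-----------------------------------------------

The notes leave the actual approaches to the in-class brainstorm, so the
following is only a minimal sketch of three approaches one might arrive
at, plus the grouped histogram from Variation 3. It is not the lecture's
official solution. All function names (``modes_by_dict``,
``modes_by_sorting``, ``modes_by_index``, ``histogram``) and the
``max_value`` and ``width`` parameters are hypothetical, chosen for
illustration. ``modes_by_index`` assumes non-negative integers no larger
than ``max_value``.

```python
from collections import Counter


def modes_by_dict(values):
    """Count each value in a dictionary, then keep every value tied for the top count."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


def modes_by_sorting(values):
    """Sort so equal values are adjacent, then scan for the longest runs."""
    ordered = sorted(values)
    best, best_len = [], 0
    i = 0
    while i < len(ordered):
        j = i
        # Advance j to the end of the run of values equal to ordered[i].
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        run = j - i
        if run > best_len:
            best, best_len = [ordered[i]], run
        elif run == best_len:
            best.append(ordered[i])
        i = j
    return best


def modes_by_index(values, max_value):
    """Variation 1: use each value as an index into a list of counts.

    Assumption: every value is an integer in the range 0..max_value.
    """
    counts = [0] * (max_value + 1)
    for v in values:
        counts[v] += 1
    top = max(counts)
    if top == 0:
        return []
    return [v for v, c in enumerate(counts) if c == top]


def histogram(values, width):
    """Variation 3: count values in ranges [k*width, (k+1)*width).

    Each key of the result is the lower bound of its range.
    """
    bins = Counter()
    for v in values:
        bins[int(v // width) * width] += 1
    return dict(sorted(bins.items()))
```

The dictionary version works for any hashable values, the sorting
version needs only comparison, and the index version applies only when
the values form a limited, indexable range. For float measurements,
exact repeats are rare, which is why grouping into ranges (as in
``histogram``) is usually the more meaningful question. Comparing the
three mode functions on the same inputs, including an empty list and a
list with ties, is one way to build the combined test set described
above.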