Lecture 10 — Lists Part 2

Topics

  • List aliasing, lists and functions
  • For loops to operate on lists
  • Slicing to create copies of lists and to create sublists
  • Converting back and forth between strings and lists

List Aliasing

  • Consider the following example Python code:

    >>> L1 = [ 'RPI', 'WPI', 'MIT' ]
    >>> L2 = L1
    >>> L3 = [ 'RPI', 'WPI', 'MIT' ]
    >>> L2.append( 'RIT' )
    >>> L2[1] = 'CalTech'
    >>> L1
    ['RPI', 'CalTech', 'MIT', 'RIT']
    >>> L2
    ['RPI', 'CalTech', 'MIT', 'RIT']
    >>> L3
    ['RPI', 'WPI', 'MIT']
    
  • Surprised? This is caused by the creation of what we call an alias in computer science:

    • L1 and L2 reference the same list - they are aliases of each other and the underlying list - so changes made using either name change the underlying list
    • L3 references a different list that just happens to have the same string values in the same order: there would have been no confusion if its values had been different.
    • We’ll use our memory model for lists to understand what is happening here.
  • Python uses aliases for reasons of efficiency: lists can be quite long and are frequently changed, so copying of entire lists is expensive

  • This is true for other container data types as well.

    • Assignments create an alias for images, lists, tuples, strings and, as we will see later, sets and dictionaries
      • Strings and tuples are special because they are immutable: they can’t be changed in place, only replaced entirely, so aliasing them causes no surprises.
  • If we truly want to copy a list rather than alias it, we should use one of the copying operations discussed below: slicing or the list() function.
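
  • One way to check the aliasing described above is Python's is operator, which these notes do not otherwise cover: it tests whether two names reference the same underlying list, while == only compares contents. A minimal sketch reusing the lists from the example (the result variable names are chosen for illustration only):

```python
L1 = ['RPI', 'WPI', 'MIT']
L2 = L1                       # L2 is an alias: no new list is created
L3 = ['RPI', 'WPI', 'MIT']    # L3 is a separate list with equal contents

l2_is_alias = L1 is L2        # True: both names reference one list
l3_is_alias = L1 is L3        # False: two different lists
l3_is_equal = L1 == L3        # True: same values in the same order

L2.append('RIT')              # the change is visible through L1 as well
print(L1)                     # ['RPI', 'WPI', 'MIT', 'RIT']
print(L3)                     # ['RPI', 'WPI', 'MIT']
```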

Aliasing and Function Parameters

  • When numbers and booleans are passed to functions, the function effectively works with a copy of the value, so the caller’s variables are unchanged:

    def add_two(val1, val2):
        val1 += val2
        return val1
    
    val1 = 10
    val2 = 15
    print val1, val2
    print add_two(val1,val2)
    print val1, val2
    
  • When lists are passed to functions, the parameter becomes an alias for the argument in the function call.

    • Formally in computer science, this is known as pass by reference.
  • Here is an example of a function that returns a list containing the two smallest values in its input list:

    def smallest_two(mylist):
         mylist.sort()
         newlist = []
         if len(mylist) > 0:
             newlist.append(mylist[0])
             if len(mylist) > 1:
                 newlist.append(mylist[1])
         return newlist
    
    values = [35, 34, 20, 40, 60, 30]
    
    print "Before function:", values
    print "Result of function:", smallest_two(values)
    print "After function:", values
    
  • In class we will discuss what happened
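
  • Running the code above shows that values is sorted after the call, because mylist.sort() changes the one list that both names reference. If that side effect is unwanted, one possible variation copies the list with list() before sorting. This is only a sketch, not necessarily the version discussed in class, and the function name smallest_two_copy is made up for illustration:

```python
def smallest_two_copy(mylist):
    ordered = list(mylist)    # copy, so the caller's list is left alone
    ordered.sort()            # sorts only the copy
    return ordered[:2]        # slice: at most the first two values

values = [35, 34, 20, 40, 60, 30]
result = smallest_two_copy(values)
print(result)                 # [20, 30]
print(values)                 # [35, 34, 20, 40, 60, 30]
```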

What Operations Change a List? What Operations Create New Lists?

  • Operations that change lists include
    • sort, insert, append, pop, remove
  • Operations that create new lists
    • Slicing (discussed below), concatenation (+), replication (*) and list()
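
  • The difference between the two groups shows up most clearly when an alias is involved. A small sketch with made-up values, in which sort changes the shared list while + builds a new one:

```python
original = [3, 1, 2]
alias = original

alias.sort()                  # in place: original is sorted too
combined = original + [4]     # concatenation builds a new list
combined.append(5)            # changes only the new list

print(original)               # [1, 2, 3]
print(combined)               # [1, 2, 3, 4, 5]
```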

Part 1 Exercises

  1. What is the output of the following code?

    L1 = [ 'Kentucky', 'Vermont', 'New York' ]
    L2 = L1
    L1.append( 'Ohio' )
    L1 = [ 'Kentucky', 'Ohio' ]
    L3 = L1
    L1.pop()
    print L1
    print L2
    print L3
    
  2. Write a function to capitalize all the names in a list passed as an argument by returning a new list that contains the capitalized values. Does the original list change?

  3. Show the output of the following two for loops and explain the difference (in terms of list aliasing)

    mylist = [2,8,11]
    for item in mylist:
        item *= 2
    print mylist
    
    mylist2 = [ [2], [8], [11] ]
    for item in mylist2:
        item[0] *= 2
    print mylist2
    

Part 2: For Loops and Operations on List Items

  • A common operation in programs is to go through every element in a list using a loop. A while loop can be used for this, but a Python for loop simplifies the operation significantly.

  • Let’s first go through all the animals in a list, capitalize each name, and print the resulting list.

    animals = ['cat', 'monkey', 'hawk', 'tiger', 'parrot']
    all_animals = []    # a list, since strings have no append method
    for animal in animals:
        all_animals.append( animal.capitalize() )
    print all_animals
    
  • We can understand what is happening by looking at this piece-by-piece:

    • The keyword for signals the start of a loop
    • animal is a loop variable that takes on the value of each item in the list (as indicated by the keyword in) in succession
      • This is called iterating over the values/elements of the list
    • The : signals the start of a block of code that is the “body of the loop”, executed once in succession for each value that animal is assigned
    • The body of the loop here is just a single, indented line of code, but in other cases there may be many more lines of code.
    • The end of the loop body is indicated by returning to the same level of the indentation as the for ... line that started the loop.
  • A word of caution: assigning to the loop variable does not change the list, so do not iterate over list elements with a for loop in order to change them, unless the elements are containers themselves.

Using range

  • So, based on what we know so far, if we wanted to capitalize all of the strings in a list without creating a new list, we would need to use a while loop with a separate index variable, often named i or j:

    animals = ['cat', 'monkey', 'hawk', 'tiger', 'parrot']
    i = 0
    while i < len(animals):
        animals[i] = animals[i].capitalize()
        i += 1
    print animals
    
  • We can simplify this a bit using a range to create a list of indices:

    >>> range(len(animals))
    [0, 1, 2, 3, 4]
    

    and then using a for loop that iterates over the values in this list:

    animals = ['cat', 'monkey', 'hawk', 'tiger', 'parrot']
    for i in range(len(animals)):
        animals[i] = animals[i].capitalize()
    print animals
    
  • There is no need to write code to compare our index / counter directly against the bound and no need to write code to increment its value.

  • This use of range to generate an index list is common

    • When we want to change the integer, float or string values of a list.
    • When we want to work with multiple lists at once.
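
  • As a sketch of the second case, a single index from range lets one loop read matching positions in two lists. The lists below are hypothetical data invented for this illustration, and the code assumes they have the same length:

```python
first_half = [10, 20, 30]     # hypothetical values
second_half = [1, 2, 3]       # hypothetical values, same length

totals = []
for i in range(len(first_half)):
    # index i selects the matching entry from each list
    totals.append(first_half[i] + second_half[i])

print(totals)                 # [11, 22, 33]
```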

Part 2 Exercises

  1. Recall our list

    co2_levels = [ 320.03, 322.16, 328.07, 333.91, 341.47, \
      348.92, 357.29, 363.77, 371.51, 382.47, 392.95 ]
    

    For the purpose of this exercise only, please pretend the Python sum function does not exist, and then write a short section of Python code to first compute and then print the sum of the values in the co2_levels list. You do not need to use indexing.

  2. Write a Python function that is passed the co2_levels variable as an argument, and returns the number of values that are greater than the average value. For this you may use Python’s sum and len functions as part of your solution. Again, you do not need to use indexing.

  3. Suppose we discovered that the measurement of CO2 values was uniformly too low by a small fraction p. Write a function that increases each value in co2_levels by the fraction p. For this problem you need to use a range and indexing.

Using Indices to “Slice” a List and Create a New List

  • Recall

    >>> co2_levels = [ 320.03, 322.16, 328.07, 333.91, 341.47,
          348.92, 357.29, 363.77, 371.51, 382.47, 392.95 ]
    
  • Now suppose we just want the values at indices 2, 3 and 4 of this in a new list:

    >>> three_values = co2_levels[2:5]
    >>> three_values
    [328.07, 333.91, 341.47]
    >>> co2_levels
    [ 320.03, 322.16, 328.07, 333.91, 341.47, 348.92, 357.29, 363.77,
       371.51, 382.47, 392.95 ]
    
  • We give the first index and one more than the last index we want

  • If we leave off the first index, 0 is assumed, and if we leave off the last index, the length of the list is assumed.

  • Negative indices are allowed: a negative index i is converted to the positive index len(L) + i. Some examples:

    >>> L1
    ['cat', 'dog', 'hawk', 'tiger', 'parrot']
    >>> L1[1:-1]
    ['dog', 'hawk', 'tiger']
    >>> L1[1:-2]
    ['dog', 'hawk']
    >>> L1[1:-4]
    []
    >>> L1[1:0]
    []
    >>> L1[1:10]
    ['dog', 'hawk', 'tiger', 'parrot']
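
  • The rules for omitted and negative indices can be tried on the same list. A short sketch (the variable names on the left are chosen for illustration):

```python
L1 = ['cat', 'dog', 'hawk', 'tiger', 'parrot']

first_two = L1[:2]     # start omitted, so 0 is assumed
from_three = L1[3:]    # end omitted, so len(L1) is assumed
last_two = L1[-2:]     # -2 is treated as len(L1) - 2, which is 3

print(first_two)       # ['cat', 'dog']
print(from_three)      # ['tiger', 'parrot']
print(last_two)        # ['tiger', 'parrot']
```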
    

More on List Slicing

  • The most general form of slicing involves three values

    L[si:ei:inc]
    

    where

    • L is the list
    • si is the start index
    • ei is the end index (the slice stops just before it)
    • inc is the increment value

    Any of the three values is optional

  • We’ll work through some examples in class to

    • Use slicing to copy an entire list
    • Use negative increments and generate a reversed list
    • Extract the even-indexed values
  • Note: L[:] returns a copy of the whole list L. This has the same effect as calling the function list(L):

    >>> L2 = L1[:]
    >>> L2[1] = 'monkey'
    >>> L1
    ['cat', 'dog', 'hawk', 'tiger', 'parrot']
    >>> L2
    ['cat', 'monkey', 'hawk', 'tiger', 'parrot']
    >>> L3 = list(L1)
    >>> L3[1] = 'turtle'
    >>> L1
    ['cat', 'dog', 'hawk', 'tiger', 'parrot']
    >>> L2
    ['cat', 'monkey', 'hawk', 'tiger', 'parrot']
    >>> L3
    ['cat', 'turtle', 'hawk', 'tiger', 'parrot']
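
  • The three uses of the general slice form listed above might look like the following. This is a sketch rather than the exact examples worked in class, and the variable names are chosen for illustration:

```python
L = ['cat', 'dog', 'hawk', 'tiger', 'parrot']

whole_copy = L[:]         # new list with the same values
reversed_copy = L[::-1]   # increment of -1 walks the list backwards
even_indexed = L[::2]     # indices 0, 2 and 4

print(reversed_copy)      # ['parrot', 'tiger', 'hawk', 'dog', 'cat']
print(even_indexed)       # ['cat', 'hawk', 'parrot']
```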
    

Concatenation and Replication

  • Concatenation:

    >>> v = [1,2,3]+[4,5]
    >>> v
    [1, 2, 3, 4, 5]
    
  • Replication:

    >>> [1]*3
    [1, 1, 1]
    
  • These are very similar to the analogous operations with strings.
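
  • For comparison, here are the string and list forms side by side. This is a small sketch with made-up values; note that the original list is left unchanged because + builds a new list:

```python
word = 'ab' + 'cd'        # string concatenation: 'abcd'
echo = 'ab' * 3           # string replication: 'ababab'

base = [1, 2, 3]
longer = base + [4, 5]    # list concatenation: [1, 2, 3, 4, 5]
zeros = [0] * 4           # list replication: [0, 0, 0, 0]

print(base)               # [1, 2, 3]
```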

Part 3 Exercises

  1. What is the output of the following?

    x = [6,5,4,3,2,1] + [7]*2
    y = x
    x[1] = y[2]
    y[2] = x[3]
    x[0] = x[1]
    print x
    
    y.sort()
    print x
    print y
    
  2. Write a command to extract values from a list L0 indexed by 0,1,4,7 and 10,11, and return a list containing only these values. No slicing is needed.

  3. Write a slicing command to extract values from a list L0 indexed by 1, 4, 7, 10, etc.

Converting Strings to Lists

  • Version 1: use the function list to create a list of the characters in the string:

    >>> s = "Hello world"
    >>> t = list(s)
    >>> print t
    ['H', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd']
    
  • Version 2: use the string split function, which breaks a string up into a list of strings based on the separator string provided as the argument.

    • The default, when no argument is given, is to split on any run of whitespace (spaces, tabs and newlines).
    • Other common separators are ',', '|' and '\t'
  • We will play with the s = "Hello world" example in class.
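
  • A few calls to try ahead of class. This is a sketch: the comma-separated string is made up for illustration, and the variable names are arbitrary:

```python
s = "Hello world"

words = s.split()                    # no argument: split on whitespace
pieces = s.split('o')                # split wherever 'o' appears
schools = "RPI,WPI,MIT".split(',')   # comma-separated values

print(words)                         # ['Hello', 'world']
print(pieces)                        # ['Hell', ' w', 'rld']
print(schools)                       # ['RPI', 'WPI', 'MIT']
```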

Converting Lists to Strings

  • What happens when we type the following?

    >>> s = "Hello world"
    >>> t = list(s)
    >>> s1 = str(t)
    

    This will not concatenate all the strings in the list (assuming they are strings). Instead, str produces the printed form of the list itself, brackets, quotes and commas included.

  • We can write a for loop to do this, but Python provides something simpler that works:

    >>> L1 = [ 'No', 'one', 'expects', 'the', 'Spanish', 'Inquisition' ]
    >>> print ''.join(L1)
    NooneexpectstheSpanishInquisition
    >>> print ' '.join(L1)
    No one expects the Spanish Inquisition
    

    Can you infer from this the role of the string that the join function is applied to?
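
  • To see the difference between str and join on the list of characters from the first example, the two results can be compared directly. A minimal sketch (the names s2 and round_trip are introduced here for illustration):

```python
s = "Hello world"
t = list(s)

s1 = str(t)          # text of the list itself, brackets and quotes included
s2 = ''.join(t)      # the characters joined back together

print(s1)            # ['H', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd']
print(s2)            # Hello world
round_trip = (s2 == s)   # True: list followed by ''.join recovers the string
```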

Indexing and Slicing Strings

  • We can index strings:

    >>> s = "Hello, world!"
    >>> print s[5]
    ,
    >>> print s[-1]
    !
    
  • We can apply all of the slicing operations to strings to create new strings:

    >>> s = "Hello, world!"
    >>> s[:len(s):2]
    'Hlo ol!'
    
  • Unlike lists, however, we cannot use indexing to replace individual characters in strings:

    >>> s[4] = 'c'
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'str' object does not support item assignment
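
  • Since a string cannot be changed in place, the usual workaround is to build a new string. Two possible approaches, sketched with the same s (the variable names are chosen for illustration):

```python
s = "Hello, world!"

# Approach 1: slice around the position and concatenate
replaced = s[:4] + 'c' + s[5:]

# Approach 2: convert to a list, change the item, and join
chars = list(s)
chars[4] = 'c'
via_list = ''.join(chars)

print(replaced)      # Hellc, world!
print(s)             # Hello, world!  (the original is unchanged)
```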
    

Part 4 Exercises

  1. Given a list

    >>> L = [ 'cat', 'dog', 'tiger' ]
    

    write a line of code to append the string 'lion'

  2. Rewrite L so that it is a list of lists, with household pets in the 0th (sub)list, zoo animals in the first.

  3. How can you append an additional list of farm animals (e.g. 'horse', 'pig' and 'cow') to L?

  4. Write code to remove 'tiger' from the sublist of zoo animals.

  5. Suppose you have the string

    >>> s = "cat |  dog  | mouse"
    

    and you’d like to have the list of strings

    >>> L = [ "cat", "dog", "mouse"]
    

    Splitting the string alone does not solve the problem. Instead, you need to use a combination of splitting and a loop that strips off the extra space characters from each string and appends it to the final result. Write this code. It should be at most 4-5 lines of Python.

Summary

  • Assignment of lists and passing of lists as parameters creates aliases of lists rather than copies.
  • We use for loops to iterate through a list to work on each entry in the list.
  • We need to combine for loops with indices generated by a range in order to change the contents of a list of integers, floats or strings. These indices are also used to work with multiple lists at once.
  • Concatenation, replication and slicing create new lists.
  • Most other list functions that modify a list do so without creating a new list: insert, sort, append, pop, etc.
  • Strings may be indexed and sliced, but indexing may not be used to change a string.
  • Conversion of a string to a list is accomplished using either list or split; conversion of a list of strings to a string uses join.

Review Exercises: What Does Python Output?

  1. Without typing into the Python interpreter, find the outputs from the following operations:

    >>> x = ['a','b','c','d', 'e']
    >>> print x
    
    >>> for item in x:
    ...    print "*%s*" %item,
    ...
    
    >>> print x[3]
    
    >>> x[3] = 3
    >>> x
    
    >>> len(x)
    
    >>> x[2]=x[1]
    >>> x
    
    >>> x[5]
    
    >>> y = x[1:4]
    
    >>> y
    
    >>> x
    
  2. What about these operations?

    >>> y = [1, 2, 3]
    
    >>> y.append('cat')
    
    >>> y
    
    >>> y.pop()
    
    >>> y
    
    
    >>> y.remove(2)
    >>> y
    
    >>> y.remove('cat')
    
    >>> z = ['cat','dog']
    >>> z.insert(1,'pig')
    >>> z.insert(0,'ant')
    >>> z
    
    >>> z.sort()
    >>> z
    
    >>> z1 = z[1:3]
    >>> z1
    
    >>> z
    
  3. Write a function that returns a list containing the smallest and largest values in the list that is passed to it as an argument, without changing the list. Can you think of several ways to do this?

    1. Using min and max
    2. Using sorting (but remember, you can’t change the original list)
    3. Using a for loop that searches for the smallest and largest values.