Lists

We’ll look at list from a number of viewpoints: semantics, literal values, operations, comparison operators, statements, built-in functions and methods.

List Semantics

A list is a container for variable length sequence of Python objects. A list is mutable, which means that items within the list can be changed. Also, items can be added to the list or removed from the list.

Since a list is a sequence, all of the common operations to sequences apply.

Sometimes we’ll see a list with a fixed number of elements, like a two-dimensional point with two elements, x and y. A fixed-length list may not be the right choice; a tuple, covered in Tuples is usually better for static sequences of elements.

A great deal of Python’s internals are list -based. The for statement, in particular, expects a sequence, and we often create a list by using the range() function. When we split a string using the split() method, we get a list of substrings.

List Literal Values

A list literal is created by surrounding objects with [] and separating the items with commas (,). A list can be created, expanded and reduced. An empty list is simply []. As with tuple, an extra comma at the end of the list is gracefully ignored.

Examples:

myList = [ 2, 3, 4, 9, 10, 11, 12 ]
history = [ ]

The elements of a list do not have to be the same type. A list can be a mixture of any Python data types, including lists, tuples, strings and numeric types.

A list permits a sophisticated kind of display called a comprehension. We’ll revisit this in some depth in List Comprehensions. As a teaser, consider the following:

>>> [ 2*i+1 for i in range(6) ]
[1, 3, 5, 7, 9, 11]

This statement creates a list using a list comprehension. A comprehension starts with a candidate list ( range(6), in this example) and derives the list values from the candidate list (using 2*i+1 in this example). A great deal of power is available in comprehensions, but we’ll save the details for a later section.

List Operations

The three standard sequence operations (+, *, []) can be performed with list, as well as other sequences like tuple and string.

The + operator creates a new list as the concatenation of the arguments.

>>> ["field"] + [2, 3, 4] + [9, 10, 11, 12]
['field', 2, 3, 4, 9, 10, 11, 12]

The * operator between list and numbers (number * list or list * number) creates a new list that is a number of repetitions of the input list.

>>> 2*["pass","don't","pass"]
['pass', "don't", 'pass', 'pass', "don't", 'pass']

The [] operator selects an character or a slice from the list. There are two forms: the single-item form and the slice form.

  • The single item format is list [ index ]. Items are numbered from 0 to len(list). Items are also numbered in reverse from -len(list) to -1.

  • The slice format is list [ start : end ]. Items from start to end -1 are chosen to create a new list as a slice of the original list; there will be end-start items in the resulting list.

    If start is omitted it is the beginning of the list (position 0).

    If end is omitted it is the end of the list (position -1).

    Yes, you can omit both (someList[:]) to make a copy of a list. This is a shallow copy: the original objects are now members of two distinct lists.

In the following example, we’ve constructed a list where each element is a tuple. Each tuple could be a pair of dice.

>>> l=[(6, 2), (5, 4), (2, 2), (1, 3), (6, 5), (1, 4)]
>>> l[2]
(2, 2)
>>> l[:3]
[(6, 2), (5, 4), (2, 2)]
>>> l[3:]
[(1, 3), (6, 5), (1, 4)]
>>> l[-1]
(1, 4)
>>> l[-3:]
[(1, 3), (6, 5), (1, 4)]

List Comparison Operations

The standard comparisons (<, <=, >, >=, ==, !=, in, not in) work exactly the same among list, tuple and string sequences. The list items are compared element by element. If the corresponding elements are the same type, ordinary comparison rules are used. If the corresponding elements are different types, the type names are compared, since there is no other rational basis for comparison.

d1= random.randrange(6)+1
d2= random.randrange(6)+1
if d1+d2 in [2, 12] + [3, 4, 9, 10, 11]:
    print "field bet wins on ", d1+d2
else:
    print "field bet loses on ", d1+d2

This will create two random numbers, simulating a roll of dice. If the number is in the list of field bets, this is printed. Note that we assemble the final list of field bets from two other list objects. In a larger application program, we might separate the different winner list instances based on different payout odds.

List Statements

There are a number of statements that have specific features related to list objects.

The Assignment Statement. The variation on the assignment statement called multiple-assignment statement also works with lists. We looked at this in Multiple Assignment Statement. Multiple variables are set by decomposing the items in the list.

>>> x, y = [ 1, "hi" ]
>>> x
1
>>> y
'hi'

This will only work of the list has a fixed and known number of elements. This is more typical when working with a tuple, which is immutable, rather than a list, which can vary in length.

The for Statement. The for statement will step though all elements of a sequence.

s= 0
for i in [2,3,5,7,11,13,17,19]:
    s += i
print "total",s

When we introduced the for statement in Iterative Processing: The for Statement, we showed the range() function; this function creates a list. We can also create a list with a literal or comprehension. We’ve looked at simple literals above. We’ll look at comprehensions below.

The del Statement. The del statement removes items from a list. For example

>>> i = range(10)
>>> del i[0], i[2], i[4], i[6]
>>> i
[1, 2, 4, 5, 7, 8]

This example reveals how the del statement works.

The i variable starts as the list [0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ].

  1. Remove i[0] and the variable is [1, 2, 3, 4, 5, 6, 7, 8, 9].
  2. Remove i[2] (the value 3) from this new list , and get [1, 2, 4, 5, 6, 7, 8, 9].
  3. Remove i[4] (the value 6) from this new list and get [1, 2, 4, 5, 7, 8, 9].
  4. Finally, remove i[6] and get [1, 2, 4, 5, 7, 8].

List Built-in Functions

The list() function creates a list out of another sequence object.

list(sequence) → list
Create a list from another sequence. This will convert tuple or str to a list.

Functions which apply to tuples, but are defined elsewhere.

  • len(). For lists, this function returns the number of items.

    >>> len( [1,1,2,3] )
    4
    >>> len( [] )
    0
    
  • max(). For lists, this function returns the maximum item.

    >>> max( [1,9973,2] )
    9973
    
  • min(). For lists, this function returns the minimum item.

  • sum(). For lists, this function sums the individual items.

    >>> sum( [1,9973,2] )
    9976
    
  • any(). For lists, Return True if there exists any item which is True.

    >>> any( [0,None,False] )
    False
    >>> any( [0,None,False,42] )
    True
    >>> any( [1,True] )
    True
    
  • all(). For lists, Return True if all items are True.

    >>> all( [0,None,False,42] )
    False
    >>> all( [1,True] )
    True
    
  • enumerate(). Iterate through the list returning 2-tuples of ( index, item ).

    In effect, this function “enumerates” all the items in a sequence: it provides a number and each element of the original sequence in a 2-tuple.

    for i, x in someList:
        print "position", i, " has value ", x
    

    Consider the following list of tuples.

    >>> a = [ ("pi",3.1415946),("e",2.718281828),("mol",6.02E23) ]
    >>> list( enumerate( a ) )
    [(0, ('pi', 3.1415945999999999)), (1, ('e', 2.7182818279999998)
    02e+23))]
    >>> for i, t in enumerate( a ):
    ...     print "item",i,"is",t
    ...
    item 0 is ('pi', 3.1415945999999999)
    item 1 is ('e', 2.7182818279999998)
    item 2 is ('mol', 6.02e+23)
    
  • sorted(). Iterate through the list in sorted order.

    >>> list( sorted( [9,1,8,2,7,3] ))
    [1, 2, 3, 7, 8, 9]
    >>> tuple( sorted( [9,1,8,2,7,3], reverse=True ))
    [9, 8, 7, 3, 2, 1]
    
  • reversed(). Iterate through the list in reverse order.

    >>> tuple( reversed( [9,1,8,2,7,3] ) )
    [3, 7, 2, 8, 1, 9]
    

The following function returns a list.

range([start], stop[, step]) → list

The arguments must be plain integers. If the step argument is omitted, it defaults to 1. If the start argument is omitted, it defaults to 0. step must not be zero (or else ValueError is raised).

The full form returns a list of plain integers [ start, start+step, start + 2 \times step, ...].

If step is positive, the last element is the largest start + i \times step < stop; ; if step is negative, the last element is the largest start + i \times step > stop.

List Methods

A list object has a number of member methods. These can be grouped arbitrarily into mutators, which change the list, transformers which create something new from the list, and and accessors, which returns a fact about a list.

The following list mutators update a list object. Generally, these do not return a value.

In the case of the pop() method, it both returns information as well as mutates the list.

list.append(object)
Update list by appending object to end of the list.
list.extend(list)
Extend list by appending list elements. Note the difference from append() , which treats the argument as a single list object.
list.insert(index, object)
Update list l by inserting object before position index. If index is greater than len(list), the object is simply appended. If index is less than zero, the object is prepended.
list.pop([index=-1]) → item
Remove and return item at index (default last, -1) in list. An exception is raised if the list is already empty.
list.remove(value)
Remove first occurrence of value from list. An exception is raised if the value is not in the list.
list.reverse()
Reverse the items of the list. This is done “in place”, it does not create a new list. There is no return value.
list.sort([key][, reverse=False])

Sort the items of the list. This is done “in place”, it does not create a new list.

If the reverse keyword parameter is provided and set to True, the tuple is sorted into descending order.

The key parameter is used when the items in the tuple aren’t simply sorted using the default comparison operators. The key function must return the fields to be compared selected from the underlying items in the tuple.

We’ll look at this in detail in Functional Programming with Collections.

The following accessor methods provide information about a list.

count(value) → integer
Return number of occurrences of value in list.
index(value) → integer
Return index of first occurrence of value in list.

Stacks and Queues. The list.append() and list.pop() functions can be used to create a standard push-down stack, or last-in-first-out (LIFO) list. The append() method places an item at the end of the list (or top of the stack), where the pop() method can remove it and return it.

>>> stack = []
>>> stack.append(1)
>>> stack.append( "word" )
>>> stack.append( ("a","2-tuple") )
>>> stack.pop()
('a', '2-tuple')
>>> stack.pop()
'word'
>>> stack.pop()
1
>>> len(stack)
0
>>> stack.pop()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: pop from empty list

The list.append() and list.pop() functions can be used to create a standard queue, or first-in-first-out (FIFO) list. The append() method places an item at the end of the queue. A call to pop(0) removes the first item from the queue and returns it.

>>> queue = []
>>> queue.append( 1 )
>>> queue.append( "word" )
>>> queue.append( ("a","2-tuple") )
>>> queue.pop(0)
1
>>> queue.pop(0)
'word'
>>> queue.pop(0)
('a', '2-tuple')
>>> len(queue)
0
>>> queue.pop(0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: pop from empty list

Using Lists as Function Parameter Defaults

It’s very, very important to note that default values must be immutable objects. Recall that numbers, strings, None, and tuple objects are immutable.

We note that lists as well as sets and dictionaries are mutable, and cannot be used as default values for function parameters.

Consider the following example of what not to do.

>>> def append2( someList=[] ):
...     someList.append(2)
...     return someList
...
>>> looks_good= []
>>> append2(looks_good)
[2]
>>> append2(looks_good)
[2, 2]
>>> looks_good
[2, 2]
>>>
>>>
>>> not_good= append2()
>>> not_good
[2]
>>> worse= append2()
>>> worse
[2, 2]
>>> not_good
[2, 2]
  1. We defined a function which has a default value that’s a mutable object. This is simple a bad programming practice in Python.

  2. We used this function with a list object, looks_good. The function updated the list object as expected.

  3. We used the function’s default value to create not_good. The function appended to an empty list and returned this new list object.

    It turns out that the function updated the mutable default value, also.

  4. When we use the function’s default value again, with worse, the function uses the updated default value and updates it again.

    Both not_good and worse are references to the same mutable object that is being updated.

To avoid this, do not use mutable values as defaults. Do this instead.

def append2( someList=None ):
    if someList is None:
        someList= []
    someList.append(2)
    return someList

This creates a fresh new mutable object as needed.

List Exercises

  1. Accumulating Distinct Values. This uses the Bounded Linear Search algorithm to locate duplicate values in a sequence. This is a powerful technique to eliminate sorting from a wide variety of summary-type reports. Failure to use this algorithm leads to excessive processing in many types of applications.

    Distinct Values of a Sequence, seq

    1. Initialize Distinct Values. Set dv \gets \operatorname{list}().

    2. Loop. For each value, v, in seq.

      We’ll use the Bounded Linear Search to see if v occurs in dv.

      1. Initialize. Set i \gets 0.

        Append v to the list dv.

      2. Search. while dv[i] \neq v: increment i.

        At this point dv[i] = v. The question is whether i = \operatorname{len}(dv) or not.

      3. New Value?. if i = \operatorname{len}(dv): v is distinct.

      4. Existing Value?. if i \neq \operatorname{len}(dv): v is a duplicate of dv[i].

        Delete dv[-1], the value we added.

    3. Result. Return array dv, which has distinct values from seq.

    You may also notice that this fancy Bounded Linear Search is suspiciously similar to the index() method function of a list. Rewrite this using uniq.index instead of the Bounded Linear Search in step 2.

    When we look the set collection, you’ll see another way to tackle this problem.

  2. Binary Search. This is not as universally useful as the Bounded Linear Search (above) because it requires the data be sorted.

    Binary Search a sorted Sequence, seq, for a target value, tgt

    1. Initialize. l, h \gets 0, \operatorname{len}(seq).

      m \gets (l + h) \div 2. This is the midpoint of the sorted sequence.

    2. Divide and Conquer. While l+1 < h \text{ \bf and } seq[m] \neq tgt.

      If tgt < seq[m]: h \gets m. Move h to the midpoint.

      If tgt > seq[m]: l \gets m+1. Move l to the midpoint.

      m \gets (l + h) \div 2. Compute a midpoint of the new, smaller sequence.

    3. Result. If tgt = seq[m]: return m

      If tgt \neq seq[m]: return -1 as a code for “not found”.

  3. Quicksort. The super-fast sort routine

    As a series of loops it is rather complex. As a recursion it is quite short. This is the same basic algorithm in the C libraries.

    Quicksort proceeds by partitioning the list into two regions: one has all of the high values, the other has all the low values. Each of these regions is then individually sorted into order using the quicksort algorithm. This means the each region will be subdivided and sorted.

    For now, we’ll sort an array of simple numbers. Later, we can generalize this to sort generic objects.

    Quicksort a List, a between elements lo and hi

    1. Partition

      1. Initialize. ls, hs \gets lo, hi. Setup for partitioning between ls and hs.

        middle \gets ( ls + hs ) \div 2.

      2. Swap To Partition. while ls < hs:

        If a[ls].key \le a[middle].key: increment ls by 1. Move the low boundary of the partitioning.

        If a[ls].key > a[middle].key: swap the values a[ls] \leftrightarrows a[middle].

        If a[hs].key \ge a[middle].key: decrement hs by 1. Move the high boundary of the partitioning.

        If a[hs].key < a[middle].key:, swap the values a[hs] \leftrightarrows a[middle].

    2. Quicksort Each Partition.

      QuickSort( a , lo, middle )

      QuickSort( a , middle+1, hi )

  4. Recursive Search. This is also a binary search: it works using a design called “divide and conquer”. Rather than search the whole list, we divide it in half and search just half the list. This version, however is defined with a recusive function instead of a loop. This can often be faster than the looping version shown above.

    Recursive Search a List, seq for a target, tgt, in the region between elements lo and hi.

    1. Empty Region? If lo+1 \ge hi: return -1 as a code for “not found”.
    2. Middle Element. m \gets ( lo + hi ) \div 2.
    3. Found? If seq[m] = tgt: return m.
    4. Lower Half? If seq[m] < tgt: return recursiveSearch ( seq, tgt, lo, m )
    5. Upper Half? If seq[m] > tgt: return recursiveSearch( seq, tgt, m+1, hi )
  5. Sieve of Eratosthenes. This is an algorithm which locates prime numbers. A prime number can only be divided evenly by 1 and itself. We locate primes by making a table of all numbers, and then crossing out the numbers which are multiples of other numbers. What is left must be prime.

    Sieve of Eratosthenes

    1. Initialize. Create a list, prime of 5000 booleans, all True, initially.

      p \gets 2.

    2. Iterate. While 2 \leq p < 5000.

      1. Find Next Prime. While \text{\bf not } prime[p] \text{ \bf and } 2 \le p < 5000:

        Increment p by 1.

      2. Remove Multiples. At this point, p is prime.

        Set k \gets p + p.

        while k < 5000.

        prime[k] \gets False.

        Increment k by p.

      3. Next p. Increment p by 1.

    3. Report. At this point, for all p, if prime [ p ] is true, p is prime.

      while 2 \leq p < 5000:

      if prime[p]: print p

    The reporting step is a “filter” operation. We’re creating a list from a source range and a filter rule. This is ideal for a list comprehension. We’ll look at these in List Comprehensions.

    Formally, we can say that the primes are the set of values defined by primes = \lbrace p \vert_{0 \leq p < 5000} \text{ \bf if } prime_p \rbrace. This formalism looks a little bit like a list comprehension.

  6. Polynomial Arithmetic. We can represent numbers as polynomials. We can represent polynomials as arrays of their coefficients. This is covered in detail in [Knuth73], section 2.2.4 algorithms A and M.

    Example: 4x^3 + 3x + 1 has the following coefficients: ( 4, 0, 3, 1 ).

    The polynomial 2x^2 - 3x - 4 is represented as ( 2, -3, -4 ).

    The sum of these is 4x^3 + 2x^2- 3; ( 4, 2, 0, -3 ).

    The product these is 8x^5 - 12x^4 - 10x^3 - 7x^2 - 15x - 4; ( 8, -12, -10, -7, -15, -4 ).

    You can apply this to large decimal numbers. In this case, x is 10, and the coefficients must all be between 0 and x-1. For example, 1987 = 1x^3 + 9x^2 + 8x + 7, \text{ when } x = 10.

    Add Polynomials, p, q

    1. Result Size. r_{sz} \gets \text{ the larger of } \operatorname{len}(p) \text{ and } \operatorname{len}(q).

    2. Pad P? If \operatorname{len}(p) < r_{sz}:

      Set p1 to a tuple of r_{sz} - \operatorname{len}(p) zeros + p.

      Else: Set p1 to p.

    3. Pad Q? If \operatorname{len}(q) < r_{sz}:

      Set q1 t a tuple of r_{sz} - \operatorname{len}(q) zeroes + q.

      Else, Set q1 to q.

    4. Add. Add matching elements from p1 and q1 to create result, r.

    5. Result. Return r as the sum of p and q.

    Multiply Polynomials, x, y

    1. Result Size. r_{sz} \gets \operatorname{len}(x) + \operatorname{len}(y).

      Initialize the result list, r, to all zeros, with a size of r_{sz}.

    2. For all elements of x. while 0 \leq i < \operatorname{len}(x):

      For all elements of y. while 0 \leq j < \operatorname{len}(y):

      Set r[i+j] = r[i+j] + x[i] \times y[j].

    3. Result. Return a tuple made from r as the product of x and y.

  7. Random Number Evaluation. Before using a new random number generator, it is wise to evaluate the degree of randomness in the numbers produced. A variety of clever algorithms look for certain types of expected distributions of numbers, pairs, triples, etc. This is one of many random number tests.

    Use random.random() to generate an array of random samples. These numbers will be uniform over the interval 0..1

    Distribution test of a sequence of random samples, U

    1. Initialize. Initialize count to a list of 10 zeros.
    2. Examine Samples. For each sample value, v, in the original set of 1000 random samples, U.
      1. Coerce Into Range. Set x \gets \lfloor v \times 10 \rfloor. Multiply by 10 and truncate and integer to get a a new value in the range 0 to 9.
      2. Count. Increment count [x] by 1.
    3. Report. We expect each count to be 1/10th of our available samples. We need to display the actual count and the % of the total. We also need to calculate the difference between the actual count and the expected count, and display this.
  8. Dutch National Flag. A challenging problem, one of the hardest in this set. This is from Edsger Dijkstra’s book, A Discipline of Programming [Dijkstra76].

    Imagine a board with a row of holes filled with red, white, and blue pegs. Develop an algorithm which will swap pegs to make three bands of red, white, and blue (like the Dutch flag). You must also satisfy this additional constraint: each peg must be examined exactly once.

    Without the additional constraint, this is a relatively simple sorting problem. The additional constraint requires that instead of a simple sort which passes over the data several times, we need a more clever sort.

    Hint: You will need four partitions in the array. Initially, every peg is in the “Unknown” partition. The other three partitions (“Red”, “White” and “Blue”) are empty. As the algorithm proceeds, pegs are swapped into the Red, White or Blue partition from the Unknown partition. When you are done, the unknown partition is reduced to zero elements, and the other three partitions have known numbers of elements.

Table Of Contents

Previous topic

Tuples

Next topic

Mappings and Dictionaries

This Page