We’ll look at list from a number of viewpoints: semantics, literal values, operations, comparison operators, statements, built-in functions and methods.
A list is a container for variable length sequence of Python objects. A list is mutable, which means that items within the list can be changed. Also, items can be added to the list or removed from the list.
Since a list is a sequence, all of the common operations to sequences apply.
Sometimes we’ll see a list with a fixed number of elements, like a two-dimensional point with two elements, x and y. A fixed-length list may not be the right choice; a tuple, covered in Tuples is usually better for static sequences of elements.
A great deal of Python’s internals are list -based. The for statement, in particular, expects a sequence, and we often create a list by using the range() function. When we split a string using the split() method, we get a list of substrings.
A list literal is created by surrounding objects with [] and separating the items with commas (,). A list can be created, expanded and reduced. An empty list is simply []. As with tuple, an extra comma at the end of the list is gracefully ignored.
Examples:
myList = [ 2, 3, 4, 9, 10, 11, 12 ]
history = [ ]
The elements of a list do not have to be the same type. A list can be a mixture of any Python data types, including lists, tuples, strings and numeric types.
A list permits a sophisticated kind of display called a comprehension. We’ll revisit this in some depth in List Comprehensions. As a teaser, consider the following:
>>> [ 2*i+1 for i in range(6) ]
[1, 3, 5, 7, 9, 11]
This statement creates a list using a list comprehension. A comprehension starts with a candidate list ( range(6), in this example) and derives the list values from the candidate list (using 2*i+1 in this example). A great deal of power is available in comprehensions, but we’ll save the details for a later section.
The three standard sequence operations (+, *, []) can be performed with list, as well as other sequences like tuple and string.
The + operator creates a new list as the concatenation of the arguments.
>>> ["field"] + [2, 3, 4] + [9, 10, 11, 12]
['field', 2, 3, 4, 9, 10, 11, 12]
The * operator between list and numbers (number * list or list * number) creates a new list that is a number of repetitions of the input list.
>>> 2*["pass","don't","pass"]
['pass', "don't", 'pass', 'pass', "don't", 'pass']
The [] operator selects an character or a slice from the list. There are two forms: the single-item form and the slice form.
The single item format is list [ index ]. Items are numbered from 0 to len(list). Items are also numbered in reverse from -len(list) to -1.
The slice format is list [ start : end ].
Items from start
to end -1 are chosen to create a new list
as a slice of the original list; there will be
items in the resulting list.
If start is omitted it is the beginning of the list (position 0).
If end is omitted it is the end of the list (position -1).
Yes, you can omit both (someList[:]) to make a copy of a list. This is a shallow copy: the original objects are now members of two distinct lists.
In the following example, we’ve constructed a list where each element is a tuple. Each tuple could be a pair of dice.
>>> l=[(6, 2), (5, 4), (2, 2), (1, 3), (6, 5), (1, 4)]
>>> l[2]
(2, 2)
>>> l[:3]
[(6, 2), (5, 4), (2, 2)]
>>> l[3:]
[(1, 3), (6, 5), (1, 4)]
>>> l[-1]
(1, 4)
>>> l[-3:]
[(1, 3), (6, 5), (1, 4)]
The standard comparisons (<, <=, >, >=, ==, !=, in, not in) work exactly the same among list, tuple and string sequences. The list items are compared element by element. If the corresponding elements are the same type, ordinary comparison rules are used. If the corresponding elements are different types, the type names are compared, since there is no other rational basis for comparison.
d1= random.randrange(6)+1
d2= random.randrange(6)+1
if d1+d2 in [2, 12] + [3, 4, 9, 10, 11]:
print "field bet wins on ", d1+d2
else:
print "field bet loses on ", d1+d2
This will create two random numbers, simulating a roll of dice. If the number is in the list of field bets, this is printed. Note that we assemble the final list of field bets from two other list objects. In a larger application program, we might separate the different winner list instances based on different payout odds.
There are a number of statements that have specific features related to list objects.
The Assignment Statement. The variation on the assignment statement called multiple-assignment statement also works with lists. We looked at this in Multiple Assignment Statement. Multiple variables are set by decomposing the items in the list.
>>> x, y = [ 1, "hi" ]
>>> x
1
>>> y
'hi'
This will only work of the list has a fixed and known number of elements. This is more typical when working with a tuple, which is immutable, rather than a list, which can vary in length.
The for Statement. The for statement will step though all elements of a sequence.
s= 0
for i in [2,3,5,7,11,13,17,19]:
s += i
print "total",s
When we introduced the for statement in Iterative Processing: The for Statement, we showed the range() function; this function creates a list. We can also create a list with a literal or comprehension. We’ve looked at simple literals above. We’ll look at comprehensions below.
The del Statement. The del statement removes items from a list. For example
>>> i = range(10)
>>> del i[0], i[2], i[4], i[6]
>>> i
[1, 2, 4, 5, 7, 8]
This example reveals how the del statement works.
The i variable starts as the list [0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ].
The list() function creates a list out of another sequence object.
Functions which apply to tuples, but are defined elsewhere.
len(). For lists, this function returns the number of items.
>>> len( [1,1,2,3] )
4
>>> len( [] )
0
max(). For lists, this function returns the maximum item.
>>> max( [1,9973,2] )
9973
min(). For lists, this function returns the minimum item.
sum(). For lists, this function sums the individual items.
>>> sum( [1,9973,2] )
9976
any(). For lists, Return True if there exists any item which is True.
>>> any( [0,None,False] )
False
>>> any( [0,None,False,42] )
True
>>> any( [1,True] )
True
all(). For lists, Return True if all items are True.
>>> all( [0,None,False,42] )
False
>>> all( [1,True] )
True
enumerate(). Iterate through the list returning 2-tuples of ( index, item ).
In effect, this function “enumerates” all the items in a sequence: it provides a number and each element of the original sequence in a 2-tuple.
for i, x in someList:
print "position", i, " has value ", x
Consider the following list of tuples.
>>> a = [ ("pi",3.1415946),("e",2.718281828),("mol",6.02E23) ]
>>> list( enumerate( a ) )
[(0, ('pi', 3.1415945999999999)), (1, ('e', 2.7182818279999998)
02e+23))]
>>> for i, t in enumerate( a ):
... print "item",i,"is",t
...
item 0 is ('pi', 3.1415945999999999)
item 1 is ('e', 2.7182818279999998)
item 2 is ('mol', 6.02e+23)
sorted(). Iterate through the list in sorted order.
>>> list( sorted( [9,1,8,2,7,3] ))
[1, 2, 3, 7, 8, 9]
>>> tuple( sorted( [9,1,8,2,7,3], reverse=True ))
[9, 8, 7, 3, 2, 1]
reversed(). Iterate through the list in reverse order.
>>> tuple( reversed( [9,1,8,2,7,3] ) )
[3, 7, 2, 8, 1, 9]
The following function returns a list.
The arguments must be plain integers. If the step argument is omitted, it defaults to 1. If the start argument is omitted, it defaults to 0. step must not be zero (or else ValueError is raised).
The full form returns a list of plain integers [ start,
,
, ...].
If step is positive, the last element is the largest
;
; if step is negative, the last element is the largest
.
A list object has a number of member methods. These can be grouped arbitrarily into mutators, which change the list, transformers which create something new from the list, and and accessors, which returns a fact about a list.
The following list mutators update a list object. Generally, these do not return a value.
In the case of the pop() method, it both returns information as well as mutates the list.
Sort the items of the list. This is done “in place”, it does not create a new list.
If the reverse keyword parameter is provided and set to True, the tuple is sorted into descending order.
The key parameter is used when the items in the tuple aren’t simply sorted using the default comparison operators. The key function must return the fields to be compared selected from the underlying items in the tuple.
We’ll look at this in detail in Functional Programming with Collections.
The following accessor methods provide information about a list.
Stacks and Queues. The list.append() and list.pop() functions can be used to create a standard push-down stack, or last-in-first-out (LIFO) list. The append() method places an item at the end of the list (or top of the stack), where the pop() method can remove it and return it.
>>> stack = []
>>> stack.append(1)
>>> stack.append( "word" )
>>> stack.append( ("a","2-tuple") )
>>> stack.pop()
('a', '2-tuple')
>>> stack.pop()
'word'
>>> stack.pop()
1
>>> len(stack)
0
>>> stack.pop()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: pop from empty list
The list.append() and list.pop() functions can be used to create a standard queue, or first-in-first-out (FIFO) list. The append() method places an item at the end of the queue. A call to pop(0) removes the first item from the queue and returns it.
>>> queue = []
>>> queue.append( 1 )
>>> queue.append( "word" )
>>> queue.append( ("a","2-tuple") )
>>> queue.pop(0)
1
>>> queue.pop(0)
'word'
>>> queue.pop(0)
('a', '2-tuple')
>>> len(queue)
0
>>> queue.pop(0)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: pop from empty list
It’s very, very important to note that default values must be immutable objects. Recall that numbers, strings, None, and tuple objects are immutable.
We note that lists as well as sets and dictionaries are mutable, and cannot be used as default values for function parameters.
Consider the following example of what not to do.
>>> def append2( someList=[] ):
... someList.append(2)
... return someList
...
>>> looks_good= []
>>> append2(looks_good)
[2]
>>> append2(looks_good)
[2, 2]
>>> looks_good
[2, 2]
>>>
>>>
>>> not_good= append2()
>>> not_good
[2]
>>> worse= append2()
>>> worse
[2, 2]
>>> not_good
[2, 2]
We defined a function which has a default value that’s a mutable object. This is simple a bad programming practice in Python.
We used this function with a list object, looks_good. The function updated the list object as expected.
We used the function’s default value to create not_good. The function appended to an empty list and returned this new list object.
It turns out that the function updated the mutable default value, also.
When we use the function’s default value again, with worse, the function uses the updated default value and updates it again.
Both not_good and worse are references to the same mutable object that is being updated.
To avoid this, do not use mutable values as defaults. Do this instead.
def append2( someList=None ):
if someList is None:
someList= []
someList.append(2)
return someList
This creates a fresh new mutable object as needed.
Accumulating Distinct Values. This uses the Bounded Linear Search algorithm to locate duplicate values in a sequence. This is a powerful technique to eliminate sorting from a wide variety of summary-type reports. Failure to use this algorithm leads to excessive processing in many types of applications.
Distinct Values of a Sequence, seq
Initialize Distinct Values.
Set
.
Loop. For each value, v, in seq.
We’ll use the Bounded Linear Search to see if v occurs in dv.
Initialize.
Set
.
Append v to the list dv.
Search.
while
: increment i.
At this point
.
The question is whether
or not.
New Value?.
if
: v is distinct.
Existing Value?.
if
: v is a duplicate of
.
Delete
, the value we added.
Result. Return array dv, which has distinct values from seq.
You may also notice that this fancy Bounded Linear Search is suspiciously similar to the index() method function of a list. Rewrite this using uniq.index instead of the Bounded Linear Search in step 2.
When we look the set collection, you’ll see another way to tackle this problem.
Binary Search. This is not as universally useful as the Bounded Linear Search (above) because it requires the data be sorted.
Binary Search a sorted Sequence, seq, for a target value, tgt
Initialize.
.
. This is the midpoint of the sorted sequence.
Divide and Conquer.
While
.
If
:
. Move h to the midpoint.
If
:
. Move l to the midpoint.
. Compute a midpoint of the new, smaller sequence.
Result.
If
: return m
If
: return -1 as a code for “not found”.
Quicksort. The super-fast sort routine
As a series of loops it is rather complex. As a recursion it is quite short. This is the same basic algorithm in the C libraries.
Quicksort proceeds by partitioning the list into two regions: one has all of the high values, the other has all the low values. Each of these regions is then individually sorted into order using the quicksort algorithm. This means the each region will be subdivided and sorted.
For now, we’ll sort an array of simple numbers. Later, we can generalize this to sort generic objects.
Quicksort a List, a between elements lo and hi
Partition
Initialize.
.
Setup for partitioning between ls and hs.
.
Swap To Partition.
while
:
If
: increment ls by 1. Move the low boundary of the partitioning.
If
: swap the values
.
If
: decrement hs by 1. Move the high boundary of the partitioning.
If
:, swap the values
.
Quicksort Each Partition.
QuickSort( a , lo, middle )
QuickSort( a , middle+1, hi )
Recursive Search. This is also a binary search: it works using a design called “divide and conquer”. Rather than search the whole list, we divide it in half and search just half the list. This version, however is defined with a recusive function instead of a loop. This can often be faster than the looping version shown above.
Recursive Search a List, seq for a target, tgt, in the region between elements lo and hi.
: return -1 as a code for “not found”.
.
: return m.
: return
recursiveSearch ( seq, tgt, lo, m )
: return
recursiveSearch( seq, tgt, m+1, hi )Sieve of Eratosthenes. This is an algorithm which locates prime numbers. A prime number can only be divided evenly by 1 and itself. We locate primes by making a table of all numbers, and then crossing out the numbers which are multiples of other numbers. What is left must be prime.
Sieve of Eratosthenes
Initialize. Create a list, prime of 5000 booleans, all True, initially.
.
Iterate.
While
.
Find Next Prime.
While
:
Increment p by 1.
Remove Multiples. At this point, p is prime.
Set
.
while
.
.
Increment k by p.
Next p. Increment p by 1.
Report. At this point, for all p, if prime [ p ] is true, p is prime.
while
:
if
: print p
The reporting step is a “filter” operation. We’re creating a list from a source range and a filter rule. This is ideal for a list comprehension. We’ll look at these in List Comprehensions.
Formally, we can say that the primes are the set of values defined by
.
This formalism looks a little bit like a list comprehension.
Polynomial Arithmetic. We can represent numbers as polynomials. We can represent polynomials as arrays of their coefficients. This is covered in detail in [Knuth73], section 2.2.4 algorithms A and M.
Example:
has the
following coefficients: ( 4, 0, 3, 1 ).
The polynomial
is represented as ( 2, -3, -4 ).
The sum of these is
; ( 4, 2, 0, -3 ).
The product these is
;
( 8, -12, -10, -7, -15, -4 ).
You can apply this to large decimal numbers. In this case,
x is 10, and the
coefficients must all be between 0 and x-1. For
example,
.
Add Polynomials, p, q
Result Size.
.
Pad P?
If
:
Set p1 to a tuple of
zeros + p.
Else: Set p1 to p.
Pad Q?
If
:
Set q1 t a tuple of
zeroes + q.
Else, Set q1 to q.
Add. Add matching elements from p1 and q1 to create result, r.
Result. Return r as the sum of p and q.
Multiply Polynomials, x, y
Result Size.
.
Initialize the result list, r, to all zeros, with a size of
.
For all elements of x.
while
:
For all elements of y. while
:
Set
.
Result. Return a tuple made from r as the product of x and y.
Random Number Evaluation. Before using a new random number generator, it is wise to evaluate the degree of randomness in the numbers produced. A variety of clever algorithms look for certain types of expected distributions of numbers, pairs, triples, etc. This is one of many random number tests.
Use random.random() to generate an array of random samples. These numbers will be uniform over the interval 0..1
Distribution test of a sequence of random samples, U
. Multiply by 10 and truncate and integer to get a
a new value in the range 0 to 9.Dutch National Flag. A challenging problem, one of the hardest in this set. This is from Edsger Dijkstra’s book, A Discipline of Programming [Dijkstra76].
Imagine a board with a row of holes filled with red, white, and blue pegs. Develop an algorithm which will swap pegs to make three bands of red, white, and blue (like the Dutch flag). You must also satisfy this additional constraint: each peg must be examined exactly once.
Without the additional constraint, this is a relatively simple sorting problem. The additional constraint requires that instead of a simple sort which passes over the data several times, we need a more clever sort.
Hint: You will need four partitions in the array. Initially, every peg is in the “Unknown” partition. The other three partitions (“Red”, “White” and “Blue”) are empty. As the algorithm proceeds, pegs are swapped into the Red, White or Blue partition from the Unknown partition. When you are done, the unknown partition is reduced to zero elements, and the other three partitions have known numbers of elements.