When we use an operator, like + or *, what happens depends on the types of the objects involved. When we say c*2, the value depends on the type of c . If c is numeric, then 2 may have to be converted to the same type of number as c, and the answer will be a number. If, however, c is a sequence, the result is a new sequence.
>>> c=8.0
>>> c*2
16.0
>>> c="8.0"
>>> c*2
'8.08.0'
>>> c=(8,0)
>>> c*2
(8, 0, 8, 0)
The selection of appropriate behavior is accomplished by the relatively simple mechanism of special method names within Python. Each class of objects, either built-in or created by a programmer, can provide the required special method names to create the intimate relationship between the class, the built-in functions and the mathematical operators.
We’ll look at the general principles in Semantics of Special Methods.
If you provide special methods, you can make your class behave like a built-in class. Your class can participate seamlessly with built-in Python functions like str(), len(), repr(). These are basic special methods, covered in Basic Special Methods. We’ll look at some special attributes in Special Attribute Names.
Your class can also participate with the usual mathematical operators like + and *. We’ll look at this in Numeric Type Special Methods.
Additionally, your class could also use the collection operators in a manner similar to a dict, set, tuple or list. We’ll look at this in Collection Special Method Names, Collection Special Method Names for Iterators and Iterable, Collection Special Method Names for Sequences, Collection Special Method Names for Sets and Collection Special Method Names for Mappings.
We’ll look at a few examples in Mapping Example and Iterator Examples.
We’ll look at special names for handling attributes are handled in Attributes, Properties and Descriptors.
Additionally, we can extend built-in classes. We do this by extending some of the special methods to do additional or different things. We’ll examine this in Extending Built-In Classes.
Python has a number of language features that interact with the built-in data types. For example, objects of all built-in types can be converted to strings. You can use the built-in str() function to perform these conversions. The str() function invokes the __str__() special method of the given object. In effect, str(a) is evaluated as a.__str__().
When you create your own class, you must supply the specially-named method, __str__(), that the built-in str() function can use to successfully convert your classes values to strings. The default implementation of __str__() is pretty lame; you’ll always want to override it.
In Special Method Names we introduced a few special method names. We looked at __init__(), which is evaluated implicitly when an object is created. We looked at __str__(), which is used by the str() function and __repr__() that is used by the repr() function.
A huge number of Python features work through these special method names. When you provide appropriate special methods for your class, it behaves more like a built-in class.
You may be suspicious that the special method name __str__() matches the built-in function str(). There is no simple, obvious rule. Many of the built-in functions invoke specially-named methods of the class that are similar. The operators and other special symbols, however, can’t have a simple rule for pairing operators with special method names. You’ll have to actually read the documentation for built-in functions (Library Reference, section 2.1) and special method names (Language Reference, section 3.3) to understand all of the relationships.
Categories of Special Method Names. The special methods fall into several broad categories. The categories are defined by the kind of behavior your class should exhibit.
| Basic Object Behaviors: | |
|---|---|
A number of special method names make your object behave like oher built-in objects. These special methods make your class respond to str(), repr() and various comparison operators. This also includes methods that allow your object to respond to the hash() function, which allows instances of your class to be a key to a mapping. |
|
| Numeric Behaviors: | |
These special methods allow your class to respond to the artithmetic operators: +, -, *, /, %, **, <<, >>, and, or and not. When you implement these special methods, your class will behave like the built-in numeric types. |
|
| Container Behaviors: | |
If your new class is a container (or a collection), there are a number of methods required so that your class can behave like the built-in collection types (sequence, set, mapping). Section 3.4.5 of the Python Language Reference calls them “containers”. However, we’ll call them collections below. |
|
| Iterator Behavior: | |
An iterator has a unique protcol. The for statement requires an __iter__() method to product an iterator for an iterable object. The iterator must provide the next() method to actually do the iteration. While this isn’t tied to a collection, it’s commonly used with collections, so we’ll show this with the collection special names below. |
|
| Attribute Handling Behavior: | |
Some special methods customize how your class responds to the . operator for manpulating attributes. For example, when you evaluate object.attr. This is commonly used when attribute manipulation is more complex than simply locating an attribute that was defined by __init__(). |
|
| Function Behavior: | |
You can make your object behave like a function. When you define the method __call__(), your object is callable , and can be used as if it was a function. |
|
| Statement Interaction: | |
There are a few special methods required by statements. The for statement requires an __iter__() method to product an iterator for an iterable object. The iterator object must implement a next() method. The with statement requires __enter__() and __exit__() methods. |
|
In addition to __init__() and __str__() there are a number of methods which are appropriate for classes of all kinds.
As part of creating a class definition, Python adds a number of special attributes. These are informational in nature, and cannot not be easily be changed except by redefining the class or function, or reimporting the module.
| __class__: | This object’s class name. The class has a __name__ attribute which is the name of the class. |
|---|---|
| __module__: | The module in which the class was defined. |
| __dict__: | The dictionary which contains the object’s attributes and methods. |
| __bases__: | The base classes for this class. These are also called superclasses. |
| __doc__: | The documentation string. This is part of the response produced by the help() function. |
Here’s an example of how the class docstring is used to produce the help() results for a class.
import random
print random.Random.__doc__
help(random.Random)
When creating a new numeric data type, you must provide definitions for the essential mathematical and logical operators. When we write an expression using the usual +, -, *, /, //, %, or **, Python transforms this to method function calls.
Consider the following:
v1= MyClass(10,20)
v2= MyClass(20,40)
x = v1 + v2
In this case, Python will evaluate line 3 as if you had written:
x = v1.__add__( v2 )
Every arithmetic operator is transformed into a method function call. By defining the numeric special methods, your class willwork with the built-in arithmetic operators. There are, however, some subtleties to this.
Forward, Reverse and In-Place Method Functions. There are as many as three variant methods required to implement each operation. For example, * is implemented by __mul__(), __rmul__() and __imul__(). There are forward and reverse special methods so that you can assure that your operator is properly commutative. There is an in-place special method so that you can implement augmented assignment efficiently (see Augmented Assignment).
You don’t need to implement all three versions. If you implement just the forward version, and your program does nothing too odd or unusual, everything will work out well. The reverse name is used for special situations that involve objects of multiple classes.
Python makes two attempts to locate an appropriate method function for an operator. First, it tries a class based on the left-hand operand using the “forward” name. If no suitable special method is found, it tries the the right-hand operand, using the “reverse” name.
Additionally, we can return the special value NotImplemented to indicate that a specific version of a method function is not implemented. This will cause Python to skip this method function and move on to an alternative.
Consider the following:
v1= MyClass(10,20)
x = v1 * 14
y = 28 * v1
Both lines 2 and 3 require conversions between the built-in integer type and MyClass. For line 2, the forward name is used. The expression v1*14 is evaluated as if it was
x = v1.__mul__( 14 )
For line 3, the reverse name is used. The expression 28*v1 is evaluated as if it was
y = v1.__rmul__( 28 )
Note
Python 3 and Coercion
Historically, as Python has evolved, so have the ins and outs of argument coercion from data type to data type. We’ve omitted the real details of the coercion rules.
In Python 3.0, the older notion of type coercion and the coerce() function will be dropped altogether, so we’ll focus on the enduring features that will be preserved.
Section 3.4.8 of the Python Language Reference covers this in more detail; along with the caveat that the Python 2 rules have gotten too complex.
The Operator Algorithm. The algorithm for determing what happens with x op y is approximately as follows.
Note that a special method function can return the value NotImplemented. This indicates that the operation can’t work directly only the values, and another operation should be chosen. The rules provide for a number of alternative operations, this allows a class to be designed in a way that will cooperate successfully with potential future subclasses.
The following functions are the “forward” operations, used to implement the associated expressions.
| method function | original expression |
| __add__(other)() | self + other |
| __sub__(other)() | self - other |
| __mul__(other)() | self * other |
| __div__(other)() | self / other classical Python 2 division |
| __truediv__(other)() | self / other when from __future__ import division or Python 3 |
| __floordiv__(other)() | self // other |
| __mod__(other)() | self % other |
| __divmod__(other)() | divmod( self, other ) |
| __pow__(other [, modulo] )() | self ** other |
| __lshift__(other)() | self << other |
| __rshift__(other)() | self >> other |
| __and__(other)() | self & other |
| __or__(other)() | self | other |
| __xor__(other)() | self ^ other |
The method functions in this group are used to resolve operators by attempting them using a reversed sense.
| method function | original expression |
| __radd__(other)() | other + self |
| __rsub__(other)() | other - self |
| __rmul__(other)() | other * self |
| __rdiv__(other)() | other / self classical Python 2 division |
| __rtruediv__(other)() | other / self when from __future__ import division or Python 3 |
| __rfloordiv__(other)() | other // self |
| __rmod__(other)() | other % self |
| __rdivmod__(other)() | divmod( other, self ) |
| __rpow__(other [,modulo])() | other ** self |
| __rlshift__(other)() | other << self |
| __rrshift__(other)() | other >> self |
| __rand__(other)() | other & self |
| __ror__(other)() | other | self |
| __rxor__(other)() | other ^ self |
The method functions in this group are used to resolve operators by attempting them using an incremental sense.
| method function | original expression |
| __iadd__(other)() | self += other |
| __isub__(other)() | self -= other |
| __imul__(other)() | self *= other |
| __idiv__(other)() | self /= other classical Python 2 division |
| __itruediv__(other)() | self /= other when from __future__ import division or Python 3 |
| __ifloordiv__(other)() | self //= other |
| __imod__(other)() | self %= other |
| __ipow__(other [,modulo])() | self **= other |
| __ilshift__(other)() | self <<= other |
| __irshift__(other)() | self >>= other |
| __iand__(other)() | self &= other |
| __ior__(other)() | self |= other |
| __ixor__(other)() | self ^= other |
The method functions in the following group implement the basic unary operators.
| method function | original expression |
| __neg__() | - self |
| __pos__() | + self |
| __abs__() | abs( self ) |
| __invert__() | ~ self |
| __complex__() | complex( self ) |
| __int__() | int( self ) |
| __long__() | long( self ) |
| __float__() | float( self ) |
| __oct__() | oct( self ) |
| __hex__() | hex( self ) |
| __index__() | sequence[self], usually as part of a slicing operation which required integers |
Rational Number Example. Consider a small example of a number-like class. The Rational Numbers exercise in Classes describes the basic structure of a class to handle rational math, where every number is represented as a fraction. We’ll add some of the special methods required to make this a proper numeric type. We’ll finish this in the exercises.
class Rational( object ):
def __init__( self, num, denom= 1L ):
self.n= long(num)
self.d= long(denom)
def __add__( self, other ):
return Rational( self.n*other.d + other.n*self.d,
self.d*other.d )
def __str__( self ):
return "%d/%d" % ( self.n, self.d )
This class has enough methods defined to allow us to add fractions as follows:
>>> x = Rational( 3, 4 )
>>> y = Rational( 1, 3 )
>>> x + y
7/12
In order to complete this class, we would need to provide most of the rest of the basic special method names (there is almost never a need to provide a definition for __del__()). We would also complete the numeric special method names.
Additionally, we would have to provide correct algorithms that reduced fractions, plus an additional conversion to respond with a mixed number instead of an improper fraction. We’ll revisit this in the exercises.
Conversions From Other Types. For your class to be used successfully, your new numeric type should work in conjunction with existing Python types. You will need to use the isinstance() function to examine the arguments and make appropriate conversions.
Consider the following expressions:
x = Rational( 22, 7 )
y = x+3
z = x+0.5
Our original __add__() method assumed that the other object is a Rational object. But in this case, we’ve provided int and float values for other. Generally, numeric classes must be implemented with tests for various other data types and appropriate conversions.
We have to use the isinstance() function to perform checks like the following: isinstance( other, int ). This allows us to detect the various Python built-in types.
If the result of isinstance( other, types ) is True in any of the following cases, some type of simple conversion should be done, if possible.
complex. isinstance( other, complex ). You may want to raise an exception here, since it’s hard to see how to make rational fractions and complex numbers conformable. If this is a common situation in your application, you might need to write an even more sophisticated class that implements complex numbers as a kind of rational fraction. Another choice is to write a version of the abs() function of the complex number, which creates a proper rational fraction for the complex magnitude of the given value.
float. isinstance( other, float ). One choice is to truncate the value of other to long , using the built-in long() function and treat it as a whole number, the other choice is to determine a fraction that approximates the floating point value.
int or long. isinstance( other, (int,long) ). Any of these means that the other value is clearly the numerator of a fraction, with a denominator of 1.
string. isinstance( other, basestring ). We might try to convert the other value to a long using the built-in long() function. If the conversion fails, we could try a float. The exception that’s thrown from any of the attempted conversions will make the error obvious.
The basestring type, by the way, is the superclass for ASCII strings ( str ) and Unicode strings ( unicode ).
Rational. isinstance( other, Rational ). This indicates that the other value is an instance of our Rational class; we can do the processing as expected, knowing that the object has all the methods and attributes we need.
Here is a version of __sub__() with an example of type checking. If the other argument is an instance of the class Rational, we can perform the subtract operation. Otherwise, we attempt to convert the other argument to an instance of Rational and attempt the subtraction between two Rationals.
def __sub__( self, other ):
if isinstance(other,Rational):
return Rational( self.n*other.d - other.n*self.d, self.d*other.d )
else:
return self - Rational(long(other))
An alternative to the last line of code is the following.
return Rational( self.n-long(other)*self.d, self.d )
While this second version performs somewhat quicker, it expresses the basic rational addition algorithm twice, once in the if: suite and again in the else: suite. A principle of object oriented programming is to maximize reuse and minimize restating an algorithm. My preference is to state the algorithm exactly once and reuse it as much as possible.
Reverse Operators. In many cases, Python will reverse the two operands, and use a function like __rsub__() or __rdiv__(). For example:
def __rsub__( self, other ):
if isinstance(other,Rational):
return Rational( other.n*self.d - self.n*other.d,
self.d*other.d )
else:
return Rational(long(other)) - self
You can explore this behavior with short test programs like the following:
x = Rational( 3,4 )
print x-5
print 5-x
The various collection special method names can be organized several different ways. Above, in Semantics of Special Methods we claimed that a bunch of special method names were related to “container” and “iterator” behavior. These categories from the language reference don’t tell the whole story.
Python gives us additional tools to create classes that behave like the built-in collection classes. We can use the abstract base classes in the collections module to jump-start our definition of new types of collections.
Each abstract base class (ABC) in the collections module provides a common feature (or set of features) with the method functions that are required to implement that feature. In some cases, the features build on each other, and a number of method functions are required.
Since each of the ABC classes is abstract, they’re missing the implementation of one or more methods. To use these classes, you’ll have to provide the necessary methods.
One very important consequence of using the collections base classes is that it creates standardized names for the various features. This simplifies the assertions that might be required when checking the argument values to a function or method function.
For more information, see section 9.3.1 of the Python Library Reference, as well as PEP 3119.
Some Foundational Definitions. We’ll look at some foundational abstract classes first. Each of these defines a small group of fundamental features. We’ll use this in the next section to build more sophisticated classes.
We’ll look at the following:
To be a Container object, the class must provide the __contains__() method.
This example is a little silly, but it shows a tuple-like container that secretly adds an additional item.
class BonusContainer( collections.Container ):
def __init__( self, \*members ):
self.members= members + ( 'SecretKey', )
def __contains__( self, value ):
return value in self.members
To be a Hashable object, the class must provide the __hash__() method. This is a requirement for any object we might want to use as a key to a dictionary.
Here’s an example of a class that creates a hash, suitable for use in a dictionary.
class StockBlock( collections.Hashable ):
def __init__( self, name, price, shares ):
self.name= name
self.price= price
self.shares= shares
self._hash= hash(self.name)+hash(self.price)+hash(self.shares)
def __hash__( self ):
return self._hash
To be a Sized object, the class must provide the __len__() method.
To be a Callable object, the class must provie the __call__() method.
Functions are callable objects, but we can also define a class that creates callable objects, similar to a function definition.
This method is called when the object is used as if it were a function. We might create and use callable object with something like the following.
callable_object = MyCallable()
callable_object( argument, values )
Here’s an example of a callable class definition. We can use this to create callable objects that are – essentially – functions.
class TimesX( collections.Callable ):
def __init__( self, factor ):
self.factor= factor
def __call__( self, argument ):
return argument * self.factor
We can use this class to create callable objects as follows:
>>> times_2= TimesX(2)
>>> times_2( 5 )
10
>>> import math
>>> times_pi= TimesX(math.pi)
>>> times_pi( 3*3 )
28.274333882308138
The following collection class definitions introduce special method names to make your class respond to the iterator protocols used by the for statement.
We’ll look at defining iterable contains and iterators in depth in Collection Special Method Names for Iterators and Iterable.
To be an Iterable object, the class must provide the __iter__() method.
Generally, this will look like the following.
import collections
class MyIterable( collections.Iterable ):
# all the other methods of the MyIterable collection
def __iter__( self ):
return MyIteratorClass( self )
This means that we have a class that extends collections.Iterator which will control the iteration through the given MyIterable collection.
These two classes are sometimes called “friends”, since the the Iterator often has deeper access to the Iterable.
To be an Iterator object, the class must provide the __next__() method. Additionally, an Iterator is itself Iterable, so it must provide a __iter__() method.
Generally, the __iter__() method just does return self.
Important
Python 3
The the Iterator method name of __next__() is focused on Python 3.
For Python 2 compatability, you might want to also defined next().
def next( self ):
return self.__next__()
Note there are still more features of iterators. The Python Enhancement Proposal (PEP 342) describes some considerably more advanced features of iterators and the yield statement.
The following collection class definitions introduce special method names to make your class behave like a Sequence (tuple) or a Mutable Sequence (list).
A Sequence object (essentially a tuple) is based on three more fundamental definitions: Sized, Iterable and Container. As such, the class must define a number of methods including __contains__(), __len__(), __iter__().
A Sequence must define a __getitem__() method.
Additionally, methods like __reversed__(), index() and count() are also sensible for this class. The collections.Sequence provides defaults for these methods.
Since we’re talking about an object that’s like a tuple or frozenset, there aren’t any methods for updating or mutating the contents.
Note also, that Hashable isn’t part of a Sequence. You might want to add Hashable if your Sequence can support a fixed hash value, suitable for use as dictionary key.
This class provides a default implementation of __iter__() that is based on using a range of values and calling __getitem__().
Return the value at the given index in the Sequence.
The __getitem__() method function should be prepared for the key to be either a simple integer or a slice object. When called with an integer, it returns an element of the sequence or raises IndexError.
A slice is a simple object with three attributes: start, stop and step. When called with a slice object, it returns another Sequence.
The following examples show common slice situations.
Here’s an example of a simple class that behaves like a tuple, with some restrictions.
import collections
class Point( collections.Sequence ):
def __init__( self, x, y ):
self.x= x
self.y= y
def __len__( self ):
return 2
def __contains__( self, key ):
return self.x==key or self.y==key
def __getitem__( self, position ):
if position in ( 0, -2 ):
return self.x
elif position in ( 1, -1 ):
return self.y
else:
raise IndexError
A MutableSequence exteds a Sequence. It requires all of the Sequence methods to be defined. Plus it has some additional methods. To be mutable, it must have a way to update and remove items. The methods __setitem__(), __delitem__(), and insert() must be defined to update a MutableSequence.
There are of course, numerous additional methods that are provided by default. Any of the optional methods from Sequence, plus append(), reverse(), extend(), pop(), remove(), and __iadd__(). You can override these definitions if you want to improve their performance.
The __getitem__(), __setitem__() and __delitem__() method function should be prepared for the index to be either a simple integer, a slice object, or a tuple of slice objects.
The following collection class definitions introduce special method names to make your class behave like a Set (frozenset) or a MutableSet (set).
A Set, like a basic Sequence, is based on three more fundamental definitions: Sized, Iterable, Container. The basic collections.Set is an immutable set; it’s the basis for the built-in frozenset. A mutable set will build on this definition.
As such, the class must define a number of methods including __contains__(), __len__(), __iter__().
Since, generally, a set simply checks for membership, we don’t need too much more.
The comparison operations (__le__(), __lt__(), __eq__(), __ne__(), __gt__(), __ge__(), __and__(), __or__(), __sub__(), __xor__(), and isdisjoint()) have default definitions. You can override these for performance reasons.
A MutableSet extends a Set. It requires all of the Set methods to be defined. Plus it has some additional methods. To be mutable, it must have the methods add() and discard() to update the MutableSet. The collections.MutableSet is is the basis for the built-in set.
The collections.MutableSet provides the following method functions clear(), pop(), remove(). These are based on our supplied add() and discard()
Also, the following operators are provided so that a MutableSet can be updated with another set: __ior__(), __iand__(), __ixor__(), and __isub__().
The following collection class definitions introduce special method names to make your class behave like a Mapping or a MutableMapping (dict).
A Mapping, like a basic Sequence, is based on three more fundamental definitions: Sized, Iterable, Container. The basic collections.Mapping is the definition of an immutable mapping. A mutable mapping (like a dict) will build on this definition.
As such, the class must define a number of methods including __contains__(), __len__(), __iter__().
A Mapping must define a __getitem__() method.
Additionally, default methods will be provided for __contains__(), keys(), items(), values(), get(), __eq__(), and __ne__(). The equality test compares the list created by items() to assure that each item tuple has the same key and value.
A MutableMapping exteds a Mapping. It requires all of the Set methods to be defined. Plus it has some additional methods. To be mutable, the methods __setitem__() and __setitem__() must be defined.
Also, methods are provided for pop(), popitem(), clear(), update(), and setdefault()
Returns the value corresponding to key, or raises KeyError.
A collections.defaultdict does not raise an exception. For keys that don’t exist, this version of a MutableMapping creates a default value and then returns that.
Beyond these two base classes, there are some additional classes that help you to define a “view” that’s based on a mapping.
An object of this class is built from an existing Mapping, and behaves like a Set that contains the keys from the exsting Mapping.
The methods from Sized (__len__()) and Set (__contains__() and __iter__()) are defined.
An object of this class is built from an existing Mapping, and behaves like a Set that contains the values from the exsting Mapping.
Unlike a proper Set, however, there may appear to be multiple copies of a given value.
The methods from Sized (__len__()) and Set (__contains__() and __iter__()) are defined.
An object of this class is built from an existing Mapping, and behaves like a Set that contains a sequence of ( key, value ) tuples from the exsting Mapping.
The methods from Sized (__len__()) and Set (__contains__() and __iter__()) are defined.
Note that __contains__() checks for the presence of a ( key, value ) 2-tuple.
An immutable mapping is a kind of translation from key to value. The mapping is fixed when the object is created and cannot be updated.
Here’s an example of a small class to define an immutable mapping that we’re calling a Translation.
Note that our immutable mapping happens to have a plain old dict under the hood.
class Translation( collections.Mapping ):
def __init__( self, ** kw ):
self._map= kw
def __len__( self ):
return len(self._map)
def __contains__( self, key ):
return key in self._map
def __iter__( self ):
return iter(self._map)
def __getitem__( self, key ):
return self._map[key]
Here’s a transcript of using our Translation class to create an object that translates some names to numeric values.
>>> c = Translation( red=0, green=1, blue=2 )
>>> c['red']
0
>>> c['white']
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 11, in __getitem__
KeyError: 'white'
>>> c['black']= 3
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'Translation' object does not support item assignment
>>> for nm in c:
... print nm
...
blue
green
red
The built-in sequence types (list, tuple, string) all produce iterator objects for use by the for statement. The set, frozenset, dict and file objects also produce an iterator.
In addition to defining ordinary generator methods by using the yield statement, your classes can also produce iterator objects. This can make a program slightly simpler to read by assuring that loops are simple, obvious for statements.
Easy Iterators. When writing a collection-like class, you can simply write method functions that include the yield statement.
class Deck( object ):
def __init__( self ):
# Create the self.cards container.
def deal( self ):
self.shuffle()
for c in self.cards:
yield c
The deal() method is an iterator. We can use this iterator as follows.
d= Deck()
for card in d.deal():
print c
This is the same technique covered in Iterators and Generators, except used with method functions instead of stand-alone functions.
Unique Iterator Classes. Generally, an iterator is an object we’ve designed to help us use a more complex container. Consequently, the container will usually contain a factory method which creates iterators associated with the container. A container will implement the special method __iter__() to emit an iterator properly configured to handle the container.
When we evaluate iter( someList ), the object, someList, must return an iterator object ready to be used with a for statement. The way this works is the iter() function evaluates the __iter__() method function of the object, someList. The objects __iter__() method function creates the object to be used as an iterator. We’ll do something similar in our own classes.
In the following example classes, we’ll create a class which wraps a list and provides and a specialized iterator that yields only non-zero values of the collection.
import collections
class DataSamples( collections.Iterable, collections.Sized ):
def __init__( self, aList=None ):
self.values= [] if aList is None or aList
def __iter__( self ):
return NonZeroIter( self )
def __len__( self ):
return len( self.values )
def __getitem__( self, index ):
return self.values[index]
class NonZeroIter( collections.Iterator ):
def __init__( self, aDataSamples ):
self.ds= aDataSamples
self.pos= -1
def __next__( self ):
while self.pos+1 != len(self.ds) and self.ds[self.pos+1] == 0:
self.pos += 1
if self.pos+1 == len( self.ds ):
raise StopIteration
self.pos += 1
return self.ds[self.pos]
def next( self ):
return self.__next__()
def __iter__( self ):
return self
We can make use of this iterator as follows.
ds = DataSamples( [0,1,2,0,3,0] )
for value in ds:
print value
The for statement calls iter(ds) implicitly, which calls ds.__iter__(), which creates the NonZeroIter instance. The for statement then calls the next() method of this iterator object to get the non-zero values from the DataSamples object. When the iterator finally raises the StopIteration exception, the for statement finishes normally.
We can extend all of Python’s built-in classes. This allows us to add or modify features of the data types that come with Python. This may save us from having to build a program from scratch.
Here’s a quick example of a class which extends the built-in list class by adding a new method.
>>> class SumList( list ):
... def sum( self ):
... return sum( self )
...
>>> x= SumList( [1, 3, 4] )
>>> x
[1, 3, 4]
>>> x.sum()
8
>>> x.append( 5 )
>>> x.sum()
13
Clearly, we can extend a class like list with additiona; methods for various statistical calculations. Each function can be added as easily as the SumList.sum() function in the above example.
Here’s an exmaple of extending a dictionary with simple statistical functions. We based on dictionary on collections.defaultdict because it makes it very simple to create a frequency table. See Default Dictionaries for more information on default dictionaries.
>>> from collections import defaultdict
>>> class StatDict( defaultdict ):
... def __init__( self, valueList=None ):
... super( StatDict, self ).__init__( int, valueList )
... def sum( self ):
... return sum( k*self[k] for k in self )
...
>>> x = StatDict( [ (2,1), (3,1), (4,2) ] )
>>> x.sum()
13
>>> x[5] += 1
>>> x.sum()
18
>>> x[5] += 1
>>> x.sum()
23
Each time we increment the frequency of a numeric value, the defaultdict will add the integer count automatically.
A 2-dimensional point is a coordinate pair, an x and y value. If we limit the range to the range 0 to 2**16, we can do a few extra operations quickly.
Develop the basic routines for __init__(), __repr__(), __str__(). The __hash__() function can simply combine x and y via x<<16+y.
Develop a test routine that creates a sequence of points.
Also, be sure to develop a test that uses points as keys in a dictionary.
Finish the Rational number class by adding all of the required special methods. The Rational Numbers exercise in Classes describes the basic structure of a class to handle rational math, where every number is represented as a fraction.
Currency comes in denominations. For instance, US currency comes in $100, $50, $20, $10, $5, $1, $.50, $.25, $.10, $.05, and $.01 denominations. Parker Brothers Monopoly™ game has currency in 500, 100, 50, 20, 10, 5 and 1. Prior to 1971, English currency had £ 50, £ 20, £ 10, £ 5, £ 1, shillings (1/12 of a pound) and pence (1/20 of a shilling). An amount of money can be represented as an appropriate tuple of integers, each of which represents the specific numbers of each denomination. For instance, one representation for $12.98 in US currency is (0, 0, 0, 1, 0, 2, 0, 3, 2, 0, 3).
Each subclass of Currency has a specific mix of denominations. We might define subclasses for US currency, Monopoly currency or old English currency. These classes would differ in the list of currencies.
An object of class currency would be created with a specific mix of denominatioons. The superclass should include operations to add and subtract Currency objects. An __iadd__( currency )() method, for example would add the denominations in currency to this object’s various denominations. An __isub__( currency )() method, for example would subtract the denominations in currency to this object’s various denominations; in the event of attempting to subtract more than is available, the object would raise an exception.
Be sure to define the various conversions to float, int and long so that the total value of the collection of bills and coins can be reported easily.
An interesting problem is to translate a decimal amount into appropriate currency. Note that numbers like 0.10 don’t have a precise floating-point representation; floating point numbers are based on powers of 2, and 0.10 can only be approximated by a finite-precision binary fraction. For US currency, it’s best to work in pennies, representing $1.00 as 100.
Develop a method which will translate a given target amount, t, into an
appropriate mixture of currency denominations. In this case, we can
iterate through the denominations from largest to smallest, determining
the largest quantity, q of a denomination, d, such that
.
This version doesn’t depend on the current value of the Currency object.
More Adanced Solution. A more advanced version is to create a Currency object with a given value; this would represent the money in a cash drawer, for example. A method of this object would make an amount of money from only the available currency in the cash drawer, or raise an exception if it could not be done.
In this case, we iterate through the denominations, d, from largest to
smallest, determining the largest quantity, q, such that
, consistent
with available money in the cash drawer. If we don’t have enough of a
given denomination, it means that we will be using more of the smaller denominations.
One basic test case is to create a currency object with a large amount of money available for making change.
In the following example, we create a cash drawer with $804.55. We accept a payment of $10 as 1 $5, 4 $1, 3 $.25, 2 $.10 and 1 $.05. Then we accept a payment of $20, for a bill of $15.24, meaning we need to pay out $4.76 in change.
drawer = USCurrency( (5,2,6,5,5,5,5,5,5,5,5) )
drawer += USCurrency((0,0,0,0,1,4,0,3,2,1,0))
drawer += USCurrency((0,0,1,0,0,0,0,0,0,0,0))
drawer.payMoney( 4.76 )
Interestingly, if you have $186.91 (one of each bill and coin) you can find it almost impossible to make change. Confronted with impossible situations, this class should raise an UnableToMakeChange exception.
Create a sequence class, StatSeq that can hold a sequence of data values. This class must be a subclass of collections.MutableSequence and define all of the usual sequence operators, including __add__(), __radd__(), __iadd__(), but not __mul__().
The __init__() function should accept a sequence to initialize the collection. The various __add__() functions should append the values from a StatSeq instance as well as from any subclass of collections.Sequence (which includes list and tuple.)
Most importantly, this class shold define the usual statistical functions like mean and standard deviation, described in the exercises after Tuples, Sequence Processing Functions: map(), filter() and reduce() and Sample Class with Statistical Methods in Classes.
Since this class can be used everywhere a sequence is used, interface should match that of built-in sequences, but extra features are now readily available. For a test, something like the following should be used:
import random
samples = StatSeq( [ random.randrange(6) for i in range(100) ] )
print samples.mean()
s2= StatSeq()
for i in range(100):
ss.append( random.randrange(6) )
# Also allow s2 += [ random.randrange(6) ]
print s2.mean()
There are two approaches to this, both of which have pros and cons.
Note that the value of mean() does not have to be computed when it is requested. It is possible to simply track the changing sum of the sequence and length of the sequence during changes to the values of the sequence.
The sum and length are both set to zero by __init__(). The sum and length are incremented by every __add__(), append(), insert(). They are decremented by pop(), remove(), and and __delitem__(). Finally, there’s a two-part change for __setitem__(): the old value is deducted and the new value is added.
This way the calculation of mean is simply a division operation.
Keeping track of sums and counts can also optimize mode and standard deviation. A similar optimization for median is particularly interesting, as it requires that the sample data is retained by this class in sorted order. This means that each insert must preserve the sorted data set so that the median value can be retrieved without first sorting the entire sequence of data. You can use the bisect module to do this.
There are a number of algorithms for maintaining the data set in sorted order. You can refer to Knuth’s The Art of Computer Programming [Knuth73], and Cormen, Leiserson, Rivest Introduction to Algorithms [Cormen90] which cover this topic completely.
A chessboard can be thought of as a mapping from location names to pieces. There are two common indexing schemes from chessboards: algebraic and descriptive. In algebraic notation the locations have a rank-file address of a number and a letter. In descriptive notation the file is given by the starting piece’s file, rank and player’s color.
See Chess Game Notation for an extension to this exercise.
The algebraic description of the chess board has files from a-h going from white’s left to right. It has ranks from 1-8 going from white’s side (1) to black’s side (8). Board’s are almost always shown with position a1 in the lower left-hand corner and h8 in the upper right, white starts at the bottom of the picture and black starts at the top.
In addition to the simplified algebraic notation, there is also a descriptive notation, which reflects each player’s unique point of view. The descriptive board has a queen’s side (white’s left, files a-d) and a king’s side (white’s right, files e-h). Each rank is numbered starting from the player. White has ranks 1-8 going from white to black. Black, at the same time as ranks 1-8 going back toward white. Each of the 64 spaces on the board has two names, one from white’s point of view and one from black’s.
Translation from descriptive to algebraic is straight-forward. Given the
player’s color and a descriptive location, it can be translated to an
algebraic location. The files translate through a relatively simple
lookup to transform QR to a, QKt to b, QB to c, Q to d, K to e, KB to f,
KKt to g, KR to h. The ranks translate through a simple calculation:
white’s ranks are already in algebraic notation;
for black’s rank of r,
is the algebraic location.
Create a class to represent a chess board. You’ll need to support the special function names to make this a kind of mapping. The __getitem__() function will locate the contents of a space on the board. The __setitem__() function will place a piece at a space on the board. If the key to either function is algebraic (2 characters, lower case file from a-h and digit rank from 1-8), locate the position on the board. If the key is not algebraic, it should be translated to algebraic.
The codes for pieces include the piece name and color. Piece names are traditionally “p” or nothing for pawns, “R” for rook, “N” for knight, “B” for bishop, “Q” for queen and “K” for king. Pawns would be simply the color code “w” or “b”. Other pieces would have two-character names: “Rb” for a black rook, “Qw” for the white queen.
The __init__() method should set the board in the standard starting position:
| piece | algebraic | descriptive | piece code |
| white rooks | a1, h1 | wQR1, wKR1 | Rw |
| white knights | b1, g1 | wQKt1, wKKt1 | Nw |
| white bishop | c1, f1 | wQB1, wKB | Bw |
| white queen | d1 | wQ1 | Qw |
| white king | e1 | wK1 | Kw |
| white pawns | a2-h2 | wQR2-wKR2 | w |
| black rooks | a8, h8 | bQR1, bKR1 | Rb |
| black knights | b8, g8 | bQKt1, bKKt1 | Nb |
| black bishops | c8, f8 | bQB1, bKB1 | Bb |
| black queen | d8 | bQ1 | Qb |
| black king | e8 | bK1 | Kb |
| black pawns | a7-a7 | bQR2-bKR2 | b |
Here’s a sample five-turn game. It includes a full description of each move, and includes the abbreviated chess game notation.
white pawn from e2 to e4; K2 to K5
black pawn from e7 to e5; K2 to K5
white knight from g1 to f3; KKt1 to KB3
black pawn from d7 to d6; Q2 to Q3
white pawn from d2 to d4; Q2 to Q4
black bishop from c8 to g4; QB1 to KKt5
white pawn at d4 takes pawn at e5; Q4 to K5
black bishop at g4 takes knight at f3; KKt5 to KB6
white Q at d1 takes bishop at f3; Q1 to KB3
black pawn at d6 takes e5; Q3 to K4
The main program should be able to place and remove pieces with something like the following:
chess= Board()
# move pawn from white King 2 to King 5
chess['wK5']= chess['wK2']; chess['wK2']= ''
# move pawn from black King 2 to King 5
chess['bK5']= chess['bK2']; chess['bK2']= ''
# algebraic notation to print the board
for rank in [ '8', '7', '6', '5', '4', '3', '2', '1']:
for file in [ 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']:
print "%5s" % board[file+rank],
print
The algebraic output can be changed to the following, which some people find simpler.
for rank in ('8','7','6','5','4','3','2','1'):
print "".join(
[ "%5s" % board[file+rank]
for file in ('a','b','c','d','e','f','g','h') ] )
You should also write a move() function to simplify creating the test game. A move typically consists of the piece name, the from position, the to position, plus optional notes regarding check and pawn promotions.
When decoding a log of a chess game in Short Algebraic Notation (SAN), it is often necessary to search for a piece that made a given move. We’ll look at this problem in detail in Chess Game Notation.
There are actually a number of search algorithms, each constrained by the rules for moving a particular piece. For example, the knight makes a short “L”-shaped move and there are only 8 positions on the board from which a knight can start to end up at a given spot. The queen, on the other hand, moves horizontally, vertically or diagonally any distance, and there are as many as 24 starting positions for the queen to end up at a given spot.
This search is simplified by having iterators that know a few rules of chess and an give us a sequence of appropriate rank and file values. We’d like to be able to say something like the following.
piece, move, toPos = ( "Q", "x", "f3" )
for fromPos in aBoard.queenIter( toPos ):
if aBoard[fromPos] == 'Q':
print "Queen from", fromPos, "takes", aBoard[toPos], "at", toPos
We’ll review a few chess definitions for this problem. You can also see Chessboard Locations in Collection Special Method Names for some additional background.
The algebraic description of the chess board has files from a-h going from white’s left to right. It has ranks from 1-8 going from white’s side (1) to black’s side (8). Board’s are almost always shown with position a1 in the lower left-hand corner and h8 in the upper right, white starts at the bottom of the picture and black starts at the top.
We need the following collection of special-purpose iterators.
The kingIter() method has to enumerate the eight positions that surround the king.
The queenIter() method has to enumerate all the positions in the same rank, the same file, and on the diagonals. Each of these must be examined from the queen’s position moving toward the edge of the board. This search from the queen outward allows us to locate blocking pieces that would prevent the queen from making a particular move.
The bishopIter() method has to enumerate all the positions on the diagonals. Each of these must be examined from the bishop’s position moving toward the edge of the board.
The knightIter() method has to enumerate the eight positions that surround the knight, reflecting the knight’s peculiar move of 2 spaces on one axis and 1 space on the other axis. There are four combinations of two ranks and one file and four more combinations of two files and one rank from the ending position. As with the king, no piece can block a knight’s move, so order doesn’t matter.
The rookIter() method has to enumerate all the positions in the same rank and the same file. Each of these must be examined from the rook’s position moving toward the edge of the board.
The pawnIter() method has to enumerate a fairly complex set of positions. Most pawn moves are limited to going forward one rank in the same file.
Since we need to know which direction is forward, we need to know the color of the pawn. For white pawns, forward means the ranks increase from 2 to 8. For black pawns, then, forward means the ranks decrease from 7 down to 1.
Pawn captures involve going forward one rank in an adjacent file. Further complicating the analysis is the ability for a pawn’s first move to be two ranks instead of one.
We note that the queen’s iterator is really a combination of the bishop and the rook. We’ll look at the rook’s iterator, because it is can be adapted to be a bishop iterator, and then those two combined to create the queen iterator.
Given a starting position with a rank of r and a file of f,
we’ll need to examine all ranks starting from r and moving
toward the edge of the board. These are
.
Similarly, we need to examine all of the files starting from f
and moving toward the edge of the board. These are
.
Before doing an comparison, we need to filter the file and rank combinations to assure that they are legal positions. Additionally, we need to stop looking when we’ve encountered a piece of our own color or an opposing piece that isn’t the one we’re searching for. These intervnening pieces “block” the intentended move.