To make sense of class definitions, we’ll talk about objects in The Ubiquitous Object, and review the built-in object classes in The Built-in Classes – A Review. In Data, Processing and Philosophy – What Does It All Mean? we define the basic semantics of objects and the classes which define their structure and behavior.
Our programs create and manipulate data objects. It turns out that all of Python programming boils down to this one theme: a program creates and manipulates objects. After the previous chapters in which we worked with objects we can meaningfully define what an object is. This chapter will show how we can define our own new, unique classes of objects.
Each piece of data has properties that we use to typify or classify the data object. Each object has data and processing, which we can call the object’s attributes and operations.
In the Python language, we write some operations using operators, like + and *. We write other operations as method functions, like a String’s someString.lower() method. And some operations are functions in prefix notation like len(someString).
Under the hood, all operations are implemented by method functions. These functions have generic names but implementations which are specific to each type of data. The method that performs the * operation for a number is different from the method that performs the * for a list. 2*3 and 2*["red",21,2.7] have very different results, which depend on the type of data involved in the operation.
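As a small illustration of this idea, the following sketch calls the method function behind the * operator directly. `__mul__` is Python's special method name for *; the variable names are ours and are only for illustration.

```python
# The * operator is carried out by a method function named __mul__.
# Each class supplies its own implementation of that generic name.
number_result = 2 * 3
list_result = 2 * ["red", 21, 2.7]

# Calling the special methods directly gives the same results.
same_number = (2).__mul__(3)
same_list = ["red", 21, 2.7].__mul__(2)

print(number_result)  # 6
print(list_result)    # ['red', 21, 2.7, 'red', 21, 2.7]
```

The integer version multiplies; the list version repeats the list. The operator is the same, but the class of the object decides what happens.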
Each type of data, from a simple boolean (like True) to a complex file (created with the file() factory function), has attributes and operations.
Each object is an instance of a class. A class defines the attributes and operations of each object that is a member of the class. We’ll use the words type and class interchangeably.
A typical program will be written as a number of class definitions and a final main function. The main function’s job is to create the objects required to perform the job of the program. The program’s behavior is the result of interactions among these objects. This parallels the way that a business enterprise is the net effect of the interactions among the people who purchase materials, create products, sell the products, receive payment and manage the finances.
In Getting Our Bearings, we looked around at where we’d been and where we were going. In that section, we reviewed the basic statements and data types of Python. Since we’re rounding another mark, it’s time to get our bearings again, and see what the next leg of our course looks like.
Because it’s easiest to learn by doing, we’ve been using a number of built-in object classes. Here are the types of data we’ve seen so far.
None. A unique constant, handy as a placeholder when no other value is appropriate. A number of built-in functions return values of None. The None literal is the only instance of a special class, NoneType, that has no attributes and a very limited number of operations.
Since there’s only a single instance of None, we compare a variable against the None object with the is operator.
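A minimal sketch of this comparison follows. The function `find_first_negative` is a hypothetical example of ours, not something built in to Python.

```python
def find_first_negative(values):
    """Return the first negative number, or None if there isn't one."""
    for v in values:
        if v < 0:
            return v
    return None

result = find_first_negative([3, 1, 4])

# There is only one None object, so we test identity with "is".
if result is None:
    message = "no negative values"
else:
    message = "found %r" % (result,)
print(message)  # no negative values
```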
NotImplemented. A unique constant, returned by special methods to indicate that a method is not implemented. This allows Python to try alternative methods if they’re available. The NotImplemented literal is the only instance of a special class, NotImplementedType.
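To make this concrete, here is a sketch with a hypothetical `Meters` class whose `__add__` special method returns NotImplemented when it doesn't know how to handle the other operand. The class and its names are assumptions made up for this illustration, and the behavior shown is that of Python 3.

```python
class Meters:
    """Hypothetical class, used only to show NotImplemented."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        # We don't know how to add this; let Python try an alternative.
        return NotImplemented

total = Meters(2) + Meters(3)
print(total.value)  # 5

try:
    Meters(2) + "three"
    outcome = "added"
except TypeError:
    # No alternative method was available, so Python raised TypeError.
    outcome = "TypeError"
print(outcome)  # TypeError
```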
Numeric. The various numeric types have relatively simple, unstructured values. For obvious reasons, these are all immutable.
Boolean (bool). This type has a tiny domain with just two literal values: False and True. A number of other values are equivalent to these two values. There is also a tiny domain of operations, including and, or and not. Some other operators (like the comparisons) produce boolean result values.
Integer or Whole Numbers (int). The literal values are written as strings of digits. These values have a number of operations, including arithmetic operations, special bit-fiddling operations and comparison operations.
Long Integers (long). These are integers of arbitrary length. They grow as needed to precisely represent numeric results. The literal values are written as strings of digits ending with L. These values have a number of operations, including arithmetic operations and comparison operations.
Floating-Point or Scientific Notation (float). These are numbers coded as a fractional “mantissa” and an exponent. Scientists and engineers use powers of 10, as in 6.022 × 10^23. The Python language abbreviates the “× 10” with the letter E or e. The literal values are strings of digits (with a decimal point) and an optional E or e exponent, for example, 6.022e23.
Most computer processors use a notation based on powers of 2, so ranges and precisions vary. Typically these are called “double precision” in other languages, and are often 64 bits long. These values have a number of operations, including arithmetic operations and comparison operations.
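The following sketch shows the E notation and one visible consequence of the powers-of-2 representation. It assumes the usual 64-bit floating-point format; the variable names are ours.

```python
avogadro = 6.022e23   # 6.022 × 10^23
small = 1.5E-3        # upper-case E works too; this is 0.0015

# Both spellings of the literal produce the same float value.
same_literal = (small == 0.0015)

# Decimal fractions like 0.1 have no exact power-of-2 representation,
# so a tiny rounding error shows up in some arithmetic.
exact = (0.1 + 0.2 == 0.3)

print(same_literal)  # True
print(exact)         # False
```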
Complex (complex). These are a pair of floating-point numbers of the form (a + bj), where a is the real part and b is the “imaginary” part. These values have a number of operations, including arithmetic operations and comparison operations.
Sequence. The sequence types are collections of objects identified by their order or position, instead of a key. All sequences have a few operations to concatenate and repeat the sequence. Sequences have in and not in operations to determine if an item is part of the sequence. Additionally sequences have the [] operation which selects an item or a slice of items.
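The common sequence operations can be sketched as follows, using a string, a list and a tuple. The variable names are ours.

```python
letters = "abc"
numbers = [1, 2, 3]
pair = (7, 11)

joined = numbers + [4, 5]     # concatenate: [1, 2, 3, 4, 5]
repeated = letters * 2        # repeat: 'abcabc'
has_two = 2 in numbers        # membership: True
missing = "z" not in letters  # True
first = pair[0]               # select one item: 7
middle = joined[1:4]          # select a slice: [2, 3, 4]
```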
Set. A set is a simple collection of objects. There is no ordering or key information. This makes them very efficient. Sets have add() and remove() operations, as well as in and not in operations.
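A short sketch of these set operations, with made-up color names as the members:

```python
colors = set(["red", "black"])
colors.add("green")
colors.add("red")        # already a member, so nothing changes
colors.remove("black")

has_green = "green" in colors
no_black = "black" not in colors
print(len(colors))  # 2
```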
Mapping. A mapping is a collection of objects identified by keys instead of order.
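As a brief sketch, a dictionary is the built-in mapping. The keys and counts below (spaces of each color on an American Roulette wheel) are our own illustration.

```python
# Each value is found by its key, not by its position.
spaces = {"red": 18, "black": 18, "green": 2}

green_count = spaces["green"]
spaces["purple"] = 0            # add a new key and value
has_red = "red" in spaces       # "in" checks the keys
print(green_count)  # 2
```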
Callable. When we create a function with the def statement, we create a callable object. There are a number of attributes; for example, the __name__ and func_name attributes both have the function’s name. There is one important operation, “calling” the function. That is, performing the eval-apply cycle (see The Evaluate-Apply Rule for a review) to the function’s argument values.
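A function object can be examined like any other object. In this sketch, `double` is a hypothetical function of ours. We use only `__name__` here because `func_name` exists only in Python 2.

```python
def double(x):
    """Hypothetical function used for illustration."""
    return 2 * x

name = double.__name__          # the function's name, as a string
is_callable = callable(double)  # functions are callable objects
result = double(21)             # the one important operation: calling

print(name)    # double
print(result)  # 42
```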
File (file). Python supports several operations on files, most notably reading, writing and closing. Python also provides numerous modules for interacting with the operating system’s management of files.
Beginning in Instant Gratification : The Simplest Possible Conversation we’ve been creating, manipulating and accessing Python objects without asking the deep, philosophical question “What is an object?”
As with other real-world things, it’s easier to provide a lot of examples than it is to work up an elaborate, legalistic definition. Objects are like art: I can’t define it, but I know what I like. As hard as it is, we’ll give the definition a whirl, because it does help some people write better software.
Each object encapsulates both data and processing into a single definition. We’ll sometimes use synonyms and call these two facets structure and behavior, attributes and operations or instance variables and method functions. The choice of terms depends on how philosophical or technical we’re feeling. The structure and behavior terms are the most philosophical; the attribute and operation terms are generic object-oriented design terms. Instance variables and method functions are the specific ways that Python creates attributes and operations to reflect structure and behavior.
In Python, we can understand objects by looking at a number of features, adapted from [Rumbaugh91].
Identity. An object is unique and is distinguishable from all other objects. In the real world, two identical coffee cups occupy different locations on our desk. In the world of a computer’s memory, objects can be identified by their address. Unless we do something special, the built-in id() function gives us a hint about the memory location of an object, revealing the distinction between two objects. We can see this by doing id("abc"), id("defg"), which shows that two distinct objects were being examined.
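The difference between "equal" and "the same object" can be sketched with two lists. We use lists here rather than strings because Python is free to reuse a single object for equal string literals, which would blur the point.

```python
a = [1, 2, 3]
b = [1, 2, 3]   # equal contents, but a separate object
c = a           # a second name for the very same object

equal_values = (a == b)         # True: same contents
same_object = (a is b)          # False: two distinct objects
same_name_target = (a is c)     # True: one object, two names
ids_match = (id(a) == id(c))    # True: id() agrees with "is"
```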
State. Many objects have a state, and that state is often changeable. The object’s current state is described by its attributes, implemented as instance variables in Python.
Our two nearly identical coffee cups have distinguishing attributes. The locations (back-left corner of desk, on the mouse pad) and the ages (yesterday’s, today’s) are attributes of each cup of coffee. I can change the location attribute by moving a cup around. Even if both cups are on the back-left corner, the cups have unique identity and remain distinct. I can’t easily change the age; today’s coffee remains today’s coffee until enough time has passed that it becomes yesterday’s coffee.
In the software world, my two strings ("abc" and "defg") have different attribute values. Their lengths are different, and they respond differently to various method functions like upper() and lower().
As a special case, some objects can be stateless. While most objects have a current state, it is possible for an object to have no attributes, making it like a function. Such objects have no hysteresis – no memory of any previous actions.
Behavior. Objects have behavior. The object’s behavior is defined by its operations, or, in Python terminology, its method functions. Some objects can be termed “passive” because they are used by other objects, and don’t do much processing. Some objects can be termed “active” because they do considerable processing. These distinctions are arbitrary; some objects have both passive and active methods.
A coffee cup really only has a few behaviors: it admits additional coffee (to a limit), it stores a finite amount of coffee, and coffee can be removed. Coffee cups are passive and don’t initiate these behaviors. The coffee machine, however, is an active object. The coffee machine has a timer, and can perform its behavior of making coffee autonomously.
String objects have a large number of behaviors, defined by the method functions, many of which we looked at in Sequences of Characters : str and Unicode. All of our collection classes can be considered as passive objects.
Classification. Objects with the same attributes and behavior belong to a common class. Both of our string objects ("abc" and "defg") belong to a common class because they have the same attributes (a string of characters) and the same behavior.
Inheritance. A class can inherit operations and attributes from a parent class, reusing common features. A superclass is a generalization. A subclass overrides superclass features or adds new features, and is a specialization.
Both of our coffee cups are instances of cup, which is a subclass of a more general class, “drinking vessel”. This more general class includes other subclasses like glassware and stemware.
When we described the string data type, we put it into a broader context called sequence and emphasized the common features that all sequence types had. We also emphasized the unique features that defined the various subclasses of sequence. All of the sequence types have the [] operator to select an individual item. Only strings, however, had an upper() method function. Only lists had the append() method function.
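The coffee cup example of inheritance can be sketched as follows. The class names `DrinkingVessel`, `Cup` and `Stemware`, and their methods, are hypothetical names chosen to match the prose, not part of Python.

```python
class DrinkingVessel:
    """The general superclass: anything we can drink from."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.contents = 0

    def fill(self, amount):
        # A vessel admits additional liquid, up to its limit.
        self.contents = min(self.capacity, self.contents + amount)

    def describe(self):
        return "a drinking vessel"

class Cup(DrinkingVessel):
    """A specialization: overrides one superclass feature."""

    def describe(self):
        return "a cup with a handle"

class Stemware(DrinkingVessel):
    """Another specialization of the same superclass."""

    def describe(self):
        return "a glass with a stem"

cup = Cup(8)
cup.fill(10)  # fill() is inherited from DrinkingVessel
print(cup.contents)    # 8
print(cup.describe())  # a cup with a handle
```

Both subclasses reuse `fill()` without repeating it, and each supplies its own `describe()`.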
Polymorphism. A general operation, named in a superclass, can have different implementations in the various subclasses. We saw this when we noted that almost every class in Python has a + operation. Between two floating-point numbers, the + operation adds the numbers; between two lists, however, the + operation concatenates the lists. Because objects of these distinct classes respond to a common operator, they are polymorphic.
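A minimal sketch of polymorphism: the hypothetical function `combine` below uses only the + operator, so it works with any class that provides one.

```python
def combine(a, b):
    """Hypothetical helper: relies only on the + operation."""
    return a + b

float_sum = combine(1.5, 2.25)        # addition
list_sum = combine([1, 2], [3])       # concatenation
string_sum = combine("ab", "cd")      # concatenation

print(float_sum)   # 3.75
print(list_sum)    # [1, 2, 3]
print(string_sum)  # abcd
```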
Program Design. Up to this point in our programming career, we’ve been looking at our information needs and the available Python structures. If it was a temperature, we used a number; for the color of a space on the Roulette wheel, we used a string. In the case of something more complex, like a pair of dice, we used a function which created a tuple.
As we become more sophisticated, we begin to see that the various types of data that are built-in to Python aren’t exactly what we need. It isn’t possible to foresee all possible problems. Similarly, it isn’t possible to predict all possible kinds of data and processing that will be required to solve the unforeseeable problems. That’s why Python lets us define our own, brand-new types of data.
Class Definition. Python permits us to define our own classes of objects. This allows us to design an object that is an exact description of some part of our problem. We can design objects that reflect a pair of dice, a Roulette wheel, or the procedure for playing the game of Craps. A class definition involves a number of things.
The object’s attributes (also called instance variables) are not formally defined as part of the class. They are generally created by a special method function that is executed each time an object is created. This initialization method function is allocated responsibility for creating the object’s instance variables and assigning their initial values.
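As a rough sketch of what such a definition looks like, here is a hypothetical `Dice` class. The initialization method function is named `__init__` in Python; the class name, the `faces` instance variable and the other method names are our own choices for illustration.

```python
import random

class Dice:
    """A pair of dice (a hypothetical sketch)."""

    def __init__(self):
        # Instance variables are created here by assignment;
        # they are not declared anywhere else in the class.
        self.faces = (1, 1)

    def roll(self):
        """Give both dice new random values and return them."""
        self.faces = (random.randint(1, 6), random.randint(1, 6))
        return self.faces

    def total(self):
        """Return the sum of the two faces."""
        return self.faces[0] + self.faces[1]

d = Dice()              # __init__ runs automatically here
initial = d.total()     # 2, from the initial (1, 1)
d.roll()
rolled = d.total()      # somewhere from 2 to 12
```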
Object Creation. After we define the class, we can create instances of the class. Every object is an instance of one or more classes. Each object will have unique identity; it will have a distinct set of instance variables; it will be identified by a unique object identifier. Objects have an internal state, defined by the values assigned to the object’s instance variables. Additionally, each object has behavior based on the definitions of the method functions. An object is said to encapsulate a current state and a set of operations.
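The points about identity and state can be sketched with two instances of one class. `CoffeeCup` and its `location` instance variable are hypothetical names borrowed from the coffee cup example.

```python
class CoffeeCup:
    """Hypothetical class with one instance variable."""

    def __init__(self, location):
        self.location = location

    def move(self, new_location):
        self.location = new_location

# Two instances of the same class, created with the same initial state.
cup1 = CoffeeCup("back-left corner")
cup2 = CoffeeCup("back-left corner")

distinct = cup1 is not cup2   # True: each has its own identity

# Changing one object's state leaves the other untouched.
cup2.move("mouse pad")
print(cup1.location)  # back-left corner
print(cup2.location)  # mouse pad
```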
Because every object belongs to one or more defined classes, objects share a common definition of their attributes and methods. The class definition can also specify superclasses, which helps provide method functions. We can build a family tree of classes and share superclass definitions among a variety of closely-related subclasses.
It helps to treat each class definition as if the internal implementation details were completely opaque. A class should be considered as if it were a contract that specifies what the class does, but keeps private all of the details of how the class does it. All other objects within an application should use only the defined methods for interacting with an object. When we use a list’s append() method, we know what will happen, but we don’t know precisely how the list object adds the new item to the end of the list. Unlike Java and C++, Python has a relatively limited mechanism for formalizing this distinction between the defined interface and the private implementation of a class.
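One common way to signal this distinction is a naming convention: a leading underscore marks a name as an implementation detail. Python does not enforce it. The `Counter` class below is a hypothetical sketch of that convention.

```python
class Counter:
    """Hypothetical class: the interface is increment() and value()."""

    def __init__(self):
        # The leading underscore says "private by agreement".
        # Python will not stop other code from touching it.
        self._count = 0

    def increment(self):
        self._count += 1

    def value(self):
        return self._count

c = Counter()
c.increment()
c.increment()
print(c.value())  # 2
```

Code that sticks to `increment()` and `value()` keeps working even if we later change how the count is stored.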
Life Cycle of an Object. Each object in our program has a lifecycle. The following is typical of most objects.
Important
Class and Instance
Once we’ve defined the class, we only use the class to make individual objects. Objects – instances of a class – do the real work of our program.
When we ask a string to create an upper case version of itself ("hi mom".upper()), we are asking a specific object ("hi mom") to do the work. We don’t ask the general class definition of string to do this. The meaning of str.upper() isn’t very clear.
This can be a little mystifying when we start to define our own classes. The problem usually stems from confusing class definitions with function definitions. We don’t use instances of a function for anything, we use the function itself. Functions, consequently, are a bad model of how class definition works. Classes are a kind of factory for creating objects. Objects do the real work.
The most important examples to keep in mind are string objects, file objects and list objects. These are the most typical examples of the kinds of objects we’ll create. Each string (or file or list) object is an instance of the respective class definition.
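The difference between asking an object and asking its class can be sketched as follows. The behavior shown is what we would expect from Python 3; the variable names are ours.

```python
greeting = "hi mom"

# We ask a specific object to do the work.
shouted = greeting.upper()
print(shouted)  # HI MOM

# Asking the class, with no object at all, gives it nothing to work on.
try:
    str.upper()
    outcome = "worked"
except TypeError:
    outcome = "TypeError"
print(outcome)  # TypeError
```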
Under the hood, the definition of a class creates a new class object. This class object is used to create the instance objects that do the work of our program. The class object is mostly just a container for the suites of statements that define the method functions of a class. Additionally, a class object can also own class-level variables; these are, in effect, shared by each individual object of that class. They become a kind of semi-global variable, shared by objects of a given class.
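A class-level variable can be sketched like this. The `Die` class and its `sides` variable are hypothetical names used only for illustration.

```python
class Die:
    sides = 6   # class-level variable, owned by the class object

    def __init__(self, face):
        self.face = face   # instance variable, owned by each object

d1 = Die(1)
d2 = Die(2)

# Each instance has its own face, but both see the one shared value.
shared_before = (d1.sides, d2.sides)   # (6, 6)

# Changing the class-level variable is visible through every instance.
Die.sides = 20
shared_after = (d1.sides, d2.sides)    # (20, 20)
```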
Object Identification.
When we evaluate an expression as simple as 3+5, Python creates an integer object with a value of 3, an integer object with a value of 5, then applies a method function to add these values, and create a new object which is the sum.
Look at some of your earlier exercises in Arithmetic and Expressions and identify all of the objects in a given expression. Pay particular attention to each operator (like +, -, * or /) which will create a new object.
Since () merely group expressions, do they create new objects?
Iterator Objects.
In While We Have More To Do : The for Statement we looked at the for statement. In Basic Sequential Collections of Data we looked at how the for statement iterates through a sequence. In Looping Back : Iterators, the for statement and Generators we looked at the iterator and how the for statement makes use of this iterator. A sequence has a method function (iter) which creates the iterator which yields each item of the sequence so the for statement can assign them to a variable.
When you evaluate iter( [ 1,2,3] ), for example, you can see the iterator object being created. This iterator object has a cryptic-looking name, for example, <listiterator object at 0x107afd0>.
Each time you evaluate something like iter( [ 1,2,3] ) you get a slightly different response. Does this indicate that a new object is being created? Does this make sense? If each object can capture a unique state, does this mean each iterator is independent?
If we create a single list, for example a=range(100), can we have multiple iterators which provide different views of the same list object, a?
Temporary Objects in Functions.
When a function is being evaluated, objects will be created by each operation in each expression. What happens to those objects? Does any object persist after the function’s evaluation is complete? What happens to the object created by (or named in) the return statement?
Look at some of your earlier exercises in Organizing Programs with Function Definitions. Identify the life-span of all objects created by a specific function.
In a sense, a class is a group of related functions. However, the formal class definition allows you to do some additional things that would be inconvenient with a group of related functions.
First, the class acts like a common name for the functions, saving you from having to write elaborate prefixes on your functions. For example, if you had a group of functions to work with a block of stocks, you might prefix each name with sb_ to assure that each function name was unique. This is error-prone and tedious.
Second, the class allows easy sharing of instance variables. All of the instance variables are collected under self, assuring that they are available to each function.
Third, and more important than these technical considerations, are the mental tools of abstraction and encapsulation. When we define classes, we encapsulate some functions so we can set aside the details. This allows us to reduce the complexity of a class to a somewhat simpler abstraction, namely, the functions that interface to the class, not the details of how the class works internally. Thinking at this more abstract level lets us wrap our finite brains around slightly larger problems.
Our first programs tended to have a single, long procedure that made use of many objects. Once we’ve got the hang of programming, we can break that long procedure down into separate pieces, and assign each piece to a method function of an object. Breaking long procedures down and allocating them to separate objects is part of our divide and conquer life-style.
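The three points above can be sketched by writing the stock block example both ways. All of the names here (`sb_value`, `StockBlock`, `shares`, `price`) are hypothetical and chosen only for this comparison.

```python
# Style 1: a group of related functions with a common prefix.
# The data travels separately, here as a dictionary.
def sb_value(block):
    return block["shares"] * block["price"]

def sb_buy(block, more):
    block["shares"] += more

# Style 2: a class. The class name replaces the prefix, and the
# instance variables are collected under self.
class StockBlock:
    def __init__(self, shares, price):
        self.shares = shares
        self.price = price

    def value(self):
        return self.shares * self.price

    def buy(self, more):
        self.shares += more

old_style = {"shares": 100, "price": 2.5}
sb_buy(old_style, 100)

new_style = StockBlock(100, 2.5)
new_style.buy(100)

print(sb_value(old_style))  # 500.0
print(new_style.value())    # 500.0
```

Both versions compute the same thing. With the class, the data and the functions that use it are defined together in one place.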
As we get more proficient with object definition, we can examine a programming problem by writing a description and looking at the nouns and verbs in that description. The nouns will become objects, defined by classes. The verbs will become method functions. This collaboration among the nouns will fit naturally with the original description of the problem, giving us confidence that our program will work.