Objects: A Retrospective

To make sense of class definitions, we’ll talk about objects in The Ubiquitous Object, and review the built-in object classes in The Built-in Classes – A Review. In Data, Processing and Philosophy – What Does It All Mean? we define the basic semantics of objects and the classes which define their structure and behavior.

The Ubiquitous Object

Our programs create and manipulate data objects. It turns out that all of Python programming boils down to this one theme: a program creates and manipulates objects. After the previous chapters in which we worked with objects we can meaningfully define what an object is. This chapter will show how we can define our own new, unique classes of objects.

Each piece of data has properties that we use to typify or classify the data object. Each object has data and processing, which we can call the object’s attributes and operations.

In the Python language, we write some operations using operators, like + and *. We write other operations as method functions, like a String’s someString.lower() method. And some operations are functions in prefix notation like len(someString).

Under the hood, all operations are implemented by method functions. These functions have generic names but implementations which are specific to each type of data. The method that performs the * operation for a number is different from the method that performs the * for a list. 2*3 and 2*["red",21,2.7] have very different results, which depend on the type of data involved in the operation.
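
A short sketch may make this concrete. It assumes nothing beyond the built-in types; the variable names are illustrative only. It shows the * operator giving type-specific results, and the same results coming back when the underlying method functions (the special names `__mul__` and `__len__`) are called directly.

```python
# The same operator gives results that depend on the type of data.
number_product = 2 * 3
list_product = 2 * ["red", 21, 2.7]

print(number_product)   # 6
print(list_product)     # ['red', 21, 2.7, 'red', 21, 2.7]

# Calling the method functions directly produces the same results.
same_number = (3).__mul__(2)
same_list = ["red", 21, 2.7].__mul__(2)
print(same_number == number_product)   # True
print(same_list == list_product)       # True

# A prefix function like len() also relies on a method function.
text = "Hello"
length_by_function = len(text)
length_by_method = text.__len__()
print(length_by_function == length_by_method)   # True
```

In everyday code we write the operator or the prefix function; the direct method calls appear here only to show what sits underneath.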

Each type of data, from a simple boolean (like True) to a complex file (created with the file() factory function), has attributes and operations.

  • For the simple, unstructured types, there is only one attribute and that is the value. Booleans and numbers are these kinds of simple, unstructured types. How many attributes can the value True or 3.1415926 have?
  • A more complex type, like a list, will have a collection of items, plus attributes like the length. A collection will have method functions to access the collection. Additionally, a mutable collection may have method functions to change the collection by adding and removing elements.
  • A really complex type, like a file, has many attributes, some of which come from outside the Python environment. Attributes include a name, a modification time, a size, permissions. A file is associated with operating system resources, and a file’s operations will move data to or from external devices.

Each object is an instance of a class. A class defines the attributes and operations of each object that is a member of the class. We’ll use the word type and class interchangeably.
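
As a small illustration, the built-in type() and isinstance() functions report the class of an object. The list of sample values below is an arbitrary choice made for this sketch.

```python
# Every value, however simple, is an instance of some class.
values = [True, 3.1415926, "red", [1, 2, 3], None]

for value in values:
    print("%r is an instance of %s" % (value, type(value).__name__))

# Collect the class names so they can be compared.
class_names = [type(value).__name__ for value in values]

# isinstance() asks whether an object is a member of a given class.
is_float = isinstance(3.1415926, float)
is_list = isinstance([1, 2, 3], list)
print(is_float and is_list)   # True
```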

A typical program will be written as a number of class definitions and a final main function. The main function’s job is to create the objects required to perform the job of the program. The program’s behavior is the result of interactions among these objects. This parallels the way that a business enterprise is the net effect of the interactions among the people who purchase materials, create products, sell the products, receive payment and manage the finances.

The Built-in Classes – A Review

In Getting Our Bearings, we looked around at where we’d been and where we were going. In that section, we reviewed the basic statements and data types of Python. Since we’re rounding another mark, it’s time to get our bearings again, and see what the next leg of our course looks like.

Because it’s easiest to learn by doing, we’ve been using a number of built-in object classes. Here are the types of data we’ve seen so far.

  • None. A unique constant, handy as a placeholder when no other value is appropriate. A number of built-in functions return values of None. The None literal is the only instance of a special class, NoneType, that has no attributes and a very limited number of operations.

    Since there’s only a single instance of None, we compare a variable against the None object with the is operator.

  • NotImplemented. A unique constant, returned by special methods to indicate that a method is not implemented. This allows Python to try alternative methods if they’re available. The NotImplemented literal is the only instance of a special class, NotImplementedType.

  • Numeric. The various numeric types have relatively simple, unstructured values. For obvious reasons, these are all immutable.

    • Boolean (bool). This type has a tiny domain with just two literal values: False and True. A number of other values are equivalent to these two values. There is also a tiny domain of operations, including and, or and not. Some other operators (like the comparisons) produce boolean result values.

    • Integer or Whole Numbers (int). The literal values are written as strings of digits. These values have a number of operations, including arithmetic operations, special bit-fiddling operations and comparison operations.

    • Long Integers (long). These are integers of arbitrary length. They grow as needed to precisely represent numeric results. The literal values are written as strings of digits ending with L. These values have a number of operations, including arithmetic operations and comparison operations.

    • Floating-Point or Scientific Notation (float). These are numbers coded as a fractional “mantissa” and an exponent. Scientists and engineers use powers of 10, as in 6.022 × 10^23. The Python language abbreviates the “× 10” with the letter E or e. The literal values are strings of digits (with a decimal point) and an optional E or e exponent, for example, 6.022e23.

      Most computer processors use a notation based on powers of 2, so ranges and precisions vary. Typically these are called “double precision” in other languages, and are often 64 bits long. These values have a number of operations, including arithmetic operations and comparison operations.

    • Complex (complex). These are a pair of floating-point numbers of the form (a + bj), where a is the real part and b is the “imaginary” part. These values have a number of operations, including arithmetic operations and comparison for equality.

  • Sequence. The sequence types are collections of objects identified by their order or position, instead of a key. All sequences have a few operations to concatenate and repeat the sequence. Sequences have in and not in operations to determine if an item is part of the sequence. Additionally sequences have the [] operation which selects an item or a slice of items.

    • Immutable sequences are created as needed and can be used but never changed.
      • String (str). A string is a sequence of individual ASCII characters. Strings have a number of operations that return facts about the string or transform the string and create a new string.
      • Unicode (unicode). A Unicode string is a sequence of individual Unicode characters. Unicode strings have a number of operations that return facts about the string or transform the string and create a new string.
      • Tuple (tuple). A tuple is a sequence of Python items. It has a few operations for accessing individual items in the tuple.
    • Mutable sequences can be created, appended to, changed, and have elements deleted.
      • List (list). A list is a sequence of Python items. Operations like append() and pop() can be used to add items to or remove items from a list. Operations like sort() can change the order of the list.
  • Set. A set is a simple collection of objects. There is no ordering or key information. This makes them very efficient. Sets have add() and remove() operations, as well as in and not in operations.

  • Mapping. A mapping is a collection of objects identified by keys instead of order.

    • Dictionary (dict). A dictionary is a collection of objects (values) which are indexed by other objects (keys). It is like a sequence of key:value pairs, where keys can be found efficiently. Any Python object can be used as the value. Keys have a small restriction: mutable lists and other mappings cannot be used as keys. Dictionaries have the [] operation to select an element from the dictionary. Dictionaries have methods like has_key() to determine if a key is present in the dictionary. Dictionaries also have methods like items(), keys() and values() to produce sequences from the contents of the dictionary.
    • Default Dictionary. We had to import this from the collections package. The defaultdict behaved just like a dictionary in every respect but one. When we attempt to get a value that’s not in the dictionary, it evaluates a default function.
  • Callable. When we create a function with the def statement, we create a callable object. There are a number of attributes; for example, the __name__ and func_name attributes both have the function’s name. There is one important operation, “calling” the function. That is, applying the eval-apply cycle (see The Evaluate-Apply Rule for a review) to the function’s argument values.

  • File (file). Python supports several operations on files, most notably reading, writing and closing. Python also provides numerous modules for interacting with the operating system’s management of files.
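
The review above can be condensed into a quick tour. This is only a sketch with made-up sample data. It sticks to operations that current Python versions accept, so the key test uses the in operator rather than has_key(), and the long, unicode and file types are left out.

```python
from collections import defaultdict

# None: compare with the is operator, since there is only one instance.
colors = ["red", "black", "green"]
result = colors.sort()        # sort() reorders the list and returns None
print(result is None)         # True

# NotImplemented: the int addition method declines to handle a float...
declined = (1).__add__(2.0)
print(declined is NotImplemented)   # True
# ...so Python tries the float's alternative method instead.
print(1 + 2.0)                # 3.0

# Sequences: concatenate, repeat, test membership, select items and slices.
word = "abc"
print(word + "def")           # abcdef
print(word * 2)               # abcabc
print("b" in word)            # True
print(word[0])                # a
print(word[1:])               # bc

# Mutable sequence: a list can grow and shrink.
colors.append("blue")
last = colors.pop()
print(last)                   # blue

# Set: no ordering and no keys.
seen = set([1, 2, 3])
seen.add(4)
seen.remove(1)
print(3 in seen)              # True

# Dictionary: values selected by key.
wheel = {0: "green", 1: "red", 2: "black"}
print(wheel[1])               # red
print(2 in wheel)             # True

# Default dictionary: a missing key evaluates the default function, int().
counts = defaultdict(int)
for color in ["red", "black", "red"]:
    counts[color] += 1
print(counts["red"])          # 2

# Callable: a function is an object with attributes and one key operation.
def double(x):
    return 2 * x

print(double.__name__)        # double
print(double(21))             # 42
```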

Data, Processing and Philosophy – What Does It All Mean?

Beginning in Instant Gratification : The Simplest Possible Conversation we’ve been creating, manipulating and accessing Python objects without asking the deep, philosophical question “What is an object?”

As with other real-world things, it’s easier to provide a lot of examples than it is to work up an elaborate, legalistic definition. Objects are like art: I can’t define it, but I know what I like. As hard as it is, we’ll give the definition a whirl, because it does help some people write better software.

Each object encapsulates both data and processing into a single definition. We’ll sometimes use synonyms and call these two facets structure and behavior, attributes and operations or instance variables and method functions. The choice of terms depends on how philosophical or technical we’re feeling. The structure and behavior terms are the most philosophical; the attribute and operation terms are generic object-oriented design terms. Instance variables and method functions are the specific ways that Python creates attributes and operations to reflect structure and behavior.

In Python, we can understand objects by looking at a number of features, adapted from [Rumbaugh91].

  • Identity. An object is unique and is distinguishable from all other objects. In the real world, two identical coffee cups occupy different locations on our desk. In the world of a computer’s memory, objects can be identified by their address. Unless we do something special, the built-in id() function gives us a hint about the memory location of an object, revealing the distinction between two objects. We can see this by evaluating id("abc"), id("defg"), which shows that two distinct objects are being examined.

  • State. Many objects have a state, and that state is often changeable. The object’s current state is described by its attributes, implemented as instance variables in Python.

    Our two nearly identical coffee cups have distinguishing attributes. The locations (back-left corner of desk, on the mouse pad) and the ages (yesterday’s, today’s) are attributes of each cup of coffee. I can change the location attribute by moving a cup around. Even if both cups are on the back-left corner, the cups have unique identity and remain distinct. I can’t easily change the age; today’s coffee remains today’s coffee until enough time has passed that it becomes yesterday’s coffee.

    In the software world, my two strings ("abc" and "defg") have different attribute values. Their lengths are different, and they respond differently to various method functions like upper() and lower().

    As a special case, some objects can be stateless. While most objects have a current state, it is possible for an object to have no attributes, making it like a function. Such objects have no hysteresis – no memory of any previous actions.

  • Behavior. Objects have behavior. The object’s behavior is defined by its operations, or, in Python terminology, its method functions. Some objects can be termed “passive” because they are used by other objects, and don’t do much processing. Some objects can be termed “active” because they do considerable processing. These distinctions are arbitrary; some objects have both passive and active methods.

    A coffee cup really only has a few behaviors: it admits additional coffee (to a limit), it stores a finite amount of coffee, and coffee can be removed. Coffee cups are passive and don’t initiate these behaviors. The coffee machine, however, is an active object. The coffee machine has a timer, and can perform its behavior of making coffee autonomously.

    String objects have a large number of behaviors, defined by the method functions, many of which we looked at in Sequences of Characters : str and Unicode. All of our collection classes can be considered as passive objects.

  • Classification. Objects with the same attributes and behavior belong to a common class. Both of our string objects ("abc" and "defg") belong to a common class because they have the same attributes (a string of characters) and the same behavior.

  • Inheritance. A class can inherit operations and attributes from a parent class, reusing common features. A superclass is a generalization. A subclass overrides superclass features or adds new features, and is a specialization.

    Both of our coffee cups are instances of cup, which is a subclass of a more general class, “drinking vessel”. This more general class includes other subclasses like glassware and stemware.

    When we described the string data type, we put it into a broader context called sequence and emphasized the common features that all sequence types had. We also emphasized the unique features that defined the various subclasses of sequence. All of the sequence types have the [] operator to select an individual item. Only strings, however, had an upper() method function. Only lists had the append() method function.

  • Polymorphism. A general operation, named in a superclass, can have different implementations in the various subclasses. We saw this when we noted that almost every class in Python has a + operation. Between two floating-point numbers the + operation adds the numbers; between two lists, however, the + operation concatenates the lists. Because objects of these distinct classes respond to a common operator, they are polymorphic.
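
Several of these features can be observed with nothing but built-in types. The following is a rough sketch; the sample lists are invented for the purpose, and the bool and int pair is an illustration of inheritance that does not come from the discussion above.

```python
# Identity: two lists with equal values are still two distinct objects.
first = ["cup", "coffee"]
second = ["cup", "coffee"]
print(first == second)            # True: same value
print(first is second)            # False: different identity
print(id(first) == id(second))    # False

# State: a mutable object's state can change; its identity does not.
identity_before = id(first)
first.append("sugar")
print(len(first))                     # 3
print(id(first) == identity_before)   # True

# Behavior and classification: both strings share one class,
# so both respond to the same method functions.
print("abc".upper())                  # ABC
print("defg".upper())                 # DEFG
print(type("abc") is type("defg"))    # True

# Inheritance: bool is a built-in subclass of int and reuses its arithmetic.
print(issubclass(bool, int))          # True
print(True + True)                    # 2

# Polymorphism: one operator, different implementations.
print(1.5 + 2.25)                     # 3.75
print([1, 2] + [3])                   # [1, 2, 3]
```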

Program Design. Up to this point in our programming career, we’ve been looking at our information needs and the available Python structures. If it was a temperature, we used a number; for the color of a space on the Roulette wheel, we used a string. In the case of something more complex, like a pair of dice, we used a function which created a tuple.

As we become more sophisticated, we begin to see that the various types of data that are built-in to Python aren’t exactly what we need. It isn’t possible to foresee all possible problems. Similarly, it isn’t possible to predict all possible kinds of data and processing that will be required to solve the unforeseeable problems. That’s why Python lets us define our own, brand-new types of data.

Class Definition. Python permits us to define our own classes of objects. This allows us to design an object that is an exact description of some part of our problem. We can design objects that reflect a pair of dice, a Roulette wheel, or the procedure for playing the game of Craps. A class definition involves a number of things.

  • The name of the new class.
  • An optional list of any classes that are the basis for this class definition. If there are any, we call these other classes the superclasses for our new class. Generally, we’ll use the class object as the superclass for our class definitions.
  • All of the method functions for this new class. Each method is, in effect, another function of this class. Defining a method function is just like defining a function, and involves three things.
    • The name of the method function.
    • A list of zero or more parameters to this function. In order to identify the specific object instance, all method functions have one mandatory parameter.
    • A suite of statements for this method function.

The object’s attributes (also called instance variables) are not formally defined as part of the class. They are generally created by a special method function that is executed each time an object is created. This initialization method function is allocated responsibility for creating the object’s instance variables and assigning their initial values.
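
The pieces of a class definition listed above might look like the following sketch. The class name Dice, its faces instance variable and its roll() and total() methods are all hypothetical, chosen only to illustrate the pattern.

```python
import random


class Dice(object):
    """A hypothetical pair of dice; every name here is illustrative."""

    def __init__(self):
        # The initialization method creates the instance variables
        # and assigns their initial values.
        self.faces = (1, 1)

    def roll(self):
        # self is the one mandatory parameter; it identifies the
        # specific object whose method function is being used.
        self.faces = (random.randint(1, 6), random.randint(1, 6))
        return self.faces

    def total(self):
        # The suite of statements can use any instance variable.
        return self.faces[0] + self.faces[1]


dice = Dice()
print(dice.total())    # 2, from the initial values
dice.roll()
print(dice.faces)
print(dice.total())
```

The class statement names the new class, gives object as the superclass, and contains one def statement for each method function.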

Object Creation. After we define the class, we can create instances of the class. Every object is an instance of one or more classes. Each object will have unique identity; it will have a distinct set of instance variables; it will be identified by a unique object identifier. Objects have an internal state, defined by the values assigned to the object’s instance variables. Additionally, each object has behavior based on the definitions of the method functions. An object is said to encapsulate a current state and a set of operations.

Because every object belongs to one or more defined classes, objects share a common definition of their attributes and methods. The class definition can also specify superclasses, which helps provide method functions. We can build a family tree of classes and share superclass definitions among a variety of closely-related subclasses.
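
To illustrate object creation, here is a minimal sketch built on the coffee cup analogy. The CoffeeCup class, its location and age instance variables and its move() method are assumptions made up for this example.

```python
class CoffeeCup(object):
    """Hypothetical class modeled on the coffee cup analogy."""

    def __init__(self, location, age):
        # Each instance gets its own distinct set of instance variables.
        self.location = location
        self.age = age

    def move(self, new_location):
        # Changing an instance variable changes this object's state only.
        self.location = new_location


# Two objects created from one class definition.
yesterday = CoffeeCup("back-left corner of desk", "yesterday's")
today = CoffeeCup("on the mouse pad", "today's")

today.move("back-left corner of desk")

print(yesterday.location == today.location)   # True: same attribute value
print(yesterday is today)                     # False: still two objects
print(yesterday.age)                          # yesterday's

# Each cup is an instance of CoffeeCup and also of the superclass, object.
print(isinstance(today, CoffeeCup))           # True
print(isinstance(today, object))              # True
```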

It helps to treat each class definition as if the internal implementation details were completely opaque. A class should be considered as if it were a contract that specifies what the class does, but keeps private all of the details of how the class does it. All other objects within an application should use only the defined methods for interacting with an object. When we use a list’s append() method, we know what will happen, but we don’t know precisely how the list object adds the new item to the end of the list. Unlike Java and C++, Python has a relatively limited mechanism for formalizing this distinction between the defined interface and the private implementation of a class.
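
One widely used convention, which the paragraph above does not spell out, is to begin the name of an internal detail with a single underscore. The sketch below assumes a hypothetical Wheel class with 37 numbered bins; the names and the bin count are illustrative only.

```python
class Wheel(object):
    """Hypothetical wheel; only bin_count() is the intended interface."""

    def __init__(self):
        # The leading underscore marks this as an implementation detail.
        # It is a convention among programmers, not an enforced rule.
        self._bins = list(range(37))

    def bin_count(self):
        # Clients ask through the defined method function.
        return len(self._bins)


wheel = Wheel()
print(wheel.bin_count())    # 37

# Python does not prevent direct access; the contract is a matter of
# discipline rather than language enforcement.
print(len(wheel._bins))     # 37
```

If the list were later replaced by some other structure, code that used only bin_count() would keep working, while code that reached into _bins would not.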

Life Cycle of an Object. Each object in our program has a life cycle. The following is typical of most objects.

  • Definition. The class definition is read by the Python interpreter or it is built-in to the language. Class definitions are created by the class statement. Examples of built-in classes include files, strings, sequences, sets and mappings. We often collect our class statements into a file and import the class definitions to a program that will use them.
  • Construction. An object is constructed as an instance of a class: Python allocates memory that it will use for tracking the unique ID of the object, storing the instance variables, and associating the object with the class definition. An __init__() method function is executed to initialize the attributes of the newly created instance.
  • Access and Manipulation. The object’s methods are called (similar to function calls we covered in Better Arithmetic Through Functions) by client objects, functions or scripts. There is a considerable amount of collaboration among objects in most programs. Methods that report on the state of the object are sometimes called accessors; methods that change the state of the object are sometimes called manipulators.
  • Garbage Collection. Eventually, there are no more references to this instance. For example, consider a variable with an object reference which is part of the body of a function. When the function finishes, the variable no longer exists. Python detects this, and removes the object from memory, freeing up the storage for subsequent reuse. This freeing of memory is termed garbage collection, and happens automatically. See Garbage Collection for more information.
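
The four stages can be watched in a small experiment. This sketch uses the standard weakref module, which is not covered above, purely as a way to observe an object without keeping it alive. The Marker class and the use_marker() function are hypothetical. The exact moment at which memory is reclaimed is an implementation detail, so gc.collect() is called to request a collection explicitly.

```python
import gc
import weakref


# Definition: the class statement is read by the interpreter.
class Marker(object):
    """Hypothetical class used only to watch one object's life cycle."""

    def __init__(self, label):
        self.label = label

    def describe(self):
        return "marker " + self.label


def use_marker():
    # Construction: memory is allocated and __init__() runs.
    local = Marker("temporary")
    # A weak reference observes the object but does not count as a
    # reference that keeps it in memory.
    watcher = weakref.ref(local)
    # Access: a method function reports on the object's state.
    description = local.describe()
    return watcher, description


watcher, description = use_marker()

# The variable named local vanished when the function finished, so no
# ordinary references to the Marker object remain.
gc.collect()

print(description)          # marker temporary
print(watcher() is None)    # True: the object has been removed
```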

Important

Class and Instance

Once we’ve defined the class, we only use the class to make individual objects. Objects – instances of a class – do the real work of our program.

When we ask a string to create an upper case version of itself ("hi mom".upper()), we are asking a specific object ("hi mom") to do the work. We don’t ask the general class definition of string to do this. The meaning of str.upper() isn’t very clear.

This can be a little mystifying when we start to define our own classes. The problem usually stems from confusing class definitions with function definitions. We don’t use instances of a function for anything, we use the function itself. Functions, consequently, are a bad model of how class definition works. Classes are a kind of factory for creating objects. Objects do the real work.

The most important examples to keep in mind are string objects, file objects and list objects. These are the most typical examples of the kinds of objects we’ll create. Each string (or file or list) object is an instance of the respective class definition.

Under the hood, the definition of a class creates a new class object. This class object is used to create the instance objects that do the work of our program. The class object is mostly just a container for the suites of statements that define the method functions of a class. Additionally, a class object can also own class-level variables; these are, in effect, shared by each individual object of that class. They become a kind of semi-global variable, shared by objects of a given class.
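
A brief sketch shows the difference between a class-level variable and an instance variable. The Die class, its sides variable and its face variable are hypothetical names chosen for illustration.

```python
class Die(object):
    """Hypothetical die with one class-level variable."""

    # Class-level variable: owned by the class object, shared by instances.
    sides = 6

    def __init__(self, face):
        # Instance variable: each object has its own.
        self.face = face


a = Die(2)
b = Die(5)
print(a.sides)    # 6
print(b.sides)    # 6

# Changing the class-level variable is visible through every instance.
Die.sides = 20
print(a.sides)    # 20
print(b.sides)    # 20

# The instance variables stay separate.
print(a.face)     # 2
print(b.face)     # 5

# The class is itself an object.
print(isinstance(Die, type))    # True
```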

Object and Class Exercises

  1. Object Identification.

    When we evaluate an expression as simple as 3+5, Python creates an integer object with a value of 3, an integer object with a value of 5, then applies a method function to add these values, and create a new object which is the sum.

    Look at some of your earlier exercises in Arithmetic and Expressions and identify all of the objects in a given expression. Pay particular attention to each operator (like +, -, * or /) which will create a new object.

    Since () merely group expressions, do they create new objects?

  2. Iterator Objects.

    In While We Have More To Do : The for Statement we looked at the for statement. In Basic Sequential Collections of Data we looked at how the for statement iterates through a sequence. In Looping Back : Iterators, the for statement and Generators we looked at the iterator and how the for statement makes use of this iterator. A sequence has a method function (iter) which creates the iterator which yields each item of the sequence so the for statement can assign them to a variable.

    When you evaluate iter( [ 1,2,3] ), for example, you can see the iterator object being created. This iterator object has a cryptic-looking name, for example, <listiterator object at 0x107afd0>.

    Each time you evaluate something like iter( [ 1,2,3] ) you get a slightly different response. Does this indicate that a new object is being created? Does this make sense? If each object can capture a unique state, does this mean each iterator is independent?

    If we create a single list, for example a=range(100), can we have multiple iterators which provide different views of the same list object, a?

  3. Temporary Objects in Functions.

    When a function is being evaluated, objects will be created by each operation in each expression. What happens to those objects? Does any object persist after the function’s evaluation is complete? What happens to the object created by (or named in) the return statement?

    Look at some of your earlier exercises in Organizing Programs with Function Definitions. Identify the life-span of all objects created by a specific function.

Class FAQs

Why define classes? Isn’t it simpler to have a group of related functions?

In a sense, a class is a group of related functions. However, the formal class definition allows you to do some additional things that would be inconvenient with a group of related functions.

First, the class acts like a common name for the functions, saving you from having to write elaborate prefixes on your functions. For example, if you had a group of functions to work with a block of stocks, you might prefix each name with sb_ to assure that each function name was unique. This is error-prone and tedious.

Second, the class allows easy sharing of instance variables. All of the instance variables are collected under self, assuring that they are available to each function.

Third, and more important than these technical considerations, are the mental tools of abstraction and encapsulation. When we define classes, we encapsulate some functions so we can set aside the details. This allows us to reduce the complexity of a class to a somewhat simpler abstraction, namely, the functions that interface to the class, not the details of how the class works internally. Thinking at this more abstract level lets us wrap our finite brains around slightly larger problems.
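
The first two points can be compared side by side. In this sketch the sb_value() and sb_gain() functions, the StockBlock class and the sample numbers are all invented for illustration.

```python
# Style 1: related functions with a shared prefix. Every call must be
# handed all of the data it needs.
def sb_value(shares, price):
    return shares * price


def sb_gain(shares, price, current_price):
    return (current_price - price) * shares


# Style 2: a class. The class name replaces the prefix, and the
# instance variables travel together under self.
class StockBlock(object):
    """Hypothetical block of stock."""

    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price

    def value(self):
        return self.shares * self.price

    def gain(self, current_price):
        return (current_price - self.price) * self.shares


print(sb_value(100, 12.5))         # 1250.0
print(sb_gain(100, 12.5, 14.0))    # 150.0

block = StockBlock("XYZ", 100, 12.5)
print(block.value())               # 1250.0
print(block.gain(14.0))            # 150.0
```

With the class, a client creates the block once and then asks it questions; the shares and purchase price no longer need to be passed to every call.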

Can I use something shorter than self?

Yes and no. Yes, the language allows any variable name for the self variable. No, if you do, you will find it hard to share your Python programs with other people. The name self is very well established. A number of tools expect it. Most importantly, the other Python programmers from whom you may get software will expect it, also.

What is the advantage of having objects collaborate?

Our first programs tended to have a single, long procedure that made use of many objects. Once we’ve got the hang of programming, we can break that long procedure down into separate pieces, and assign each piece to a method function of an object. Breaking long procedures down and allocating them to separate objects is part of our divide and conquer life-style.

As we get more proficient with object definition, we can examine a programming problem by writing a description and looking at the nouns and verbs in that description. The nouns will become objects, defined by classes. The verbs will become method functions. This collaboration among the nouns will fit naturally with the original description of the problem, giving us confidence that our program will work.
