Peeking Under the Hood

This is some additional background in Python and programming. We’ll talk a little about what it means to execute the statements in a program and evaluate expressions in Execution – Two Points of View. We’ll provide some style notes in Expression Style Notes.

Currency amounts fall into the cracks between integers (no decimal places) and floating-point numbers (variable number of decimal places). Currency requires a fixed number of digits after the decimal point. We’ll talk about one way to handle fixed point math in One Way To Tackle Fixed Point Math.

On a minimally-related topic, we’ll look more closely at the two division operators in The Two Specialized Division Operators: / and //.

Execution – Two Points of View

What does it mean when a computer “does” a specific task? This is the essential, inner mystery of programming. There are two overall approaches to specifying what should happen inside the computer. Most modern languages are a mixture of both approaches. These two approaches are sometimes called functional and procedural, or applicative and imperative. Since the programming language business is very competitive, any term we choose is loaded with meaning and many hairs get split in these conversations. We’ll look at both the applicative and imperative views of Python, because Python uses each approach where it is appropriate.

Applicative Approach. Functional or applicative programming is characterized by a style that looks a lot like conventional mathematics. Functions are applied to argument values using an evaluate-apply cycle.

We can see the applicative approach when we look, for example, at f = 32 + \frac{9c}{5}. We start with “evaluate c” to get its current value (for example, 18); apply a multiply operation using 9 and the current value of c; apply a divide operation with the previous result (9c) and 5; apply an addition operation with 32 and the previous result (\frac{9c}{5}). The result is 64.4.

We call this process “expression evaluation”. We expect our programming language to apply math-like operations and functions using math-like rules: apply the parenthesized operations first, apply the high priority operations (like multiply and divide) in preference to low priority operations (like add and subtract).

Python has some sophisticated expression operators. Some of them transcend the simple add-subtract-multiply-divide category, and include operators that apply a function to a list to create a new list, apply a function to filter a list and apply a function to reduce a list to a single value.
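As a small illustration of these list-oriented operations, the sketch below uses the built-in map() and filter() along with functools.reduce(). The temperature readings and variable names are invented for this example, and the code assumes Python 3, where reduce() lives in the functools module and / produces a floating-point result.

```python
from functools import reduce

# Hypothetical Celsius readings, made up for this illustration.
celsius = [18, 0, 37, -5]

# Apply a function to each item to create a new list.
fahrenheit = list(map(lambda c: 32 + 9 * c / 5, celsius))

# Apply a function to filter a list: keep readings above freezing.
above_freezing = list(filter(lambda c: c > 0, celsius))

# Apply a function to reduce a list to a single value: the sum.
total = reduce(lambda a, b: a + b, celsius)

print(fahrenheit)
print(above_freezing)
print(total)
```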

When we evaluate a function like abs(-4), we name the -4 an argument to the function abs(). When looking at 3+4, we could consider 3 and 4 to be argument values to the +() function. We could – hypothetically – imagine rewriting 3+4 to be +(3,4) just to show what it really means.
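Python’s standard operator module makes this hypothetical rewriting concrete: it provides ordinary functions, such as add() and mul(), that correspond to the + and * operators. A minimal sketch (the variable names are only for illustration):

```python
import operator

# 3+4 written as a function applied to two argument values.
total = operator.add(3, 4)

# abs(-4) is already written in function form.
magnitude = abs(-4)

print(total, magnitude)
```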

Imperative Approach. On the other hand, the imperative style is characterized by using a sequential list of individual statements. Donald Knuth, in his Art of Computer Programming [Knuth73], shows a language he calls Mix. It is a purely imperative language, and is similar to the hardware languages used by many computer processor chips.

The imperative style lists a series of commands that the machine will execute. Each command changes the value of a register in the central processor, or changes the value of a memory location. In the following example, each line contains the abbreviation for a command and a reference to a memory location or a literal value. Memory locations are given names to make them easy to read. Literal values are surrounded by =. The following fragment uses memory locations named C and F, as well as a processor register.

LDA C
MUL =9=
DIV =5=
ADD =32=
STA F

This first command loads the processor’s A register with the value at memory location C. The second command multiplies the register by 9. The third command divides the register by 5. The next command adds 32 to the register. The final command stores the contents of the A register into the memory location of the variable F.
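The same five steps can be mimicked in Python by treating a variable as the A register. This is only an analogy, not real Mix code: the names memory and a are invented for the sketch, and the divide step assumes Python 3’s floating-point / so that the result matches the 64.4 computed earlier.

```python
# Memory locations, modeled as a dictionary (an assumption of this sketch).
memory = {"C": 18, "F": None}

a = memory["C"]   # LDA C    : load register A from location C
a = a * 9         # MUL =9=  : multiply the register by 9
a = a / 5         # DIV =5=  : divide the register by 5
a = a + 32        # ADD =32= : add 32 to the register
memory["F"] = a   # STA F    : store register A into location F

print(memory["F"])
```

Each line changes the state of one variable, which is the defining habit of the imperative style.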

Python. Python, like many popular languages, has elements drawn from both applicative and imperative realms. We’ll focus initially on expressions and expression evaluation, minimizing the imperative statements. We’ll then add various procedural statements to build up to the complete language.

The basic rule is that each statement is executed by first evaluating all of the expressions in that statement, then performing the statement’s task. The evaluation of each expression is done by evaluating the parameters and applying the functions to the parameters.

This evaluate-apply rule is so important, we’ll repeat it here so that you can photocopy this page and make a counted cross-stitch sampler to hang over your computer. Yes, it’s that important.

Important

The Evaluate-Apply Rule

Each statement is executed by (1) evaluating all of the expressions in that statement, then (2) performing the statement’s task.

The evaluation of an expression is done by (1a) evaluating all parameters and (1b) applying the function to the parameters.

Example: (2+3)*4, evaluates two parameters: 2+3 and 4, and applies the function *. In order to evaluate 2+3, there are two more parameters: 2 and 3, and a function of +.
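To make the rule concrete, here is a toy evaluator. It assumes a made-up representation in which an expression is either a plain number or a tuple of (function, left, right). This is not how Python stores expressions internally; it is only a sketch of the evaluate-apply cycle, and the name evaluate is hypothetical.

```python
import operator

def evaluate(expr):
    """Evaluate a nested (function, left, right) tuple or a plain number."""
    if not isinstance(expr, tuple):
        return expr  # a literal value evaluates to itself
    function, left, right = expr
    # (1a) evaluate all parameters first ...
    left_value = evaluate(left)
    right_value = evaluate(right)
    # (1b) ... then apply the function to the evaluated parameters.
    return function(left_value, right_value)

# (2+3)*4 in the tuple representation
expression = (operator.mul, (operator.add, 2, 3), 4)
print(evaluate(expression))  # prints 20
```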

While it may seem excessive to belabor this point, many programming questions arise from a failure to fully grasp this concept. We’ll return to it several times, calling it the evaluate-apply cycle. For each feature of the language, we need to know what happens when Python does its evaluation. This is what we mean by the semantics of a function, statement or object.

Another Imperative Example. Here’s another example of the imperative style of programming. This style is characterized by using a sequential list of individual statements. This imperative language is used internally by Python.

In the following example, each line contains an offset, the abbreviation for a command and a reference to a variable name or a literal value. Variable names are resolved by Python’s namespace rules. The following fragment uses a variable named c.

2           0 LOAD_FAST                0 (c)
            3 LOAD_CONST               1 (9)
            6 BINARY_MULTIPLY
            7 LOAD_CONST               2 (5)
           10 BINARY_DIVIDE
           11 LOAD_CONST               3 (32)
           14 BINARY_ADD

This first command (at offset 0) pushes the object associated with variable named c on the top of the arithmetic processing stack. The second command (at offset 3) loads the constant 9 on the top of the stack. The third command (at offset 6) multiplies the top two values on the stack. This leaves a new value on the top of the stack.

The fourth command (at offset 7) pushes a constant 5 onto the stack. The fifth command (at offset 10) performs a division operation between the top two values on the stack.

The sixth command (at offset 11) pushes a constant 32 onto the stack. Finally, the seventh command (at offset 14) performs an add operation between the top two values on the stack.
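You can produce a listing like this yourself with the standard dis module. The function name to_fahrenheit is made up for the example. The exact instruction names and offsets depend on the Python version, so the output will probably not match the listing above line for line.

```python
import dis

def to_fahrenheit(c):
    # The same expression as the listing above: multiply by 9,
    # divide by 5, then add 32.
    return c * 9 / 5 + 32

# Print the internal instructions Python uses for this function.
dis.dis(to_fahrenheit)
```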

Expression Style Notes

There is considerable flexibility in the language; two people can arrive at different presentations of Python source. Throughout this book we will present the guidelines for formatting, taken from the Python Enhancement Proposal (PEP) 8, posted on http://www.python.org/dev/peps/pep-0008/.

Python programs are meant to be readable. The language borrows a lot from common mathematical notation and from other programming languages. Many languages (C++ and Java, for instance) don’t require any particular formatting; line breaks and indentation become merely conventions; bad-looking, hard-to-read programs are possible. Python makes the line breaks and indentations part of the language, forcing you to create programs that are easier on the eyes.

Spaces are used sparingly in expressions. Spaces are never used between a function name and the ()’s that surround the arguments. It is considered poor form to write:

int (22.0/7)

Instead, we prefer:

int(22.0/7)

A long expression may be broken up with spaces to enhance readability. For example, the following separates the multiplication part of the expression from the addition part with a few wisely-chosen spaces.

b**2 - 4*a*c

One Way To Tackle Fixed Point Math

In Floating-Point Numbers, Also Known As Scientific Notation, we saw that floating-point numbers are for scientific and engineering use and don’t work well for financial purposes. US dollar calculations, for example, are often done in dollars and cents, with two digits after the decimal point.

If we try to use floating-point numbers for dollar values, we have problems. Specifically, the slight discrepancy between binary-coded floating-point numbers and decimal-oriented dollars and cents becomes a serious problem. Try this simple experiment.

>>> 2.35
2.3500000000000001
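Recent versions of Python display the shortest text that converts back to the same number, so the interactive prompt may simply show 2.35. The discrepancy is still there. The sketch below asks for 17 significant digits to reveal it, and also shows a sum of dimes drifting away from the expected total. The variable name total is only for illustration.

```python
# Ask for 17 significant digits to reveal the stored binary approximation.
print(format(2.35, '.17g'))

# Ten dimes, added as floating-point dollars, do not total exactly 1.00.
total = sum([0.10] * 10)
print(total == 1.0)   # prints False
print(total)
```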

There’s a classic trick that can be used to solve this problem: use scaled numbers. When doing dollars and cents math, you can scale everything by 100, and do the math in pennies. When you print the final results, you can scale the final result into dollars with pennies to the right of the decimal point. This section will provide you some pointers on doing this kind of numeric programming.

Later, in Fixed-Point Numbers : Doing High Finance with decimal we’ll look at the decimal module, which does this in a more sophisticated and flexible way.

Scaled Numbers. When we use scaled numbers, it means that the proper value is represented as the scaled value and a precision factor. For example, if we are doing our work in pennies, the value of $12.99 is represented as a scaled value of 1299 with a precision of 2 digits. The precision factor can be thought of as a power of 10. In our case of 12.99, our precision is 2. We can multiply by 10^{-precision} to convert our scaled number into a floating-point approximation.
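A sketch of that conversion follows. It also shows an alternative that avoids floating-point altogether by splitting the scaled integer with divmod(). The function name to_text is invented here, and the sketch assumes a non-negative amount with a precision of at least 1.

```python
def to_text(scaled, precision):
    """Format a non-negative scaled integer as fixed-point text."""
    factor = 10 ** precision
    whole, fraction = divmod(scaled, factor)
    # Pad the fractional part with leading zeros, e.g. 5 -> "05".
    return "%d.%0*d" % (whole, precision, fraction)

scaled, precision = 1299, 2

# Floating-point approximation: multiply by 10 to the power -precision.
approximate = scaled * 10 ** -precision
print(approximate)

# Exact text form built from integers only.
print(to_text(scaled, precision))  # prints 12.99
print(to_text(1205, 2))            # prints 12.05
```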

We have three cases to think about when doing fixed-point math using scaled integers: addition (and subtraction), multiplication and division. Addition and subtraction don’t change the precision. Multiplication increases the precision of the result and division reduces the precision. So, we’ll need to look at each case carefully.

Addition and Subtraction. If our two numbers have the same precision, we can simply add or subtract normally. This is why we suggest doing everything in pennies: the precisions are always 2, which always match. If our two numbers have different precisions, we need to shift the smaller precision number. We do this by multiplying by an appropriate power of 10.

What is $12.00 + $5.99? Assume we have 12 (the precision is dollars) and 599 (the precision is pennies). We add them like this: 12*100 + 599. We applied the penny precision factor of 100 to transform dollars into pennies.
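In code, this alignment step might look like the following sketch. The helper name add_scaled is hypothetical; each amount is passed as a scaled integer together with its precision.

```python
def add_scaled(a, a_precision, b, b_precision):
    """Add two scaled integers; return (sum, precision of the sum)."""
    precision = max(a_precision, b_precision)
    # Shift each value up to the larger precision with a power of 10.
    a_shifted = a * 10 ** (precision - a_precision)
    b_shifted = b * 10 ** (precision - b_precision)
    return a_shifted + b_shifted, precision

# $12 (precision 0) plus $5.99 (precision 2)
print(add_scaled(12, 0, 599, 2))  # prints (1799, 2)
```

The result, 1799 with a precision of 2, represents $17.99.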

Multiplication. When we multiply two numbers, the result has the sum of the two precisions. If we multiply two amounts in pennies (2 digits to the right of the decimal point), the result has 4 digits of precision. We have to be careful when doing this kind of math to determine the rounding rules, and correctly scale the result.

What is 7.5% of $135.99? Assume we have 13599 (the precision is pennies, 2 digits after the decimal point) and 75 (the precision is 10th of a percent, three digits to the right of the decimal point). When we multiply, our result will have precision of 5 digits to the right of the decimal point. The result (1019925) represents $10.19925. We need to both round and shift this back to have a precision of 2 digits to the right of the decimal point.

We can both round and scale with an expression like the following. The *.001 resets the scale from 5 digits of precision to 2 digits of precision.

>>> int(round(13599L*75,-3)*.001)
1020

This means 7.5% of $135.99 is $10.20.
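The expression above passes through floating-point on its way to the answer. As an alternative sketch (not the approach used above), the same rounding can be done with integers only: add half of the divisor, then divide with the // operator. This rounds halves upward and assumes non-negative amounts; the variable names are invented for the example.

```python
amount = 13599   # $135.99, precision 2
rate = 75        # 7.5%, precision 3 (0.075)

product = amount * rate        # precision 2 + 3 = 5
print(product)                 # prints 1019925

# Drop 3 digits of precision, rounding to the nearest penny.
tax = (product + 500) // 1000
print(tax)                     # prints 1020
```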

Division. When we divide two numbers, the result’s precision is the difference between the numerator’s and denominator’s precision. If we divide two amounts in pennies (2 digits of precision), the result has zero digits of precision. Indeed, the result is actually a ratio between penny amounts, and isn’t actually in pennies. We have to be careful when doing this kind of math to determine the rounding rules, and correctly scale the answer.

Generally, if we want 2 digits of precision in our result, we need to be sure the numerator’s precision is at least 2 digits more than the denominator’s precision. This means scaling the numerator first, then doing the division. If the numerator has too much precision to begin with, we’ll have to round and then scale the result after division.

Say we have a bill of $45,276 for 416.15 hours of labor. What is the rate in dollars per hour, to the penny? Our hours have a precision of two digits, 41615, with a precision factor of 100. We need our dollars to start with five digits of precision because we want two digits in the answer, we’ll lose two when dividing by hours, and we want one more digit so we can round properly. We’ll represent the dollars as 4527600000 with a precision factor of 100000. The division gives us 108797, with a precision factor of 1000. This can be rounded correctly and divided by 10 to get the value to the penny, properly rounded, of 10880, which means $108.80.

>>> int(round(45276L*100000/41615,-1)*.1)
10880

This means that the labor rate was $108.80 per hour.
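Here is the same calculation as an integer-only sketch, with each precision written out in the comments. It uses the // operator so that it behaves the same way regardless of how / is defined; the variable names are made up for the example.

```python
dollars = 45276   # whole dollars, precision 0
hours = 41615     # 416.15 hours, precision 2

# Scale the numerator to precision 5 so the quotient has precision 3.
numerator = dollars * 100000
quotient = numerator // hours
print(quotient)                # prints 108797

# Round away the extra digit to get pennies (precision 2).
rate = (quotient + 5) // 10
print(rate)                    # prints 10880
```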

The Bigger Picture. Whew! It looks like the special cases of adding (and subtracting), multiplying and dividing are really complex. Actually, they aren’t too bad, they’re just new to you.

There’s a trick to this, and the trick is to begin with the goal in mind and work backward to what data we need to satisfy our goal. For adding and subtracting, our goal precision can’t be different from our input precision. When multiplying and dividing, we work backwards: we write down our goal precision, we write down the precision from our calculation, and we work out rounding and scaling operations to get from our calculation to our goal.
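One way to capture the rounding and scaling step is a small helper that moves a scaled integer from the precision a calculation produced to the goal precision. The name rescale and its round-half-up rule are assumptions of this sketch. Real financial code needs whatever rounding rule the application requires, and negative amounts are not handled here.

```python
def rescale(value, from_precision, to_precision):
    """Convert a non-negative scaled integer between precisions."""
    shift = from_precision - to_precision
    if shift <= 0:
        # Goal has more (or equal) digits: multiply, no rounding needed.
        return value * 10 ** (-shift)
    divisor = 10 ** shift
    # Goal has fewer digits: add half the divisor, then floor-divide.
    return (value + divisor // 2) // divisor

print(rescale(1019925, 5, 2))  # prints 1020
print(rescale(108797, 3, 2))   # prints 10880
print(rescale(12, 0, 2))       # prints 1200
```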

It turns out that this trick is essential to programming. We’ll return to it time and again.

The Two Specialized Division Operators: / and //

Python 2 harbors an assumption that – it turns out – is a bad idea. Python 3 will fix this by removing the assumption.

While most features of Python correspond with common expectations from mathematics and other programming languages, the division operator, /, has certain complexities. This is due to the lack of a common expectation for what division should mean. In a mathematics textbook, the author will provide additional explanations to clarify the precise meaning of an operator. In a Python program, also, we need to clarify the precise meaning of the / operator.

A basic tenet of Python is that the data determine the result of an operation. For example, when we say 2+3, both numbers are plain integers, and the result is expected to be a plain integer. When we say 2+3.14, Python will coerce the 2 to be the mathematically equivalent 2.0; now both numbers are floating-point, and the answer can be a floating-point number.
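A quick way to watch this rule at work is to ask Python for the type of each result. This is a minimal sketch; the variable names are only for illustration.

```python
# Two integers: the result is an integer.
int_sum = 2 + 3
print(type(int_sum))

# An integer and a float: the 2 is coerced to 2.0, so the result is a float.
mixed_sum = 2 + 3.14
print(type(mixed_sum))

# The coerced sum is the same as the all-float sum.
print(mixed_sum == 2.0 + 3.14)   # prints True
```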

This rule meets most of our expectations for ordinary math. However, this doesn’t work out well for division because there are two different, conflicting expectations:

  • Sometimes we expect division to create precise answers, usually the floating-point equivalents of fractions.
  • Other times, we want a rounded-down integer result.

There’s no best answer and no real compromise. Sometimes we mean one and other times we mean the other. We need both kinds of division operations.

In Python 2, the going-in assumption is “data determines the answer,” which just doesn’t work for division.

In Python 3, this assumption will be removed. You can then specify which sense of division you meant.

To see the effect of this assumption, try the following to see what Python does.

355/113
355.0/113
355/113.0
355.0/113.0

The Unexpected Integer. Here are two examples of the classical definition of division. We’ve used the formula for converting 18 ° Celsius to Fahrenheit. The first version uses integers, and gets an integer result. The second uses floating-point numbers, which means the result is floating-point.

>>> 18*9/5+32
64
>>> 18.0*9.0/5.0 + 32.0
64.400000000000006

In the first example, we got an inaccurate answer from a formula that we are sure is correct. We expected a correct answer of 64.4, but got 64.

In Python 2, when a formula has a / operator, the inaccuracy will stem from the use of integers where floating-point numbers were more appropriate. (This can also occur using integers where complex numbers were implicitly expected.)

If we use floating-point numbers, we get a value of 64.4, which was correct. Try this and see.

18.0*9.0/5.0 + 32.0

The Problem. The problem we have is reconciling the basic rule of Python (data determines the result) and the two conflicting meanings for division. We have a couple of choices for the solution.

We can solve this by using explicit conversions like float() or int(). However, we’d like Python to be a simple and sparse language, without a dense clutter of conversions to cover the rare case of an unexpected type of data. So this isn’t ideal.

Instead, Python offers us two division operators.

  • For precise fractional results, the / will work nicely.
  • When we want division to simply compute the quotient, Python has a second division operator, //. This produces rounded-down integer answers, even if both numbers happen to be floating-point.
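The difference is easy to see side by side. This sketch assumes Python 3, or Python 2 with the new definition of / in effect, where / always means the precise quotient.

```python
# / produces the precise floating-point quotient.
print(355 / 113)

# // produces the rounded-down quotient.
print(355 // 113)       # prints 3
print(355.0 // 113.0)   # prints 3.0

# "Rounded down" means toward negative infinity, not toward zero.
print(-7 // 2)          # prints -4
```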

Old vs. New Division. While using Python 2, we need to specify which meaning of / should apply. Do we mean the original Python 2 definition (data type determines results)? Or do we mean the newer Python 3 meaning of / (exact results)?

Python 2 gives us two tools to specify the meaning of the / operator: a statement that can be placed in a program, as well as a command-line option that can be used when starting the Python program.

Program Statements to Control /. To ease the transition from older to newer language features, the statement from __future__ import division changes the definition of the / operator from Python 2 (depends on the arguments) to Python 3 (always produces floating-point).

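Note that __future__ has two underscores before and after future. Also, note that this statement must appear at the top of a script: only the module docstring, comments, blank lines and other from __future__ statements may come before it.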

Here’s the classic division:

>>> 18*9/5+32
64

Here’s the new division:

>>> from __future__ import division
>>> 18*9/5+32
64.400000000000006
>>> 18*9//5+32
64
  1. We set the future definition of the / operator.
  2. This line shows the new use of the / operator to produce precise floating-point results, even if both arguments are integers.
  3. This line shows the // operator, which always produces rounded-down results.

The from __future__ statement states that your script uses the new-style floating-point division operator. This allows you to start writing programs with Python 2 that will work correctly with all future versions.

By Python 3, this import statement will no longer be necessary. Python 3 still accepts it, where it has no effect, so it can be removed from the few modules that used it.

Tip

Debugging the from __future__ statement

There are two common spelling mistakes: omitting the double underscore from before and after __future__, and misspelling division.

  • If you get ImportError: No module named _future_, you misspelled __future__.
  • If you get SyntaxError: future feature divinizing is not defined, you misspelled division.

Command Line Options to Control /. Another tool to ease the transition is an option that we can use as part of the python command that starts the Python interpreter. This option can force a particular interpretation of the / operator or warn about incorrect use of the / operator.

The Python command-line option of -Q controls the meaning of the / operator.

  • If you run Python with -Qold, you get the classical Python 2 definition, where the / operator’s result depends on the arguments.
  • If you run Python with -Qnew, you get the new Python 3 definition, where the / operator’s result will be a precise floating-point fraction.

Here’s how it looks when we start Python with the -Qold option.

MacBook-5:~ slott$ python -Qold
Python 2.5.4 (r254:67917, Dec 23 2008, 14:57:27)
[GCC 4.0.1 (Apple Computer, Inc. build 5363)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 355/113
3
>>> 355./113.
3.1415929203539825
>>> 355.//113.
3.0
  1. Here is the python command with the -Qold option. This will set Python to do classical interpretation of the / operator.
  2. When we do old-style / division with integers, we get an integer result.
  3. When we do old-style / division with floating-point numbers, we get the precise floating-point result.
  4. When we do // division with floating-point numbers, we get the rounded-down result.

Here’s how it looks when we start Python with the -Qnew option.

MacBook-5:~ slott$ python -Qnew
Python 2.5.4 (r254:67917, Dec 23 2008, 14:57:27)
[GCC 4.0.1 (Apple Computer, Inc. build 5363)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 355/113
3.1415929203539825
>>> 355./113.
3.1415929203539825
>>> 355.//113.
3.0
  1. Here is the python command with the -Qnew option. This will set Python to do the new interpretation of the / operator.
  2. When we do new-style / division with integers, we get the precise floating-point result.
  3. When we do new-style / division with floating-point numbers, we get the precise floating-point result.
  4. When we do // division with floating-point numbers, we get the rounded-down result.

Why All The Options? There are two cases to consider here.

If you have an old program, you may need to use -Qold to force an old module or program to work the way it used to.

If you want to be sure you’re ready for Python 3, you can use -Qnew to be sure that you always have the “exact quotient” version of / instead of the classical version.

Important

Debugging The -Q Option

If you misspell the -Q option you’ll see errors like the following. If so, check your spelling carefully.

MacBook-5:~ slott$ python -Qwhat
-Q option should be `-Qold', `-Qwarn', `-Qwarnall', or `-Qnew' only
usage: Python [option] ... [-c cmd | -m mod | file | -] [arg] ...
Try `python -h' for more information.

If you get a message that includes Unknown option: -q, you used a lower-case q instead of an upper-case Q.
