Wrapping and Packaging Our Solution

There is considerable overlap between a library module and a main program script. The significant difference should be embodied in a small piece of programming we’ll examine in Script or Library? The Main Program Switch. Once we’ve looked at that, we can talk about the remaining features of a complete command-line program in The Standard Command-Line Interface.

The material on the exec statement in BTW – The exec Statement is here for completeness. Logically, there’s a nice symmetry between import and exec. As a practical matter, however, we rarely need to use exec.

Script or Library? The Main Program Switch

Back in Thinking In Modules, and the Declaration of Dependence, we identified two general species of Python modules: “main program” scripts and library modules. Some Python files do the main work of a program, while other files provide the definitions of classes and functions.

The library vs. script distinction is part of our intent when designing a module; there’s no formal way to state this in Python. Library modules can do some processing while being imported; a main module can provide some definitions as well as the main script. While this distinction is informal, the overall intent should be clear: a module either provides definitions or knits definitions together to do useful work.

The biggest and most obvious distinction is that the main program is the file run by the user. This can be an icon the user double-clicked, or a command the user typed at a command prompt. In either case, a single Python file initiates the processing. This is what makes a given Python file the “main” application.

If you look at Python application programs, you’ll see that the name of the application almost always matches one of the file names. For example, the IDLE application is launched by a file named idle.py. This file contains the main part of the application. IDLE has numerous other files, which contain class and function definitions.

Program Varieties. There are several subspecies of programs. We touched on this concept in Files are the Plumbing of a Software Architecture.

In this book, we’ve focused exclusively on command-line interface (CLI) programs because they are simpler to create. A richly interactive Graphical User Interface (GUI) program is generally more complex to build. Further, the core functionality for a GUI is often easiest to develop and debug as a CLI program. Once you have the CLI program working, you can wrap it up with a GUI.

To some programmers it seems more logical to design the user experience of a GUI first, and get the windows, menus, and buttons to work before anything else. “After all,” they argue, “the user’s interaction is the most important part of the software.” As a practical matter, however, this doesn’t work out well. It turns out to be far better to get the essential data and processing defined and working first. Once this works reliably and correctly, it’s easy to add a GUI to an already working program.

What this usually means is that we have the following structure.

  • One or more modules that define the essential work of the program. This is a “model” of the real world defined with Python objects.
  • We often write a command-line application script that imports the model.
  • We can also write a GUI application script that imports the model. This includes the graphical “view” and the “control” logic.

This clean separation between the modules that do the work and the modules that provide the user experience makes our life simpler in the long run because each side of the application can be focused on a particular part of the task.

We’ll return to these “varieties” of main programs in Architectural Patterns – A Family Tree.

Evolution. Programs are built up from modules. In some cases, a program evolves as a series of modules. First, we start with something really basic. Then we write a module that imports our first module, and implements better input and output. Then we figure out how the optparse module works and we write a module which imports the second and adds a better CLI. Then we write a GUI in GTK, which imports all of our previous modules. At each step, we are building additional features around the original small core of data or processing.

Sometimes, we create a program using someone else’s complete program. We might expand on someone else’s program or we might be knitting two programs together to make something new.

In all of these cases, we will have modules which can be used as main programs, but are also absorbed into a larger and more complex program. Python gives us a very elegant mechanism for turning a main program into a module that can be imported into a larger program.

The __name__ variable. The global __name__ variable is the name of the currently executing module. It helps us determine if a module is the main module – the module being run by Python – or a library module being imported.

When the __name__ variable is equal to '__main__', the initial (or top-level or outermost) file is being processed. When a module is being imported, the __name__ variable is the name of the module being imported.

If a module is the main program, it must do the useful work. If it is being imported, on the other hand, it is merely providing definitions to some other main program, and should do no work except provide class and function definitions.

You can type the following at the command line prompt in IDLE. If you want to experiment, create a file with just one line: print(__name__) and import this to see what it does.

>>> __name__
'__main__'

This __name__ variable allows a module to be used as both a main program and as a library for another program. This can be called the “main-import switch”, as it helps a module determine if it is the main program or it is an import into another main program. It gives us the ultimate flexibility to expand, refine and reuse our modules for a variety of purposes.
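As a minimal sketch of the switch, the helper below classifies a module by the value of its __name__. The function name describe() is made up for this illustration, and the standard json module is used only as a convenient example of an imported library.

```python
import json  # any library module would do here


def describe(module_name):
    """Report how a module with this __name__ is being used."""
    if module_name == "__main__":
        return "main program"
    return "imported as " + module_name


# The first result depends on whether this file is run directly or imported.
print(describe(__name__))
# An imported module always carries its own name.
print(describe(json.__name__))
```

Running the file directly and then importing it from another script should show the two different answers for the same source file.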

A main program script generally looks like the following.

#!/usr/bin/env python
"""Module docstring"""

import someModule

def main():
    pass  # the real work goes here

if __name__ == "__main__":
    main()

Tip

Debugging the Main Program Switch

There are two sides to the main program switch. When a module is executed from the command line, you want it to do useful things. When a module is imported by another module, you want it to provide definitions, but not actually do anything.

Command-Line Behavior. If you get a NameError, you misspelled __name__. If, on the other hand, nothing seems to happen, then you may have misspelled "__main__".

Another common problem is providing all of the class and function definitions, but omitting the main script entirely. The class and def statements all execute silently. If there’s no main script to create the objects and call the functions, then nothing will happen.

Import Behavior. If things happen when you import a module, it’s missing the main program switch. When a module is evolving from main program to library that is used by a new main program, we sometimes leave the old main program in place.

The best way to handle the change from main program to library is to put the old main program into a function with a name like main(), and then put in the simple main program switch that calls this main() function when the module name is "__main__".
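Here is a small sketch of that change, assuming a hypothetical temperature-conversion module. The names fahrenheit_to_celsius() and the sample values are invented for illustration; the point is only that the old top-level statements now live inside main().

```python
def fahrenheit_to_celsius(f):
    """A definition that other modules can import and reuse."""
    return (f - 32) * 5.0 / 9.0


def main():
    """The old top-level script, now wrapped in a function."""
    for f in (32, 212):
        print("{0:d} F is {1:.1f} C".format(f, fahrenheit_to_celsius(f)))


# Runs only when this file is the main program, not when it is imported.
if __name__ == "__main__":
    main()
```

Importing this module now provides fahrenheit_to_celsius() silently, while running it from the command line still prints the two conversions.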

The Standard Command-Line Interface

The glitzy desktop applications from big-name companies like Apple and Microsoft are the most visible parts of our computer system. Many programs, however, have minimal user interaction. They are run from a command-line prompt, perform their function, and exit gracefully.

Almost all of the core GNU/Linux utilities ( cp, rm, mv, ln, ls, df, du, etc.) are programs that decode command-line parameters, perform their processing function and return a status code. Except for some explicitly interactive programs like editors ( ex, vi, emacs, etc.), the core elements of GNU/Linux are command-line programs that lack a glitzy GUI.

In a way, we do interact with programs like ls (Windows dir). When we run the commands from the command prompt, we provide options and operands (or “arguments”). The options begin with - (Windows uses /). The operands are not decorated with punctuation; usually they are file names, but could be permissions or user names.

For example, we might do an ls -s /usr, which provides an option of -s and an argument of /usr. (For Windows, an example is dir /o:s "C:\Documents and Settings", which has an option of /o:s and an argument of "C:\Documents and Settings".)

When the program runs, we see two kinds of output, usually intermixed into one stream. We see the output plus any error messages. We can use some redirection operators like > to capture the output and send it to a file. We can use 2> to capture the errors and send them to a file.

This redirection is beyond the scope of this book, but is covered in all of the books on GNU/Linux programming.

Command-Line Interface (CLI) programs. There are two critical features that make a CLI program well-behaved. First, the program should accept parameters (options and arguments) in a standard manner. Second, the program should generally limit output to the standard output and standard error files created by the operating system. When any other files are written it must be by user request and possibly require interactive confirmation. It’s bad behavior to silently overwrite a file.
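The second feature can be sketched as follows: results go to standard output and complaints go to standard error, so the user can redirect each one separately. The function report() and its out and err parameters are assumptions made for this example, not part of any library.

```python
import sys


def report(values, out=None, err=None):
    """Write the total to standard output and problems to standard error."""
    if out is None:
        out = sys.stdout
    if err is None:
        err = sys.stderr
    total = 0.0
    for v in values:
        try:
            total += float(v)
        except ValueError:
            # Error messages stay out of the normal output stream.
            err.write("skipping bad value: {0!r}\n".format(v))
    out.write("total = {0:.2f}\n".format(total))
    return total


report(["1.5", "oops", "2.5"])
```

Passing the streams as parameters is one possible design choice; it keeps the function easy to exercise with in-memory files.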

The standard handling of command-line parameters is given as 13 rules for UNIX commands, as shown in the intro section of UNIX man pages. These rules describe the program name (rules 1-2), simple options (rules 3-5), options that take argument values (rules 6-8), and the ordering of options and operands (rules 9-13) for the program.

  1. The program name should be between two and nine characters. This is consistent with most file systems where the program name is a file name. In the Python environment, the program file is typically the program name plus an extension of .py. Example: python, idle.py.
  2. The program name should include only lower-case letters and digits. The objective is to keep names relatively simple and easy to type correctly. Mixed-case names and names with punctuation marks can introduce difficulties in typing the program name correctly.
  3. Option names should be one character long. This is difficult to achieve in complex programs. Often, options have two forms: a single-character short form and a multi-character long form. Example: ls -a, rm -i *.pyc.
  4. Single-character options are preceded by -. Multiple-character options are preceded by --. All options have a flag that indicates that this is an option, not an operand. Single character options, again, are easier to type, but may be hard to remember for new users of a program.
  5. Options with no arguments may be grouped after a single -. This allows a series of one-character options to be given in a simple cluster. Example ls -ldai clusters the -l, -d, -a and -i options.
  6. Options that accept an argument value use a space separator. The option arguments are not run together with the option. Without this rule, it might be difficult to tell an option cluster from an option with arguments. Example: in cut -d s, the value s is the argument for the -d option.
  7. The argument value to an option cannot be optional. If an option requires an argument value, presence of the option means that an argument value will follow. The option is already optional; having an optional argument doesn’t make much sense.
  8. Groups of option-arguments following an option must be a single word; either separated by commas or quoted. A space would mean another option or the beginning of the operands. Example: -d "9,10,56": three numbers separated by commas form the argument value for the -d option.
  9. All options must precede any operands on the command line. This basic principle assures a simple, easy to understand uniformity to command processing.
  10. The string -- may be used to indicate the end of the options. This is particularly important when any of the operands begin with - and might be mistaken for an option.
  11. The order of the options relative to one another should not matter. Generally, a program should absorb all of the options to set up the processing. Example: ls -l -a is the same as ls -a -l and ls -la.
  12. The relative order of the operands may be significant. This depends on what the operands mean and what the program does. The operands are often file names, and the order in which the files are processed may be significant.
  13. The operand - preceded and followed by a space character should only be used to mean standard input. This may be passed as an operand, to indicate that the standard input file is processed at this time. Example, cat file1 - file2 will process file1, standard input and file2 in that order.

Parsing Command-Line Options. These rules are handled by the getopt module, the optparse module and the sys.argv variable in the sys module.
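As a quick illustration of several of the rules, here is a sketch using the getopt module on a hand-built argument list. The option letters and the operand -file are invented for the example. The string "lad:" declares -l and -a as simple options and -d as an option that takes an argument value.

```python
import getopt

# A cluster (-la), an option with a value (-d 9,10,56), the end-of-options
# marker (--), and an operand that happens to begin with a hyphen.
argv = ["-la", "-d", "9,10,56", "--", "-file"]

options, operands = getopt.getopt(argv, "lad:")
print(options)   # [('-l', ''), ('-a', ''), ('-d', '9,10,56')]
print(operands)  # ['-file']
```

The cluster is split into separate options (rule 5), the comma-separated group stays a single word (rule 8), and -- keeps -file from being mistaken for an option (rule 10).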

Important

But Wait! This is fine for GNU/Linux, but what about Windows?

Windows programmers have several choices. The most common solution is to use the UNIX rules. They are compatible with Windows, simple and – most important – standardized by POSIX. This means that your program will use the - character for options, where the Microsoft-supplied programs will use /. How often do you use the Microsoft-supplied programs?

Another choice is to extend the getopt or optparse modules to handle Windows punctuation rules. This would allow you to seamlessly fit with the Microsoft command-line programs.

And, of course, you can always write your own option parser that looks for arguments which begin with /.
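A home-grown parser of that kind might look like the following sketch. The function split_windows_args() is hypothetical, and it assumes a simple /name or /name:value convention; real Windows commands vary in their punctuation.

```python
def split_windows_args(argv):
    """Split arguments into /options and plain operands (hypothetical helper)."""
    options, operands = {}, []
    for arg in argv:
        if arg.startswith("/") and len(arg) > 1:
            # "/o:s" becomes name "o" with value "s";
            # "/W" becomes name "w" with the value True.
            name, _, value = arg[1:].partition(":")
            options[name.lower()] = value if value else True
        else:
            operands.append(arg)
    return options, operands


options, operands = split_windows_args(["/o:s", "/W", r"C:\Documents and Settings"])
print(options)
print(operands)
```

Note that this approach would misread a GNU/Linux path such as /usr as an option, which is one reason the UNIX rules are the more portable choice.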

The command line arguments used to start Python are put into the sys.argv variable of the sys module as a sequence of strings.

For example, when we run something like

python casinosim.py -g craps

The operating system (Linux or Windows) sees the python command and runs the Python interpreter, passing the remaining arguments to the Python interpreter as a list of strings: ["casinosim.py", "-g", "craps"].

The first operand to the Python interpreter is always the top-level script to run. Python sets __name__ to "__main__" and executes the file, casinosim.py. The whole list is placed into sys.argv: the script name is sys.argv[0], and the remaining argument values follow in sys.argv[1:].
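A small sketch makes the layout of this list concrete. The helper split_argv() is a name invented for the example; it is applied here to a literal list so that the result does not depend on how the example itself was started.

```python
import sys


def split_argv(argv):
    """Separate the script name from the arguments that follow it."""
    return argv[0], argv[1:]


script, arguments = split_argv(["casinosim.py", "-g", "craps"])
print(script)     # casinosim.py
print(arguments)  # ['-g', 'craps']

# In a real program we would pass sys.argv itself: split_argv(sys.argv)
```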

Overview of optparse. First, of course, we have to think about our main program and how we want to use it. Once we’ve figured out the arguments and options, we can then use optparse to transform the arguments in sys.argv into options and arguments our program can use.

The optparse module parses the command-line options in a three-step process.

  1. Create an empty parser.
  2. Define the options that this parser will handle.
  3. Parse the arguments. This gives you a tuple with two objects. One object has the options as attributes. The other object is a list of the arguments that followed the options.

Once we have the options and arguments, we can then do the real work of our program.

Parameter Parsing. Let’s say we polished up some of our exercises to create a complete program with the following synopsis.

portfolio.py  -v  -h  -d  mm/dd/yy  -s symbol file

This program has the following options

-v

Verbosity. This can be repeated to increase the detail of the logging.

-h

Help. Provides a summary of portfolio.py.

-d mm/dd/yy

A particular sale date at which to evaluate the portfolio.

-s symbol

A particular symbol to select from the portfolio.

file

The name of a file with the portfolio data in CSV format.

These options can be processed as follows:

import optparse

parser= optparse.OptionParser()
# -h and --help are added automatically by default
parser.add_option( "-v", action="count", dest="verbosity" )
parser.add_option( "-d", action="store", dest="date" )
parser.add_option( "-s", action="store", dest="symbol" )
options, filenames = parser.parse_args()

# options.verbosity is the count of -v options (None if -v is absent)
# options.date is a string that must be further parsed
# options.symbol is a symbol string
# filenames is a list of files to process

Often, this option processing is packaged into a function called main().
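One way to package this, sketched below, is a function that accepts the argument list as a parameter and finishes the job of converting the date string. The function name parse_portfolio_args(), the usage text, and the sample values AAPL and portfolio.csv are assumptions for illustration only.

```python
import datetime
import optparse


def parse_portfolio_args(argv=None):
    """Parse portfolio.py options; argv=None means use sys.argv[1:]."""
    parser = optparse.OptionParser(
        usage="%prog [-v] [-d mm/dd/yy] [-s symbol] file")
    parser.add_option("-v", action="count", dest="verbosity", default=0)
    parser.add_option("-d", action="store", dest="date")
    parser.add_option("-s", action="store", dest="symbol")
    options, filenames = parser.parse_args(argv)
    if options.date is not None:
        # Convert the mm/dd/yy string into a date object.
        options.date = datetime.datetime.strptime(
            options.date, "%m/%d/%y").date()
    return options, filenames


options, filenames = parse_portfolio_args(
    ["-v", "-v", "-d", "02/14/11", "-s", "AAPL", "portfolio.csv"])
print(options.verbosity, options.date, options.symbol, filenames)
```

Accepting the argument list as a parameter is a design choice that lets the parsing be tried out from the interactive prompt without touching sys.argv.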

Formal Definitions. Here are some formal definitions for parts of optparse.

optparse.OptionParser(...) → OptionParser

Create a parser with the default option of -h and --help that provides help on the command.

You can also override the program name, version number, usage text and description that optparse will deduce from the context in which it’s run.

You can provide the argument value of add_help_option=False to suppress creating the -h and --help options.

If you provide version="someString", this will automatically add a --version option that displays the version number.

class optparse.OptionParser
parser.add_option(option_string, action, ...)

Add an option to the parser. You can provide any combination of short or long option strings of the form "-o" or "--option". You must provide at least one; you can provide both.

The keyword parameter action is essential for determining what is to be done with that option.

Most options have a dest, which names the destination attribute of the options object that gets created.

You’ll define the option with a collection of keyword arguments. There are a number of common cases.

  • Positive Flags. add_option( "-f", "--flag", action="store_true", dest="flag", default=False ) In this case options.flag will be created and set to True when -f is present.
  • Negative Flags. add_option( "-f", "--flag", action="store_false", dest="flag", default=True ) In this case options.flag will be created and set to False when -f is present.
  • Options with String Values. add_option( "-s", "--string", action="store", dest="string", type="string") In this case options.string will be created and set to the value of the -s option.
  • Options with Numeric Values. add_option( "-i", "--int", action="store", dest="int", type="int") In this case options.int will be created and set to the integer value of the -i option.
  • Options that are Objects. add_option( "-c", "--command", action="store_const", const=SomeObject, dest="command") In this case options.command will be created and set to the value SomeObject.
  • Options that are Counted. add_option( "-v", "--verbose", action="count", dest="verbosity") In this case options.verbosity will be created and set to the number of -v options present.

parser.parse_args() → options, arguments

Parse the options and arguments in sys.argv[1:], creating an options object with all of the options and an arguments list with all of the argument strings.
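The common cases can be tried together in one short sketch. The option letters, the dest name chatty, the constant "craps", and the file name data.csv are all invented for this example; the argument list is passed to parse_args() explicitly so the result does not depend on the real command line.

```python
import optparse

parser = optparse.OptionParser()
parser.add_option("-f", "--flag", action="store_true", dest="flag", default=False)
parser.add_option("-q", "--quiet", action="store_false", dest="chatty", default=True)
parser.add_option("-s", "--string", action="store", type="string", dest="string")
parser.add_option("-i", "--int", action="store", type="int", dest="int")
parser.add_option("-c", "--command", action="store_const", const="craps", dest="command")
parser.add_option("-v", "--verbose", action="count", dest="verbosity", default=0)

# Short options, a long option with =, and a cluster of two -v options.
options, arguments = parser.parse_args(
    ["-f", "-q", "-s", "abc", "--int=42", "-c", "-vv", "data.csv"])

print(options.flag, options.chatty, options.string)
print(options.int, options.command, options.verbosity)
print(arguments)
```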

An Example Program

Let’s look at a simple, but complete program file. The program simulates several dice throws. We’ve decided that the command-line synopsis should be:

dicesim.py  -v  -s samples

The -v option leads to verbose output, where every individual toss of the dice is shown. Without the -v option, only the summary statistics are shown. The -s option tells how many samples to create. If this is omitted, 100 samples are used.

Here is the entire file. This program has a five-part design pattern that we’ve grouped into three sections.

dicesim.py

#!/usr/bin/env python
"""dicesim.py

Synopsis:
    dicesim.py [-v] [-s samples]
-v is for verbose output (show each sample)
-s is the number of samples (default 100)
"""

from __future__ import print_function, division
import dice
import optparse

def dicesim( samples=100, verbose=0 ):
    d= dice.Dice()
    t= 0
    for s in range(samples):
        n= d.roll()
        if verbose: print(n)
        t += n
    print("{0:d} samples, average is {1:f}".format( samples, t/samples ))

def main():
    parser= optparse.OptionParser()
    parser.add_option( "-v", "--verbose", action="count", dest="verbosity" )
    parser.add_option( "-s", "--samples", action="store", type="int", dest="samples" )
    parser.set_defaults( verbosity=0, samples=100 )
    options, args = parser.parse_args()
    dicesim( options.samples, options.verbosity )

if __name__ == "__main__":
    main()

  1. Docstring. The docstring provides the synopsis of the program, plus any other relevant documentation. This should be reasonably complete. Each element of the documentation is separated by blank lines. Several standard document extract utilities expect this kind of formatting.
  2. Imports. The imports line lists the other modules on which this program depends. Each of these modules might have the main-import switch and a separate main program. Our objective is to reuse the imported classes and functions, not the main function.
  3. Actual Processing. This is the actual heart of the program. It is a plain function with no dependencies on a particular operating system. It can be imported by some other program and reused.
  4. Argument Decoding in Main. This is the interface between the operating system that initiates this program and the actual work in dicesim(). This does not have much reuse potential.
  5. Main Import Switch. This makes the determination if this is a main program or an import. If it is an import, then __name__ is not "__main__", and no additional processing happens beyond the definitions. If it is the main program, then __name__ is "__main__"; the arguments are parsed by the function main(), which calls dicesim() to do the real work.

This is a typical layout for a complete Python main program. We strive for two objectives. First, keep the main() program focused; second, provide as many opportunities for reuse as possible.
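To experiment with this layout without the dice module at hand, the sketch below substitutes a stand-in Dice class. The stand-in, the average_roll() function, and the argv parameter on main() are all assumptions for illustration; the dice module used by dicesim.py may differ. The main program switch is omitted here and would be unchanged.

```python
import optparse
import random


class Dice(object):
    """Hypothetical stand-in for dice.Dice: a pair of six-sided dice."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)

    def roll(self):
        return self.rng.randint(1, 6) + self.rng.randint(1, 6)


def average_roll(samples=100, dice=None):
    """The reusable processing: average of a number of dice throws."""
    d = dice if dice is not None else Dice()
    total = sum(d.roll() for _ in range(samples))
    return float(total) / samples


def main(argv=None):
    """Argument decoding; argv=None means use sys.argv[1:]."""
    parser = optparse.OptionParser()
    parser.add_option("-s", "--samples", action="store", type="int",
                      dest="samples", default=100)
    options, args = parser.parse_args(argv)
    average = average_roll(options.samples)
    print("{0:d} samples, average is {1:f}".format(options.samples, average))
    return average
```

A call such as main(["-s", "50"]) exercises the argument decoding, while another program can import average_roll() and skip main() entirely.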

Main Program Exercises

  1. Create Programs.

    Refer back to exercises in Arithmetic and Expressions. See sections Expression Exercises, Condition Exercises, For Statement Exercises, While Statement Exercises, Function Exercises. Modify these scripts to be stand-alone programs. In particular, they should get their input via optparse from the command line instead of raw_input() or another mechanism.

  2. Larger Programs.

    Refer back to exercises in Basic Sequential Collections of Data. See sections String Exercises, Tuple Exercises, List Exercises, Dictionary Exercises, Exception Exercises. Modify these scripts to be stand-alone programs. In many cases, these programs will need input from files. The file names should be taken from the command line using optparse.

  3. Object-Oriented Programs.

    Refer back to exercises in Class Definition Exercises. Modify these scripts to be stand-alone programs.

BTW – The exec Statement

The import statement, in effect, executes the module file. Typically, a library-oriented module is a simple sequence of definitions. The import statement executes all of those definitions. It also creates a module object. Different variations on import add to this by introducing different names into the global namespace.

Further, Python also optimizes the modules brought in by the import statement so that they are only imported once.

The exec statement is similar to import, except it does not create a module object. Consequently, it doesn’t do any optimization to execute a module file just once.

The exec statement executes a suite of Python statements.

exec  expression

The expression can be an open file (created with the open() function), a string value which contains Python language statements, as well as a code object created by the compile() function.

Additionally, this form of the exec statement executes in a given namespace.

exec  expression in namespace

The namespace is a dictionary that will be used for any global variables created by the statements executed.

>>> code="""a= 3
... b= 5
... c= a*b
... """
>>> a
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'a' is not defined
>>> results = {}
>>> exec code in results
>>> results['a']
3
>>> results['c']
15
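In Python 3, exec is a built-in function rather than a statement, and Python 2 also accepts the call form exec(code, namespace). A minimal sketch of the same example in that form follows; the variable names mirror the session above.

```python
code = """a = 3
b = 5
c = a * b
"""

# The dictionary receives every global name the executed statements create.
results = {}
exec(code, results)

print(results["a"])  # 3
print(results["c"])  # 15
```

Python also places a __builtins__ entry in the dictionary, so results holds more than the three variables created by the code.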

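The functions eval() and execfile() do similar things: eval() evaluates a single expression and returns its value, while execfile() executes the statements in a named file.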

Warning

These are potentially dangerous tools. These break something we call the Fundamental Assumption: the source you are reading is the source that is being executed. A program that uses the exec statement or eval() function is incorporating other source statements into the program dynamically. This can be hard to follow, maintain or enhance.

Generally, the exec statement is something that should be avoided. There are almost always more suitable solutions that involve extensible class design patterns.
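As one sketch of such an alternative, a program can map names given on the command line to classes, instead of executing text supplied at run time. The Game hierarchy, the GAMES table, and make_game() below are hypothetical names invented for this illustration.

```python
class Game(object):
    """Base class; each subclass adds one new behavior."""
    name = "game"

    def play(self):
        return "playing " + self.name


class Craps(Game):
    name = "craps"


class Roulette(Game):
    name = "roulette"


# Adding a new game means adding a class and one table entry;
# all of the source that can run is visible in the program itself.
GAMES = {"craps": Craps, "roulette": Roulette}


def make_game(name):
    """Look up a class by name and build an instance of it."""
    return GAMES[name]()


print(make_game("craps").play())
```

A name that is not in the table raises KeyError, which is easier to diagnose than an arbitrary failure inside dynamically executed code.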
