O'Reilly Media | Learning Python

Learning Python, Second Edition

By Mark Lutz, David Ascher

Chapter 1: A Python Q&A Session

Content preview·Buy reprint rights for this chapter

If you've bought this book, you may already know what Python is, and why it's an important tool to learn. If not, you probably won't be sold on Python until you've learned the language by reading the rest of this book and have done a project or two. But before jumping into details, the first few pages briefly introduce some of the main reasons behind Python's popularity. To begin sculpting a definition of Python, this chapter takes the form of a question and answer session, which poses some of the most common non-technical questions asked by beginners.

Why Do People Use Python?

Because there are many programming languages available today, this is the usual first question of newcomers. Given the hundreds of thousands of Python users out there today, there really is no way to answer this question with complete accuracy. The choice of development tools is sometimes based on unique constraints or personal preference.

But after teaching Python to roughly one thousand students and almost 100 companies in recent years, some common themes have emerged. The primary factors cited by Python users seem to be these:

Software quality: For many, Python's focus on readability, coherence, and software quality in general sets it apart from "kitchen sink" style languages like Perl. Python code is designed to be readable, and hence maintainable, much more so than traditional scripting languages. In addition, Python has deep support for software reuse mechanisms such as object-oriented programming (OOP).
Developer productivity: Python boosts developer productivity many times beyond compiled or statically typed languages such as C, C++, and Java. Python code is typically 1/3 to 1/5 the size of equivalent C++ or Java code. That means there is less to type, less to debug, and less to maintain after the fact. Python programs also run immediately, without the lengthy compile and link steps of some other tools.
Program portability: Most Python programs run unchanged on all major computer platforms. Porting Python code between Unix and Windows, for example, is usually just a matter of copying a script's code between machines. Moreover, Python offers multiple options for coding portable graphical user interfaces.
Support libraries: Python comes with a large collection of prebuilt and portable functionality, known as the standard library. This library supports an array of application-level programming tasks, from text pattern matching to network scripting. In addition, Python can be extended with both home-grown libraries and a vast collection of third-party application support software.
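
As a small illustration of the text pattern matching that the standard library supports, the following sketch uses the re module to pull version-like numbers out of a string. The sample text and variable names are made up for this example, and the code assumes a current Python 3 interpreter rather than the 2.x releases this book covers.

```python
import re

# Hypothetical sample text; any string would do.
text = "Python 2.2 shipped in 2001; Python 2.3 followed in 2003."

# \d+\.\d+ matches one or more digits, a literal dot, and more digits.
versions = re.findall(r"\d+\.\d+", text)

print(versions)
```

The pattern skips the bare years because they have no digits after a dot.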

Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!

Is Python a Scripting Language?

Python is a general purpose programming language that is often applied in scripting roles. It is commonly defined as an object-oriented scripting language—a definition that blends support for OOP with an overall orientation toward scripting roles. In fact, people often use the word "script" instead of "program" to describe a Python code file. In this book, the terms "script" and "program" are used interchangeably, with a slight preference for "script" to describe a simpler top-level file and "program" to refer to a more sophisticated multifile application.

Because the term "scripting" has so many different meanings to different observers, some would prefer that it not be applied to Python at all. In fact, people tend to think of three very different definitions when they hear Python labeled a "scripting" language, some of which are more useful than others:

Shell tools: Tools for coding operating system-oriented scripts. Such programs are often launched from console command-lines, and perform tasks such as processing text files and launching other programs. Python programs can serve such roles, but this is just one of dozens of common Python application domains. It is not just a better shell script language.
Control language: A "glue" layer used to control and direct (i.e., script) other application components. Python programs are indeed often deployed in the context of a larger application. For instance, to test hardware devices, Python programs may call out to components that give low-level access to a device. Similarly, programs may run bits of Python code at strategic points, to support end-user product customization, without having to ship and recompile the entire system's source code. Python's simplicity makes it a naturally flexible control tool. Technically, though, this is also just a common Python role; many Python programmers code standalone scripts, without ever using or knowing about any integrated components.
Ease of use

Okay, But What's the Downside?

Perhaps the only downside to Python is that, as currently implemented, its execution speed may not always be as fast as compiled languages such as C and C++.

We'll talk about implementation concepts later in this book. But in short, the standard implementations of Python today compile (i.e., translate) source code statements to an intermediate format known as byte code, and then interpret the byte code. Byte code provides portability, as it is a platform-independent format. However, because Python is not compiled all the way down to binary machine code (e.g., instructions for an Intel chip), some programs will run more slowly in Python than in a fully compiled language like C.
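
To make the compile-then-interpret idea concrete, here is a minimal sketch that asks the interpreter to compile an expression to byte code and then run it. It assumes a current Python 3 interpreter (not the 2.x releases described here), and the names source, code_obj, raw, and result are ours. The exact byte code instructions differ from one Python version to the next, so treat the listing it prints as illustrative only.

```python
import dis

source = "2 ** 100"

# compile() translates source text into a code object that holds byte code.
code_obj = compile(source, "<example>", "eval")

# The raw byte code is a bytes string; its contents vary by Python version.
raw = code_obj.co_code

# eval() hands the compiled code object to the interpreter to be run.
result = eval(code_obj)
print(result)

# dis.dis() prints a human-readable listing of the byte code instructions.
dis.dis(code_obj)
```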

Whether you will ever care about the execution speed difference depends on what kinds of programs you write. Python has been optimized numerous times, and Python code runs fast enough by itself in most application domains. Furthermore, whenever you do something "real" in a Python script, like process a file or construct a GUI, your program is actually running at C speed since such tasks are immediately dispatched to compiled C code inside the Python interpreter. More fundamentally, Python's speed-of-development gain is often far more important than any speed-of-execution loss, especially given modern computer speeds.
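
One way to see the "C speed" point for yourself is to time a hand-written loop against a built-in that does the same job. The sketch below assumes the standard CPython implementation on Python 3, where the built-in sum is coded in C; the function name manual_sum and the list size are arbitrary choices for this example. Timings depend on your machine and Python version, so no particular ratio is promised.

```python
import timeit

numbers = list(range(100_000))

def manual_sum(values):
    """Add the values with an explicit Python-level loop."""
    total = 0
    for value in values:
        total += value
    return total

# Both approaches compute the same answer.
loop_total = manual_sum(numbers)
builtin_total = sum(numbers)

# Time each approach over 20 runs; absolute numbers will vary.
loop_time = timeit.timeit(lambda: manual_sum(numbers), number=20)
builtin_time = timeit.timeit(lambda: sum(numbers), number=20)
print(f"loop: {loop_time:.4f}s  built-in sum: {builtin_time:.4f}s")
```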

Even at today's CPU speeds there still are some domains that do require optimal execution speed. Numeric programming and animation, for example, often need at least their core number-crunching components to run at C speed (or better). If you work in such a domain, you can still use Python: simply split off the parts of the application that require optimal speed into compiled extensions, and link those into your system for use in Python scripts.

We won't talk about extensions much in this text, but this is really just an instance of the Python-as-control-language role that we discussed earlier. A prime example of this dual language strategy is the NumPy numeric programming extension for Python; by combining compiled and optimized numeric extension libraries with the Python language, NumPy turns Python into a numeric programming tool that is both efficient and easy to use. You may never need to code such extensions in your own Python work, but they provide a powerful optimization mechanism if you ever do.

Who Uses Python Today?

At this writing, in 2003, the best estimate anyone can seem to make of the size of the Python user base is that there are between 500,000 and 1 million Python users around the world today (plus or minus a few). This estimate is based on various statistics like downloads and comparative newsgroup traffic. Because Python is open source, a more exact count is difficult—there are no license registrations to tally. Moreover, Python is automatically included with Linux distributions and some products and computer hardware, further clouding the user base picture. In general, though, Python enjoys a large user base, and a very active developer community. Because Python has been around for over a decade and has been widely used, it is also very stable and robust.

Besides individual users, Python is also being applied in real revenue-generating products, by real companies. For instance, Google and Yahoo! currently use Python in Internet services; Hewlett-Packard, Seagate, and IBM use Python for hardware testing; Industrial Light and Magic and other companies use Python in the production of movie animation; and so on. Probably the only common thread behind companies using Python today is that Python is used all over the map, in terms of application domains. Its general purpose nature makes it applicable to almost all fields, not just one. For more details on companies using Python today, see Python's web site at https://www.python.org.

What Can I Do with Python?

Besides being a well-designed programming language, Python is also useful for accomplishing real world tasks—the sorts of things developers do day in and day out. It's commonly used in a variety of domains, as a tool for both scripting other components and implementing standalone programs. In fact, as a general purpose language, Python's roles are virtually unlimited.

However, the most common Python roles today seem to fall into a few broad categories. The next few sections describe some of Python's most common applications today, as well as tools used in each domain. We won't be able to describe all the tools mentioned here; if you are interested in any of these topics, see Python online or other resources for more details.

Python's built-in interfaces to operating-system services make it ideal for writing portable, maintainable system-administration tools and utilities (sometimes called shell tools). Python programs can search files and directory trees, launch other programs, do parallel processing with processes and threads, and so on.

Python's standard library comes with POSIX bindings, and support for all the usual OS tools: environment variables, files, sockets, pipes, processes, multiple threads, regular expression pattern matching, command-line arguments, standard stream interfaces, shell-command launchers, filename expansion, and more. In addition, the bulk of Python's system interfaces are designed to be portable; for example, a script that copies directory trees typically runs unchanged on all major Python platforms.
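
The kinds of system tools listed above can be sketched in a few lines. The example below is an illustration only, written for a current Python 3 interpreter: find_files is a hypothetical helper of our own, and the modules it relies on (os, fnmatch, subprocess) are the present-day standard library interfaces, some of which are newer than the release this book describes.

```python
import fnmatch
import os
import subprocess
import sys

def find_files(root, pattern):
    """Return paths under root whose file names match a shell-style pattern."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if fnmatch.fnmatch(name, pattern):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)

# Environment variables are exposed as a dictionary-like object.
search_path = os.environ.get("PATH", "")

# Launch another program (here, a second Python interpreter) and capture
# what it writes to its standard output stream.
completed = subprocess.run(
    [sys.executable, "-c", "print('spawned')"],
    capture_output=True, text=True, check=True,
)
child_output = completed.stdout.strip()
print(child_output)
```

A call such as find_files(os.curdir, "*.py") would list the Python files in and below the current directory, and because os.path.join picks the right separator, the same code should run on both Unix and Windows.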

Python's simplicity and rapid turnaround also make it a good match for GUI (graphical user interface) programming. Python comes with a standard object-oriented interface to the Tk GUI API called Tkinter, which allows Python programs to implement portable GUIs with native look and feel. Python/Tkinter GUIs run unchanged on MS Windows, X Windows (on Unix and Linux), and Macs. A free extension package, PMW, adds advanced widgets to the base Tkinter toolkit. In addition, the wxPython GUI API, based on a C++ library, offers an alternative toolkit for constructing portable GUIs in Python.

What Are Python's Technical Strengths?

Naturally, this is a developer's question. If you don't already have a programming background, the words in the next few sections may be a bit baffling; don't worry, we'll explain all of these in more detail as we proceed through this book. For developers, though, here is a quick introduction to some of Python's top technical features.

Python is an object-oriented language, from the ground up. Its class model supports advanced notions such as polymorphism, operator overloading, and multiple inheritance; yet in the context of Python's simple syntax and typing, OOP is remarkably easy to apply. In fact, if you don't understand these terms, you'll find they are much easier to learn with Python than with just about any other OOP language available.

Besides serving as a powerful code structuring and reuse device, Python's OOP nature makes it ideal as a scripting tool for object-oriented systems languages such as C++ and Java. For example, with the appropriate glue code, Python programs can subclass (specialize) classes implemented in C++ or Java. Of equal significance, OOP is an option in Python; you can go far without having to become an object guru all at once.
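
For readers who would like to see the three OOP terms in code, here is a minimal sketch. The class names (Vector, Walker, Swimmer, Duck) are invented for this illustration, and the syntax is that of a current Python 3 interpreter rather than the 2.x releases covered in this book.

```python
class Vector:
    """Operator overloading: the + operator is routed to __add__."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        return Vector(self.x + other.x, self.y + other.y)

    def __repr__(self):
        return f"Vector({self.x}, {self.y})"


class Walker:
    def move(self):
        return "walks"


class Swimmer:
    def move(self):
        return "swims"

    def dive(self):
        return "dives"


class Duck(Walker, Swimmer):
    """Multiple inheritance: Duck gets methods from both base classes."""


# Polymorphism: the same call works on any object that has a move() method.
moves = [animal.move() for animal in (Walker(), Swimmer(), Duck())]
total = Vector(1, 2) + Vector(3, 4)
print(moves, total, Duck().dive())
```

Duck finds move in Walker first because Walker is listed first among its base classes, while dive comes from Swimmer.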

Python is free. Just like other open source software, such as Tcl, Perl, Linux, and Apache, you can get the entire Python system for free on the Internet. There are no restrictions on copying it, embedding it in your systems, or shipping it with your products. In fact, you can even sell Python's source code, if you are so inclined.

But don't get the wrong idea: "free" doesn't mean "unsupported." On the contrary, the Python online community responds to user queries with a speed that most commercial software vendors would do well to notice. Moreover, because Python comes with complete source code, it empowers developers, and creates a large team of implementation experts. Although studying or changing a programming language's implementation isn't everyone's idea of fun, it's comforting to know that it's available as a final resort and ultimate documentation source. You're not dependent on a commercial vendor.

How Does Python Stack Up to Language X?

Finally, in terms of what you may already know, people sometimes compare Python to languages such as Perl, Tcl, and Java. We talked about performance earlier, so here the focus is on functionality. While other languages are also useful tools to know and use, we think that Python:

Is more powerful than Tcl. Python's support for "programming in the large" makes it applicable to larger systems development.
Has a cleaner syntax and simpler design than Perl, which makes it more readable and maintainable, and helps reduce program bugs.
Is simpler and easier to use than Java. Python is a scripting language, but Java inherits much of the complexity of systems languages such as C++.
Is simpler and easier to use than C++, but often doesn't compete with C++ either; as a scripting language, Python often serves different roles.
Is both more powerful and more cross-platform than Visual Basic. Its open source nature also means it is not controlled by a single company.
Has the dynamic flavor of languages like Smalltalk and Lisp, but also has a simple, traditional syntax accessible to developers and end users.

Especially for programs that do more than scan text files, and that might have to be read in the future by others (or by you!), we think Python fits the bill better than any other scripting language available today. Furthermore, unless your application requires peak performance, Python is often a viable alternative to systems development languages such as C, C++, and Java; Python code will be much less to write, debug, and maintain.

Of course, both of the authors are card-carrying Python evangelists, so take these comments as you may. They do, however, reflect the common experience of many developers who have taken time to explore what Python has to offer.

And that concludes the hype portion of this book. The best way to judge a language is to see it in action, so the next two chapters turn to a strictly technical introduction to the language. There, we explore ways to run Python programs, peek at Python's byte code execution model, and introduce the basics of module files for saving your code. Our goal will be to give you just enough information to run the examples and exercises in the rest of the book. As mentioned earlier, you won't really start programming until Chapter 4, but make sure you have a handle on the startup details before moving on.

Chapter 2: How Python Runs Programs

This chapter and the next give a quick look at program execution—how you launch code, and how Python runs it. In this chapter, we explain the Python interpreter. Chapter 3 will show you how to get your own programs up and running.

Startup details are inherently platform-specific, and some of the material in this chapter may not apply to the platform you work on, so you should feel free to skip parts not relevant to your intended use. In fact, more advanced readers who have used similar tools in the past, and prefer to get to the meat of the language quickly, may want to file some of this chapter away for future reference. For the rest of you, let's learn how to run some code.

Introducing the Python Interpreter

So far, we've mostly been talking about Python as a programming language. But as currently implemented, it's also a software package called an interpreter. An interpreter is a kind of program that executes other programs. When you write Python programs, the Python interpreter reads your program, and carries out the instructions it contains. In effect, the interpreter is a layer of software logic between your code and the computer hardware on your machine.

When the Python package is installed on your machine, it generates a number of components: minimally, an interpreter and a support library. Depending on how you use it, the Python interpreter may take the form of an executable program, or a set of libraries linked into another program. Depending on which flavor of Python you run, the interpreter itself may be implemented as a C program, a set of Java classes, or other. Whatever form it takes, the Python code you write must always be run by this interpreter. And to do that, you must first install a Python interpreter on your computer.

Python installation details vary per platform, and are covered in depth in Appendix A. In short:

Windows users fetch and run a self-installing executable file, which puts Python on their machine. Simply double-click and say Yes or Next at all prompts.
Linux and Unix users typically either install Python from RPM files, or compile it from its full source-code distribution package.
Other platforms have installation techniques relevant to that platform. For instance, files are synched on PalmPilots.

Python itself may be fetched from the downloads page at Python's web site, www.python.org. It may also be found through various other distribution channels. You may have Python already available on your machine, especially on Linux and Unix. If you're working on Windows, you'll usually find Python in the Start menu, as captured in Figure 2-1 (we'll learn what these menu items mean in a moment). On Unix and Linux, Python probably lives in your /usr directory tree.

Figure 2-1: Python on the Windows Start menu

Program Execution

What it means to write and run a Python script depends on whether you look at these tasks as a programmer or as a Python interpreter. Both views offer important perspective on Python programming.

In its simplest form, a Python program is just a text file containing Python statements. For example, the following file, named script1.py, is one of the simplest Python scripts we could dream up, but it passes for an official Python program:

print 'hello world'
print 2 ** 100

This file contains two Python print statements, which simply print a string (the text in quotes) and a numeric expression result (2 to the power 100) to the output stream. Don't worry about the syntax of this code yet—for this chapter, we're interested only in getting it to run. We'll explain the print statement, and why you can raise 2 to the power 100 in Python without overflowing, in later parts of this book.
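
The listing above uses the print statement of the Python 2.x releases this book describes. As a hedged aside for readers running Python 3, where print is a built-in function and needs parentheses, an equivalent of script1.py might look like the following sketch (the variable name big is ours):

```python
# Python 3 version of script1.py: print is a function, so it takes parentheses.
print('hello world')

# Python integers grow as needed, so 2 ** 100 does not overflow.
big = 2 ** 100
print(big)
```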

You can create such a file of statements with any text editor you like. By convention, Python program files are given names that end in ".py"; technically, this naming scheme is required only for files that are "imported," as shown later in this book, but most Python files have .py names for consistency.

After you've typed these statements into a text file in one way or another, you must tell Python to execute the file—which simply means to run all the statements from top to bottom in the file, one after another. Python program files may be launched by command lines, by clicking their icons, and with other standard techniques. We'll demonstrate how to invoke this execution in the next chapter. If all goes well, you'll see the results of the two print statements show up somewhere on your computer—by default, usually in the same window you were in when you ran the program:

hello world
1267650600228229401496703205376

For example, here's how this script ran from a DOS command line on a Windows laptop, to make sure it didn't have any silly typos:

D:\temp>python script1.py
hello world
1267650600228229401496703205376

We've just run a Python script that prints a string and a number. We probably won't win any programming awards with this code, but it's enough to capture the basics of program execution.
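
The same launch can also be driven from Python itself, which is a handy way to check a script's output automatically. The sketch below is an illustration under stated assumptions: it targets a current Python 3 interpreter, so the script it writes uses print() as a function, and it saves the file in a temporary folder of its own rather than in D:\temp.

```python
import os
import subprocess
import sys
import tempfile

# Source for a small script, written with Python 3's print() function.
source = "print('hello world')\nprint(2 ** 100)\n"

with tempfile.TemporaryDirectory() as folder:
    script_path = os.path.join(folder, "script1.py")
    with open(script_path, "w") as script_file:
        script_file.write(source)

    # Equivalent to typing "python script1.py" at a system prompt;
    # sys.executable is the interpreter running this example.
    completed = subprocess.run(
        [sys.executable, script_path],
        capture_output=True, text=True, check=True,
    )

output_lines = completed.stdout.splitlines()
print(output_lines)
```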

Execution Model Variations

Before moving on, we should point out that the internal execution flow described in the prior section reflects the standard implementation of Python today, and is not really a requirement of the Python language itself. Because of that, the execution model is prone to change with time. In fact, there are already a few systems that modify the picture in Figure 2-2 somewhat. Let's take a few moments to explore the most prominent of these variations.

Really, as this book is being written, there are two primary implementations of the Python language—CPython and Jython—along with a handful of secondary implementations such as Python.net. CPython is the standard implementation; all the others have very specific purposes and roles. All implement the same Python language, but execute programs in different ways.

Section 2.3.1.1: CPython

The original, and standard, implementation of Python is usually called CPython, when you want to contrast it with the other two. Its name comes from the fact that it is coded in portable ANSI C language code. This is the Python that you fetch from www.python.org, get with the ActivePython distribution, and have automatically on most Linux machines. If you've found a preinstalled version of Python on your machine, it's probably CPython as well, unless your company is using Python in very specialized ways.

Unless you want to script Java or .NET applications with Python, you probably want to use the standard CPython system. Because it is the reference implementation of the language, it tends to run fastest, be the most complete, and be more robust than the alternative systems. Figure 2-2 reflects CPython's runtime architecture.
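
If you are unsure which implementation you are running, the interpreter can tell you. This short sketch assumes a current Python 3 interpreter; the platform and sys facilities it uses are present-day standard library features and may not exist in the release this book describes.

```python
import platform
import sys

# Returns a name such as 'CPython', 'Jython', 'IronPython', or 'PyPy'.
implementation = platform.python_implementation()

# sys.implementation.name is a lowercase identifier for the same thing.
short_name = sys.implementation.name

print(implementation, short_name, platform.python_version())
```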

Section 2.3.1.2: Jython

The Jython system (originally known as JPython) is an alternative implementation of the Python language, targeted for integration with the Java programming language. Jython consists of Java classes that compile Python source code to Java byte code, and then route the resulting byte code to the Java Virtual Machine (JVM). Programmers still code Python statements in .py

Chapter 3: How You Run Programs

Okay, it's time to start running some code. Now that you have a handle on program execution, you're finally ready to start some real Python programming. At this point, we'll assume that you have Python installed on your computer; if not, see Appendix A for installation and configuration hints.

There are a variety of ways to tell Python to execute the code you type. This chapter discusses all the program launching techniques in common use today. Along the way, you'll learn how to type code interactively, and how to save it in files to be run with command lines, Unix tricks, icon clicks, IDEs, imports, and more.

If you just want to find out how to run a Python program quickly, you may be tempted to just read the parts that pertain to your platform and move on to Chapter 4. But don't skip the material on module imports, since that's essential to understanding Python's architecture. And we encourage you to at least skim the sections on IDLE and other IDEs, so you know what tools are available once you start developing more sophisticated Python programs.

Interactive Coding

Perhaps the simplest way to run Python programs is to type them at Python's interactive command line. There are a variety of ways to start this command line: in an IDE, from a system console, and so on. Assuming the interpreter is installed as an executable program on your system, the most platform-neutral way to start an interactive interpreter session is usually to type just "python" at your operating system's prompt, without any arguments. For example:

% python
Python 2.2 (#28, Dec 21 2001, 12:21:22) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>

Here the word "python" is typed at your system shell prompt, to begin an interactive Python session (the "%" character stands for your system's prompt, not your input). The notion of a system shell prompt is generic, but varies per platform:

On Windows, you can type python in a DOS console window (a.k.a. Command Prompt), or the Start/Run... dialog box.
On Unix and Linux, you might type this in a shell window (e.g., in an xterm or console, running a shell such as ksh or csh).
Other systems may use similar or platform-specific devices. On PalmPilots, for example, click the Python home icon to launch an interactive session; on a Zaurus PDA, open a Terminal window.

If you have not set your shell's PATH environment variable to include Python, you may need to replace the word "python" with the full path to the Python executable on your machine. For instance, on Windows, try typing C:\Python22\python (or C:\Python23\python for Version 2.3); on Unix and Linux, /usr/local/bin/python (or /usr/bin/python) will often suffice.
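
If it is unclear which executable a bare python command would start, the interpreter itself can report it. The following is a minimal sketch, written for a current Python 3 interpreter rather than the Python 2.2 used in this book's listings; the variable names are illustrative only.

```python
import shutil
import sys

# Full path of the interpreter running this code. This is the path you
# could type at a system prompt in place of the bare word "python".
# It may be an empty string or None if Python cannot determine it.
interpreter_path = sys.executable

# First executable named "python" found on the shell's PATH setting,
# or None if no directory on PATH holds one.
python_on_path = shutil.which("python")

print("running interpreter:", interpreter_path)
print("python found on PATH:", python_on_path)
```

If the second line prints None, typing the full path reported on the first line is the fallback described above.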

Once the Python interactive session starts, it begins by printing two lines of informational text (which we normally omit in our examples to save space), and prompts for input with >>> when it's waiting for you to type a new Python statement or expression. When working interactively, the results of your code are displayed after the >>> lines—here are the results of two Python print statements:

System Command Lines and Files

Although the interactive prompt is great for experimenting and testing, it has one big disadvantage: programs you type there go away as soon as the Python interpreter executes them. The code you type interactively is never stored in a file, so you can't run it again without retyping it from scratch. Cut-and-paste and command recall can help some here, but not much, especially when you start writing larger programs. To cut and paste code from an interactive session, you have to edit out Python prompts, program outputs, and so on.

To save programs permanently, you need to write your code in files, usually known as modules. Modules are simply text files containing Python statements. Once you have coded such a file, you can ask the Python interpreter to execute its statements any number of times, and in a variety of ways: by system command lines, by file icon clicks, by options in the IDLE user interface, and more. However they are run, Python executes all the code in a module file from top to bottom, each time you run the file. Such files are often referred to as programs in Python: a series of precoded statements.

The next few sections explore ways to run code typed into module files. In this section we run files in the most basic way: by listing their names in a python command line entered at a system prompt. As a first example, suppose we start our favorite text editor (e.g., vi, notepad, or the IDLE editor) and type two Python statements into a text file named spam.py:

print 2 ** 8                              # Raise to a power.
print 'the bright side ' + 'of life'      # + means concatenation.

This file contains two Python print statements and Python comments to the right. Text after a # is simply ignored as a human-readable comment, and is not part of the statement's syntax. Again, ignore the syntax of code in this file for now. The point to notice is that we've typed code into a file, rather than at the interactive prompt. In the process, we've coded a fully-functional Python script.

Once we've saved this text file, we can ask Python to run it by listing its full filename as a first argument on a python command, typed at the system shell's prompt:

Clicking Windows File Icons

On Windows, Python automatically registers itself to be the program that opens Python program files when they are clicked. Because of that, it is possible to launch the Python programs you write by simply clicking (or double-clicking) on their file icons with your mouse.

On Windows, icon clicks are made easy by the Windows registry. On non-Windows systems, you will probably be able to perform a similar trick, but the icons, file explorer, navigation schemes, and more may differ slightly. On some Unix systems, for instance, you may need to register the .py extension with your file explorer GUI, make your script executable using the #! trick of the prior section, or associate the file MIME type with an application or command by editing files, installing programs, or using other tools. See your file explorer's documentation for more details, if clicks do not work correctly right off the bat.

To illustrate, suppose we create the following program file with our text editor, and save it as filename script4.py:

# A comment
import sys
print sys.platform
print 2 ** 100

There's not much new here—just an import and two prints again (sys.platform is just a string that identifies the kind of computer you're working on; it lives in a module called sys, which we must import to load). In fact, we can run this file from a system command line:

D:\OldVaio\LP-2ndEd\Examples>c:\python22\python script4.py
win32
1267650600228229401496703205376
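
The listing above uses the print statement of Python 2.2. For readers on a current interpreter, here is a sketch of the same script under the assumption of Python 3, where print is a function; the intermediate names platform_name and big_power are added for illustration and are not part of the original file.

```python
import sys                      # Load the sys module so its contents can be used.

platform_name = sys.platform    # A string naming the platform, such as 'win32' or 'linux'.
big_power = 2 ** 100            # Integers grow as large as needed.

print(platform_name)
print(big_power)
```

The second line printed matches the 31-digit value in the session above; the first depends on the machine running the code.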

Icon clicks allow us to run this file without any typing at all. If we find this file's icon (for instance, by selecting My Computer and working our way down on the D drive), we will see the file explorer view captured in Figure 3-1, shown here on Windows XP. Python source files show up as snakes on Windows, and byte code files as snakes with eyes closed (or with a reddish color in Version 2.3). You will normally want to click (or otherwise run) the source code file, in order to pick up your most recent changes. To launch the file here, simply click on the icon for script4.py.

Figure 3-1: Python file icons on Windows

Module Imports and Reloads

So far, we've been calling files of code "modules," and using the word "import," without explaining what these terms mean. We'll study modules and larger program architecture in depth in Part V. Because imports are also a way to launch programs, this section introduces enough module basics to get you started.

In simple terms, every file of Python code whose name ends in a .py extension is a module. Other files can access the items defined by a module by importing that module; import operations essentially load another file, and grant access to the file's contents. Furthermore, the contents of a module are made available to the outside world through its attributes, a term we'll define next.
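
As a small illustration of reaching a module's contents through attributes, the sketch below uses the standard sys module, which this chapter already imports elsewhere. It assumes a Python 3 interpreter, and the variable names are illustrative.

```python
import sys   # Load the sys module and bind it to the name "sys".

# Items defined by a module are reached with module.attribute syntax.
platform_name = sys.platform

# The same attribute can be fetched by name with the built-in getattr.
same_value = getattr(sys, "platform")

# dir() returns the list of attribute names a module makes available.
exported_names = dir(sys)

print(platform_name == same_value)
print("platform" in exported_names)
```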

This module-based services model turns out to be the core idea behind program architecture in Python. Larger programs usually take the form of multiple module files, which import tools from other module files. One of the modules is designated as the main or top-level file, and is the one launched to start the entire program.

We'll delve into such architecture issues in more detail later in this book. This chapter is mostly interested in the fact that import operations run the code in a file that is being loaded, as a final step. Because of this, importing a file is yet another way to launch it.

For instance, if we start an interactive session (in IDLE, from a command line, or otherwise), we can run the original script4.py file that appeared earlier with a simple import:

D:\LP-2ndEd\Examples>c:\python22\python
>>> import script4
win32
1267650600228229401496703205376

This works, but only once per session (really, process), by default. After the first import, later imports do nothing, even if we change and save the module's source file again in another window:

>>> import script4
>>> import script4
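
The once-per-process behavior can be observed directly with a throwaway module that counts how many times its code has run. This is a sketch under stated assumptions: it targets Python 3, where a rerun is requested with importlib.reload from the standard library; the module name launch_demo_mod and its contents are hypothetical and exist only for this demonstration.

```python
import importlib
import os
import sys
import tempfile

MODULE_NAME = "launch_demo_mod"   # Hypothetical module name for this sketch.

with tempfile.TemporaryDirectory() as folder:
    # The module's only statement bumps a counter each time its code runs.
    with open(os.path.join(folder, MODULE_NAME + ".py"), "w") as source:
        source.write('runs = globals().get("runs", 0) + 1\n')

    sys.path.insert(0, folder)        # Let import find the new file.
    importlib.invalidate_caches()
    try:
        module = importlib.import_module(MODULE_NAME)
        after_first_import = module.runs     # Code ran once.

        module = importlib.import_module(MODULE_NAME)
        after_second_import = module.runs    # Already loaded: nothing reran.

        module = importlib.reload(module)
        after_reload = module.runs           # Reload reran the file's code.
    finally:
        sys.path.remove(folder)
        sys.modules.pop(MODULE_NAME, None)

print(after_first_import, after_second_import, after_reload)   # 1 1 2
```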

This is by design; imports are too expensive an operation to repeat more than once per program run. As we'll learn in Chapter 15, imports must find files, compile to byte code, and run the code. If we really want to force Python to rerun the file again in the same session (without stopping and restarting the session), we need to instead call the built-in

The IDLE User Interface

IDLE is a graphical user interface for doing Python development, and is a standard and free part of the Python system. It is usually referred to as an Integrated Development Environment (IDE), because it binds together various development tasks into a single view.

In short, IDLE is a GUI that lets you edit, run, browse, and debug Python programs, all from a single interface. Moreover, because IDLE is a Python program that uses the Tkinter GUI toolkit, it runs portably on most Python platforms: MS Windows, X Windows (Unix, Linux), and Macs. For many, IDLE is an easy-to-use alternative to typing command lines, and a less problem-prone alternative to clicking on icons.

Let's jump right into an example. IDLE is easy to start under Windows—it has an entry in the Start button menu for Python (see Figure 2-1); it can also be selected by right-clicking on a Python program icon. On some Unix-like systems, you may need to launch IDLE's top-level script from a command line or icon click—start file idle.pyw in the idle subdirectory of Python's Tools directory.

Figure 3-3 shows the scene after starting IDLE on Windows. The Python Shell window at the bottom is the main window, which runs an interactive session (notice the >>> prompt). This works like all interactive sessions—code you type here is run immediately after you type it—and serves as a testing tool.

Figure 3-3: IDLE main window and text edit window

IDLE uses familiar menus with keyboard shortcuts for most of its operations. To make (or edit) a script under IDLE, open text edit windows—in the main window, select the File menu pulldown, and pick New window to open a text edit window (or Open... to edit an existing file). The window at the top of Figure 3-3 is an IDLE text edit window, where the code for file script3.py was entered.

Although this may not show up fully in this book, IDLE uses syntax-directed colorization for the code typed in both the main window, and all text edit windows—keywords are one color, literals are another, and so on. This helps give you a better picture of the components in your code.

Other IDEs

Because IDLE is free, portable, and a standard part of Python, it's a nice first development tool to become familiar with if you want to use an IDE at all. Use IDLE for this book's exercises if you're just starting out. There are, however, a handful of alternative IDEs for Python developers, some of which are substantially more powerful and robust than IDLE. Among the most commonly used today are these four:

Komodo: A full-featured development environment GUI for Python (and other languages). Komodo includes standard syntax-coloring text editing, debugging, and so on. In addition, Komodo offers many advanced features that IDLE does not, including project files, source-control integration, regular expression debugging, and a drag-and-drop GUI builder which generates Python/Tkinter code to implement the GUIs you design interactively. Komodo is not free as we write this; it is available at https://www.activestate.com.
PythonWorks: Another full-featured development environment GUI. PythonWorks also has standard IDE tools, and provides a Python/Tkinter GUI builder that generates Python code. In addition, it supports unique features such as automatic code refactoring, for optimal maintenance and reuse. This is also a commercial product; see https://www.pythonware.com for details.
PythonWin: A free IDE that ships as part of ActiveState's ActivePython distribution (and may also be fetchable separately from https://www.python.org resources). PythonWin is a Windows-only IDE for Python; it is roughly like IDLE, with a handful of useful Windows-specific extensions added in. For instance, PythonWin has support for COM objects. It also adds basic user interface features beyond IDLE, such as object attribute list popups. Further, PythonWin serves as an example of using the Windows extension package's GUI library. See https://www.activestate.com.
Visual Python: ActiveState also sells a system called Visual Python, which is a plug-in that adds Python support to Microsoft's Visual Studio development environment. This is also a Windows-only solution, but is appealing to developers with a prior intellectual investment in Visual Studio. See ActiveState's web site for details.

Embedding Calls

So far, we've seen how to run code typed interactively, and how to launch code saved in files with command lines, icon clicks, IDEs, module imports, and Unix executable scripts. That covers most of the cases we'll see in this book.

But in some specialized domains, Python code may also be run by an enclosing system. In such cases, we say that Python programs are embedded in (i.e., run by) another program. The Python code itself may be entered into a text file, stored in a database, fetched from an HTML page, parsed from an XML document, and so on. But from an operational perspective, another system—not you—may tell Python to run the code you've created.

For example, it's possible to create and run strings of Python code from a C program by calling functions in the Python runtime API (a set of services exported by the libraries created when Python is compiled on your machine):

#include <Python.h>
...
Py_Initialize();                                /* Start the embedded interpreter. */
PyRun_SimpleString("x = brave + sir + robin");  /* Run one Python statement string. */

In this C code snippet, a program coded in the C language embeds the Python interpreter by linking in its libraries, and passes it a Python assignment statement string to run. C programs may also gain access to Python objects, and process or execute them using other Python API tools.
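
The idea of running a string of Python code can also be tried without a C compiler. The sketch below is not the C API shown above; it is a Python-level analogue that uses the built-in exec function under Python 3. The values given to brave, sir, and robin are made up for the example, since the C snippet does not show where those names are defined.

```python
# Names the statement string expects to find. The values are invented
# for this sketch only.
namespace = {"brave": "brave ", "sir": "sir ", "robin": "robin"}

# Run the same assignment statement string, using the dictionary as the
# namespace in which the code executes.
exec("x = brave + sir + robin", namespace)

# The assignment created a new name, x, inside that namespace.
result = namespace["x"]
print(result)   # brave sir robin
```

Testing code strings this way, in isolation, is in the spirit of the advice that follows: the same statements can later be handed to an enclosing system.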

This book isn't about Python/C integration, but you should be aware that, depending on how your organization plans to use Python, you may or may not be the one who actually starts the Python programs you create. Regardless, you can still likely use the interactive and file-based launching techniques described here, to test code in isolation from those enclosing systems that may eventually use it.

Frozen Binary Executables

Frozen binary executables are packages that combine your program's byte code and the Python interpreter into a single executable program. With these, programs can be launched in the same ways that you would launch any other executable program (icon clicks, command lines, etc.). While this option works well for delivery of products, it is not really intended for use during program development. You normally freeze just before shipping, and after development is finished.

Text Editor Launch Options

Many programmer-friendly text editors have support for editing, and possibly running, Python programs. Such support may be either built-in, or fetchable on the web. For instance, if you are familiar with the emacs text editor, you can do all your Python editing and launching from inside the text editor itself. See the text editor resources page at https://www.python.org/editors for more details.

Other Launch Options

Depending on your platform, there may be additional ways that you can start Python programs. For instance, on some Macintosh systems, you may be able to drag Python program file icons onto the Python interpreter icon, to make them execute. And on Windows, you can always start Python scripts with the Run... option in the Start menu. Finally, the Python standard library has utilities that allow Python programs to be started by other Python programs (e.g., execfile, os.popen, os.system); however, these tools are beyond the scope of the present chapter.
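
For a taste of one Python program starting another, here is a hedged sketch for Python 3. It swaps in the subprocess module, the current standard-library counterpart of tools such as os.system and os.popen, rather than the calls named above. The script name child_script.py and its one-line contents are hypothetical.

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Write a one-line script file to launch (hypothetical name and contents).
    script_path = os.path.join(folder, "child_script.py")
    with open(script_path, "w") as script:
        script.write("print(2 ** 8)\n")

    # Start a second interpreter, passing the script's filename as the
    # first argument, just as on a system command line.
    completed = subprocess.run(
        [sys.executable, script_path],
        capture_output=True,   # Collect what the child prints.
        text=True,             # Decode the output to str.
        check=True,            # Raise an error if the child fails.
    )

child_output = completed.stdout.strip()
print(child_output)   # 256
```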

Future Possibilities?

Although this chapter reflects current practice, much of it has been both platform- and time-specific. Indeed, many of the execution and launch details presented arose between this book's first and second editions. As for program execution, it's not impossible that new program launch options may arise over time.

New operating systems, and new versions of them, may also provide execution techniques beyond those outlined here. In general, because Python keeps pace with such changes, you should be able to launch it in whatever way makes sense for the machines you use, both now and in the future—be that drawing on tablet PCs or PDAs, grabbing icons in a virtual reality, or shouting a script's name over your coworkers' conversations.

Implementation changes may also impact launch schemes somewhat (e.g., a full compiler could produce normal executables, launched much like frozen binaries today). If we knew what the future truly held, though, we would probably be talking to a stock broker instead of writing these words.

Which Option Should I Use?

With all these options, the question naturally arises: Which one is best for me? In general, use the IDLE interface for development, if you are just getting started with Python. It provides a user-friendly GUI environment, and can hide some of the underlying configuration details. It also comes with a platform-neutral text editor for coding your scripts, and is a standard and free part of the Python system.

If, instead, you are an experienced programmer, you might be more comfortable simply using the text editor of your choice in one window, and another window for launching the programs you edit, by system command lines or icon clicks. Because development environments are a very subjective choice, we can't offer much more in the way of universal guidelines; in general, the environment you like to use is usually the best to use.

Part I Exercises

It's time to start doing a little coding on your own. This first exercise session is fairly simple, but a few of these questions hint at topics to come in later chapters. Remember, check Section B.1 for the answers; the exercises and their solutions sometimes contain supplemental information not discussed in the main part of the chapter. In other words, you should peek, even if you can manage to get all the answers on your own.

1. Interaction. Using a system command line, IDLE, or other, start the Python interactive command line (>>> prompt), and type the expression: "Hello World!" (including the quotes). The string should be echoed back to you. The purpose of this exercise is to get your environment configured to run Python. In some scenarios, you may need to first run a cd shell command, type the full path to the python executable, or add its path to your PATH environment variable. If desired, you can set it in your .cshrc or .kshrc file to make Python permanently available on Unix systems; on Windows use a setup.bat, autoexec.bat, or the environment variable GUI. See Appendix A for help with environment variable settings.
2. Programs. With the text editor of your choice, write a simple module file (a file containing the single statement: print 'Hello module world!'). Store this statement in a file named module1.py. Now, run this file by using any launch option you like: running it in IDLE, clicking on its file icon, passing it to the Python interpreter program on the system shell's command line, and so on. In fact, experiment by running your file with as many of the launch techniques seen in this chapter as you can. Which technique seems easiest? (There is no right answer to this one.)
3. Modules. Next, start the Python interactive command line (>>> prompt) and import the module you wrote in Exercise 2. Try moving the file to a different directory and importing it again from its original directory (i.e., run Python in the original directory when you import); what happens? (Hint: is there still a file named

Chapter 4: Numbers

This chapter begins our tour of the Python language. In Python, data takes the form of objects—either built-in objects that Python provides, or objects we create using Python and C tools. Since objects are the most fundamental notion in Python programming, we'll start this chapter with a survey of Python's built-in object types before concentrating on numbers.

By way of introduction, let's first get a clear picture of how this chapter fits into the overall Python picture. From a more concrete perspective, Python programs can be decomposed into modules, statements, expressions, and objects, as follows:

Programs are composed of modules.
Modules contain statements.
Statements contain expressions.
Expressions create and process objects.

We introduced the highest level of this hierarchy when we learned about modules in Chapter 3. This part's chapters begin at the bottom, exploring both built-in objects, and the expressions you can code to use them.
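
The hierarchy above can be made visible with the standard ast module, which parses source text into a tree. This is only an illustrative sketch for Python 3; the one-line source string and the name total are invented for the example.

```python
import ast

source = "total = 2 ** 8 + 1\n"   # A one-statement "module" for this sketch.

module = ast.parse(source)       # The whole file parses to a Module node.
statement = module.body[0]       # Modules contain statements (an assignment here).
expression = statement.value     # Statements contain expressions.

# Running the module lets the expression create an object: the integer 257.
namespace = {}
exec(compile(module, "<sketch>", "exec"), namespace)

node_kinds = (
    type(module).__name__,
    type(statement).__name__,
    type(expression).__name__,
)
print(node_kinds)           # ('Module', 'Assign', 'BinOp')
print(namespace["total"])   # 257
```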

If you've used lower-level languages such as C or C++, you know that much of your work centers on implementing objects—also known as data structures—to represent the components in your application's domain. You need to lay out memory structures, manage memory allocation, implement search and access routines, and so on. These chores are about as tedious (and error prone) as they sound, and usually distract from your programs' real goals.

In typical Python programs, most of this grunt work goes away. Because Python provides powerful object types as an intrinsic part of the language, there's no need to code object implementations before you start solving problems. In fact, unless you have a need for special processing that built-in types don't provide, you're almost always better off using a built-in object instead of implementing your own. Here are some reasons why:

Built-in objects make simple programs easy to write. For simple tasks, built-in types are often all you need to represent the structure of problem domains. Because you get things such as collections (lists) and search tables (dictionaries) for free, you can use them immediately. You can get a lot of work done with Python's built-in object types alone.

Python Program Structure

Programs are composed of modules.
Modules contain statements.
Statements contain expressions.
Expressions create and process objects.

Why Use Built-in Types?

Built-in objects make simple programs easy to write. For simple tasks, built-in types are often all you need to represent the structure of problem domains. Because you get things such as collections (lists) and search tables (dictionaries) for free, you can use them immediately. You can get a lot of work done with Python's built-in object types alone.
Python provides objects and supports extensions. In some ways, Python borrows both from languages that rely on built-in tools (e.g., LISP), and languages that rely on the programmer to provide tool implementations or frameworks of their own (e.g., C++). Although you can implement unique object types in Python, you don't need to do so just to get started. Moreover, because Python's built-ins are standard, they're always the same; frameworks tend to differ from site to site.
Built-in objects are components of extensions. For more complex tasks you still may need to provide your own objects, using Python statements or C language interfaces. But as we'll see in later parts, objects implemented manually are often built on top of built-in types such as lists and dictionaries. For instance, a stack data structure may be implemented as a class that manages a built-in list.
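
The stack mentioned in the last point might look something like the following. This is one possible sketch in Python 3, not the implementation developed later in the book; the class and method names are illustrative.

```python
class Stack:
    """A last-in, first-out stack that manages a built-in list."""

    def __init__(self):
        self._items = []           # The built-in list does the storage work.

    def push(self, item):
        self._items.append(item)   # The top of the stack is the end of the list.

    def pop(self):
        return self._items.pop()   # Raises IndexError if the stack is empty.

    def __len__(self):
        return len(self._items)


stack = Stack()
stack.push('spam')
stack.push('eggs')
top = stack.pop()          # The most recently pushed item comes off first.
remaining = len(stack)
print(top, remaining)      # eggs 1
```

The class adds only the stack discipline; memory management, growth, and item access all come from the list it wraps.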

Numbers

The first object type on the tour is Python numbers. In general, Python's number types are fairly typical and will seem familiar if you've used almost any other programming language in the past. They can be used to keep track of your bank balance, the distance to Mars, the number of visitors to your web site, and just about any other numeric quantity.

Python supports the usual numeric types (known as integer and floating point), as well as literals for creating numbers, and expressions for processing them. In addition, Python provides more advanced numeric programming support, including a complex number type, an unlimited precision integer, and a variety of numeric tool libraries. The next few sections give an overview of the numeric support in Python.

Among its basic types, Python supports the usual numeric types: both integer and floating-point numbers, and all their associated syntax and operations. Like the C language, Python also allows you to write integers using hexadecimal and octal literals. Unlike C, Python also has a complex number type, as well as a long integer type with unlimited precision (it can grow to have as many digits as your memory space allows). Table 4-2 shows what Python's numeric types look like when written out in a program (that is, as literals).

Table 4-2: Numeric literals
Literal	Interpretation
`1234, -24, 0`	Normal integers (C longs)
`9999999999999999999L`	Long integers (unlimited size)
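
For readers trying these literal forms on a current interpreter, here is a sketch in Python 3 syntax. Two spellings differ from the Python 2.2 forms in this book: Python 3 has a single integer type, so the trailing L is no longer written, and octal literals are spelled with a 0o prefix. The variable names are illustrative.

```python
normal = 1234
hexadecimal = 0x1F                # Hexadecimal literal: 31
octal = 0o17                      # Octal literal: 15 (Python 2.2 writes 017)
big = 9999999999999999999 + 1     # Grows past C long size with no L suffix
complex_number = 3 + 4j           # Real part 3, imaginary part 4

print(hexadecimal, octal)         # 31 15
print(big)                        # 10000000000000000000
print(abs(complex_number))        # 5.0
```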

Python Expression Operators

Perhaps the most fundamental tool that processes numbers is the expression: a combination of numbers (or other objects) and operators that computes a value when executed by Python. In Python, expressions are written using the usual mathematical notation and operator symbols. For instance, to add two numbers X and Y, say X+Y, which tells Python to apply the + operator to the values named by X and Y. The result of the expression is the sum of X and Y, another number object.

Table 4-3 lists all the operator expressions available in Python. Many are self-explanatory; for instance, the usual mathematical operators are supported: +, -, *, /, and so on. A few will be familiar if you've used C in the past: % computes a division remainder, << performs a bitwise left-shift, & computes a bitwise and result, etc. Others are more Python-specific, and not all are numeric in nature: the is operator tests object identity (i.e., address) equality, lambda creates unnamed functions, and so on. More on some of these later.
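
A few of the operators just mentioned, in a short sketch. The operand values and variable names are chosen only for illustration; the code runs as written under Python 3.

```python
remainder = 7 % 3            # Division remainder: 1
shifted = 1 << 4             # Bitwise left-shift: 16
masked = 6 & 3               # Bitwise and: 0b110 & 0b011 gives 0b010, or 2

first = [1, 2, 3]
second = [1, 2, 3]           # Equal value, but a separate list object.
alias = first                # A second name for the same object.

same_value = first == second     # True: the contents match.
same_object = first is second    # False: two distinct objects.
alias_matches = alias is first   # True: one object, two names.

double = lambda n: n * 2     # lambda creates an unnamed function.
doubled = double(21)         # 42

fallback = 0 or 'default'    # The right side is evaluated because 0 is false.

print(remainder, shifted, masked)
print(same_value, same_object, alias_matches)
print(doubled, fallback)
```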

Table 4-3: Python expression operators and precedence
Operators	Description
`lambda args: expression`	Anonymous function generation
`x or y`	Logical or (y is evaluated only if x is false)
`x and y`

Numbers in Action

Probably the best way to understand numeric objects and expressions is to see them in action. So, start up the interactive command line and type some basic, but illustrative operations.

First of all, let's exercise some basic math. In the following interaction, we first assign two variables (a and b) to integers, so we can use them later in a larger expression. Variables are simply names—created by you or Python—that are used to keep track of information in your program. We'll say more about this later, but in Python:

Variables are created when first assigned a value.
Variables are replaced with their values when used in expressions.
Variables must be assigned before they can be used in expressions.
Variables refer to objects, and are never declared ahead of time.

In other words, the assignments cause these variables to spring into existence automatically.

% python
>>> a = 3           # Name created
>>> b = 4

We've also used a comment here. In Python code, text after a # mark and continuing to the end of the line is considered to be a comment, and is ignored by Python. Comments are a place to write human-readable documentation for your code. Since code you type interactively is temporary, you won't normally write comments there, but they are added to examples to help explain the code. In the next part of this book, we'll meet a related feature—documentation strings—that attaches the text of your comments to objects.

Now, let's use the integer objects in expressions. At this point, a and b are still 3 and 4, respectively; variables like these are replaced with their values whenever used inside an expression, and expression results are echoed back when working interactively:

>>> a + 1, a - 1        # Addition (3+1), subtraction (3-1)
(4, 2)
>>> b * 3, b / 2        # Multiplication (4*3), division (4/2)
(12, 2)
>>> a % 2, b ** 2       # Modulus (remainder), power
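
The same arithmetic can be reproduced in a script. The sketch below assumes Python 3, where one detail differs from the Python 2.2 session above: the / operator always returns a floating-point result, so the floor division operator // is used to obtain the integer 2 shown. The result names are illustrative.

```python
a = 3                            # Name created
b = 4

sums = (a + 1, a - 1)            # Addition, subtraction: (4, 2)
products = (b * 3, b // 2)       # Multiplication, floor division: (12, 2)
true_division = b / 2            # 2.0 in Python 3, a float
rest = (a % 2, b ** 2)           # Remainder, power: (1, 16)

print(sums)
print(products)
print(true_division)
print(rest)
```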

The Dynamic Typing Interlude

If you have a background in compiled or statically-typed languages like C, C++, or Java, you might find yourself in a perplexed place at this point. So far, we've been using variables without declaring their types—and it somehow works. When we type a = 3 in an interactive session or program file, how does Python know that a should stand for an integer? For that matter, how does Python know what a even is at all?

Once you start asking such questions, you've crossed over into the domain of Python's dynamic typing model. In Python, types are determined automatically at runtime, not in response to declarations in your code. For you, this means that you never declare variables ahead of time, which is perhaps a simpler concept if you have not programmed in other languages before. Since this is probably the most central concept of the language, though, let's explore it in detail here.

You'll notice that when we say a = 3, it works, even though we never told Python to use name a as a variable. In addition, the assignment of 3 to a seems to work too, even though we didn't tell Python that a should stand for an integer type object. In the Python language, this all pans out in a very natural way, as follows:

Creation: A variable, like a, is created when it is first assigned a value by your code. Future assignments change the already-created name to have a new value. Technically, Python detects some names before your code runs; but conceptually, you can think of it as though assignments make variables.
Types: A variable, like a, never has any type information or constraint associated with it. Rather, the notion of type lives with objects, not names. Variables always simply refer to a particular object, at a particular point in time.
Use: When a variable appears in an expression, it is immediately replaced with the object that it currently refers to, whatever that may be. Further, all variables must be explicitly assigned before they can be used; use of unassigned variables results in an error.
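The three rules above can be sketched in a few lines. This is a minimal illustration rather than code from the book: the name undefined_name is hypothetical and is deliberately never assigned, and the except ... as syntax assumes a newer interpreter than the one used in the book's listings.

```python
# Creation: the name 'a' comes into existence when it is first assigned.
a = 3
first_type = type(a)       # the type belongs to the object 3, not to the name

# Types: the same name may later refer to an object of a different type.
a = 'spam'
second_type = type(a)

# Use: referencing a name that was never assigned raises an error.
try:
    undefined_name         # hypothetical name, deliberately never assigned
except NameError as exc:
    error_message = str(exc)

print(first_type)
print(second_type)
print(error_message)
```

Note that nothing about the name a changes between the two assignments; only the object it refers to does.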

Additional content appearing in this section has been removed.

Chapter 5: Strings


The next major built-in type is the Python string—an ordered collection of characters, used to store and represent text-based information. From a functional perspective, strings can be used to represent just about anything that can be encoded as text: symbols and words (e.g., your name), contents of text files loaded into memory, Internet addresses, Python programs, and so on.

You may have used strings in other languages too; Python's strings serve the same role as character arrays in languages such as C, but Python's strings are a somewhat higher level tool than arrays. Unlike C, Python strings come with a powerful set of processing tools. Also unlike languages like C, Python has no special type for single characters (like C's char), only one-character strings.

Strictly speaking, Python strings are categorized as immutable sequences—meaning that they have a left-to-right positional order (sequence), and cannot be changed in place (immutable). In fact, strings are the first representative of the larger class of objects called sequences. Pay special attention to the operations introduced here, because they will work the same on other sequence types we'll see later, such as lists and tuples.

Table 5-1 introduces common string literals and operations. Empty strings are written as two quotes with nothing in between, and there are a variety of ways to code strings. For processing, strings support expression operations such as concatenation (combining strings), slicing (extracting sections), indexing (fetching by offset), and so on. Besides expressions, Python also provides a set of string methods that implement common string-specific tasks, as well as a string module that mirrors most string methods.

Table 5-1: Common string literals and operations
Operation	Interpretation
`s1 = ''`	Empty string

Additional content appearing in this section has been removed.

String Literals


By and large, strings are fairly easy to use in Python. Perhaps the most complicated thing about them is that there are so many ways to write them in your code:

Single quotes: 'spa"m'
Double quotes: "spa'm"
Triple quotes: '''... spam ...''', """... spam ..."""
Escape sequences: "s\tp\na\0m"
Raw strings: r"C:\new\test.spm"
Unicode strings: u'eggs\u0020spam'

The single- and double-quoted forms are by far the most common; the others serve specialized roles. Let's take a quick look at each of these options.
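As a rough sketch of the forms listed above, the following assigns one literal of each kind and measures a few of them. The variable names are made up for illustration, and the code assumes an interpreter that accepts the u'' prefix (Python 2, or Python 3.3 and later).

```python
single = 'spa"m'                  # single quotes may embed a double quote
double = "spa'm"                  # double quotes may embed a single quote
triple = """line one
line two"""                       # triple quotes may span several lines
escaped = "s\tp\na\0m"            # each backslash escape is a single character
raw = r"C:\new\test.spm"          # raw string: backslashes are kept literally
uni = u'eggs\u0020spam'           # \u0020 is the Unicode code point for a space

print(len(escaped))               # counts characters, not keystrokes
print(len(raw))
print(uni)
```

Comparing len(escaped) with the number of keystrokes in the literal shows that escapes collapse to one character each, while the raw form keeps every backslash.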

Around Python strings, single and double quote characters are interchangeable. That is, string literals can be written enclosed in either two single or two double quotes—the two forms work the same, and return the same type of object. For example, the following two strings are identical, once coded:

>>> 'shrubbery', "shrubbery"
('shrubbery', 'shrubbery')

The reason for including both is that it allows you to embed a quote character of the other variety inside a string, without escaping it with a backslash: you may embed a single quote character in a string enclosed in double quote characters, and vice-versa:

>>> 'knight"s', "knight's"
('knight"s', "knight's")

Incidentally, Python automatically concatenates adjacent string literals, although it is almost as simple to add a + operator between them, to invoke concatenation explicitly.

>>> title = "Meaning " 'of' " Life"
>>> title
'Meaning of Life'

Notice in all of these outputs that Python prefers to print strings in single quotes, unless they embed one. You can also embed quotes by escaping them with backslashes:

>>> 'knight\'s', "knight\"s"
("knight's", 'knight"s')

But to understand why, we need to explain how escapes work in general.

The last example embedded a quote inside a string by preceding it with a backslash. This is representative of a general pattern in strings: backslashes are used to introduce special byte codings, known as escape sequences.

Escape sequences let us embed byte codes in strings that cannot be easily typed on a keyboard. The character

Additional content appearing in this section has been removed.

Strings in Action


Once you've written a string, you will almost certainly want to do things with it. This section and the next two demonstrate string basics, formatting, and methods.

Let's begin by interacting with the Python interpreter to illustrate the basic string operations listed in Table 5-1. Strings can be concatenated using the + operator, and repeated using the * operator:

% python
>>> len('abc')         # Length: number of items
3
>>> 'abc' + 'def'      # Concatenation: a new string
'abcdef'
>>> 'Ni!' * 4          # Repetition: like "Ni!" + "Ni!" + ...
'Ni!Ni!Ni!Ni!'

Formally, adding two string objects creates a new string object, with the contents of its operands joined; repetition is like adding a string to itself a number of times. In both cases, Python lets you create arbitrarily sized strings; there's no need to predeclare anything in Python, including the sizes of data structures. The len built-in function returns the length of strings (and other objects with a length).

Repetition may seem a bit obscure at first, but it comes in handy in a surprising number of contexts. For example, to print a line of 80 dashes, you can either count up to 80 or let Python count for you:

>>> print '------- ...more... ---'      # 80 dashes, the hard way
>>> print '-'*80                        # 80 dashes, the easy way

Notice that operator overloading is at work here already: we're using the same + and * operators that are called addition and multiplication when using numbers. Python does the correct operation, because it knows the types of objects being added and multiplied. But be careful: this isn't quite as liberal as you might expect. For instance, Python doesn't allow you to mix numbers and strings in + expressions: 'abc'+9 raises an error, instead of automatically converting 9 to a string.
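To make the restriction concrete, here is a small sketch of what happens when a string and a number meet in a + expression, and two ways to resolve it by converting explicitly. The variable names are illustrative only.

```python
# Mixing a string and a number in + raises TypeError instead of converting.
try:
    result = 'abc' + 9
except TypeError:
    result = None                  # the addition never produced a value

# Convert explicitly to say which operation you mean.
as_text = 'abc' + str(9)           # concatenation: number converted to a string
as_number = int('42') + 9          # addition: string converted to a number

print(result)
print(as_text)
print(as_number)
```

The explicit conversion tells Python whether concatenation or numeric addition is intended, which is exactly the ambiguity the error protects against.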

As shown in the last line in Table 5-1, you can also iterate over strings in loops using for statements and test membership with the in expression operator, which is essentially a search:

>>> myjob = "hacker"

Additional content appearing in this section has been removed.

String Formatting


Python overloads the % binary operator to work on strings (the % operator also means remainder-of-division modulus for numbers). When applied to strings, it serves the same role as C's sprintf function; the % provides a simple way to format values as strings, according to a format definition string. In short, this operator provides a compact way to code multiple string substitutions.

To format strings:

Provide a format string on the left of the % operator with embedded conversion targets that start with a % (e.g., "%d").
Provide an object (or objects in parentheses) on the right of the % operator that you want Python to insert into the format string on the left at its conversion targets.

For instance, in the last example of the prior section, the integer 1 replaces the %d in the format string on the left, and the string 'dead' replaces the %s. The result is a new string that reflects these two substitutions.

Technically speaking, the string formatting expression is usually optional—you can generally do similar work with multiple concatenations and conversions. However, formatting allows us to combine many steps into a single operation. It's powerful enough to warrant a few more examples:

>>> exclamation = "Ni"
>>> "The knights who say %s!" % exclamation
'The knights who say Ni!'
>>> "%d %s %d you" % (1, 'spam', 4)
'1 spam 4 you'
>>> "%s -- %s -- %s" % (42, 3.14159, [1, 2, 3])
'42 -- 3.14159 -- [1, 2, 3]'

The first example here plugs the string "Ni" into the target on the left, replacing the %s marker. In the second, three values are inserted into the target string. When there is more than one value being inserted, you need to group the values on the right in parentheses (which really means they are put in a tuple). Keep in mind that formatting always makes a new string, rather than changing the string on the left; since strings are immutable, it must.
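The claim that formatting is optional but convenient can be checked with a short comparison. This sketch builds the same result once with % and once with manual conversions and concatenation; the names template, formatted, and by_hand are made up for the example.

```python
template = "%d %s %d you"

# One formatting expression: the values are grouped in a tuple on the right.
formatted = template % (1, 'spam', 4)

# The same result built by hand with conversions and concatenation.
by_hand = str(1) + ' ' + 'spam' + ' ' + str(4) + ' you'

print(formatted)
print(by_hand)
print(template)                    # the format string itself is unchanged
```

Both routes yield equal strings, and template is left as it was, since formatting makes a new string rather than changing the one on the left.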

Notice that the third example inserts three values again—an integer, floating-point, and list object—but all of the targets on the left are %s, which stands for conversion to string. Since every type of object can be converted to a string (the one used when printing), every object works with the

Additional content appearing in this section has been removed.

String Methods


In addition to expression operators, strings provide a set of methods that implement more sophisticated text processing tasks. Methods are simply functions that are associated with a particular object. Technically, they are attributes attached to objects, which happen to reference a callable function. In Python, methods are specific to object types; string methods, for example, only work on string objects.

Functions are packages of code, and method calls combine two operations at once—an attribute fetch, and a call:

Attribute fetches: An expression of the form object.attribute means "fetch the value of attribute in object."
Call expressions: An expression of the form function(arguments) means "invoke the code of function, passing zero or more comma-separated argument objects to it, and returning the function's result value."

Putting these two together allows us to call a method of an object. The method call expression object.method(arguments) is evaluated from left to right—Python will first fetch the method of the object, and then call it, passing in the arguments. If the method computes a result, it will come back as the result of the entire method call expression.

As you'll see throughout Part II, most objects have callable methods, and all are accessed using this same method call syntax. To call an object method, you have to go through an existing object; let's move on to some examples to see how.
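One way to see the two operations behind a method call is to perform them separately. The sketch below uses the upper string method; the variable names are illustrative.

```python
text = 'spam'

# Step 1, attribute fetch: this yields a method object and runs no code yet.
fetched = text.upper

# Step 2, call expression: invoking the fetched method produces the result.
two_step = fetched()

# The usual spelling performs both steps, left to right, in one expression.
one_step = text.upper()

print(callable(fetched))
print(two_step)
print(one_step)
```

The fetched method remembers the object it came from, which is why calling it later needs no extra argument.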

Table 5-4 summarizes the call patterns for built-in string methods. They implement higher-level operations, like splitting and joining, case conversions and tests, and substring searches. Let's work through some code that demonstrates some of the most commonly used methods in action, and presents Python text-processing basics along the way.

Table 5-4: String method calls

Additional content appearing in this section has been removed.

General Type Categories


Now that we've seen the first collection object, the string, let's pause to define a few general type concepts that will apply to most of the types from here on. In regard to built-in types, it turns out that operations work the same for all types in a category, so we only need to define most ideas once. We've only seen numbers and strings so far; but because they are representative of two of the three major type categories in Python, you already know more about other types than you think.

Strings are immutable sequences: they cannot be changed in place (the immutable part), and are positionally-ordered collections that are accessed by offsets (the sequence part). Now, it so happens that all the sequences seen in this part of the book respond to the same sequence operations shown at work on strings—concatenation, indexing, iteration, and so on. More formally, there are three type (and operation) categories in Python:

Numbers: Support addition, multiplication, etc.
Sequences: Support indexing, slicing, concatenation, etc.
Mappings: Support indexing by key, etc.

We haven't seen mappings yet (dictionaries are discussed in the next chapter), but other types are going to be mostly more of the same. For example, for any sequence objects X and Y:

X + Y makes a new sequence object with the contents of both operands.
X * N makes a new sequence object with N copies of the sequence operand X.

In other words, these operations work the same on any kind of sequence—strings, lists, tuples, and some user-defined object types. The only difference is that you get back a new result object that is the same type as the operands X and Y—if you concatenate lists, you get back a new list, not a string. Indexing, slicing, and other sequence operations work the same on all sequences too; the type of the objects being processed tells Python which task to perform.
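A short sketch can show the same two operations applied to three sequence types. The helper join_and_repeat is hypothetical, written only for this illustration.

```python
def join_and_repeat(x, y, n):
    """Hypothetical helper: concatenate two sequences, then repeat n times."""
    return (x + y) * n

from_strings = join_and_repeat('ab', 'cd', 2)      # operands are strings
from_lists = join_and_repeat([1], [2], 2)          # operands are lists
from_tuples = join_and_repeat((1,), (2,), 2)       # operands are tuples

print(type(from_strings), from_strings)
print(type(from_lists), from_lists)
print(type(from_tuples), from_tuples)
```

The function body never mentions a type; the objects passed in decide what + and * do and what type comes back.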

The immutable classification is an important constraint to know, yet it tends to trip up new users. If an object type is immutable, you cannot change its value in-place; Python raises an error if you try. Instead, run code to make a new object for a new value. Generally, immutable types give some degree of integrity, by guaranteeing that an object won't be changed by another part of a program. You'll see why this matters when shared object references are discussed in Chapter 7.
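As a minimal sketch of the immutability rule, the following tries to change a string in place, catches the resulting error, and then builds a new string instead. The variable names are illustrative.

```python
s = 'spam'

# An in-place change to an immutable object is rejected with TypeError.
try:
    s[0] = 'z'
except TypeError:
    changed_in_place = False
else:
    changed_in_place = True

# Instead, build a new object from pieces of the old one.
new_s = 'z' + s[1:]

print(changed_in_place)
print(s)                           # the original string is untouched
print(new_s)
```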

Additional content appearing in this section has been removed.

Chapter 6: Lists and Dictionaries


This chapter presents the list and dictionary object types—collections of other objects, which are the main workhorses in almost all Python scripts. As we'll see, both of these types are remarkably flexible: they can be changed, can grow and shrink on demand, and may contain and be nested in any other kind of object. By leveraging these types, we can build up and process arbitrarily rich information structures in our scripts.

The next stop on the built-in object tour is the Python list. Lists are Python's most flexible ordered collection object type. Unlike strings, lists can contain any sort of object: numbers, strings, even other lists. Python lists do the work of most of the collection data structures you might have to implement manually in lower-level languages such as C. In terms of some of their main properties, Python lists are:

Ordered collections of arbitrary objects: From a functional view, lists are just a place to collect other objects, so you can treat them as a group. Lists also define a left-to-right positional ordering of the items in the list.
Accessed by offset: Just as with strings, you can fetch a component object out of a list by indexing the list on the object's offset. Since items in lists are ordered by their positions, you can also do such tasks as slicing and concatenation.
Variable length, heterogeneous, arbitrarily nestable: Unlike strings, lists can grow and shrink in place (they can have variable length), and may contain any sort of object, not just one-character strings (they're heterogeneous). Because lists can contain other complex objects, lists also support arbitrary nesting; you can create lists of lists of lists.
Of the category mutable sequence: In terms of our type category qualifiers, lists can be both changed in place (they're mutable) and respond to all the sequence operations used with strings like indexing, slicing, and concatenation. In fact, sequence operations work the same on lists. Because lists are mutable, they also support other operations strings don't, such as deletion and index assignment.
Arrays of object references

Additional content appearing in this section has been removed.

Lists


Arrays of object references: Technically, Python lists contain zero or more references to other objects. Lists might remind you of arrays of pointers (addresses). Fetching an item from a Python list is about as fast as indexing a C array; in fact, lists really are C arrays inside the Python interpreter. Python always follows a reference to an object whenever the reference is used, so your program only deals with objects. Whenever you insert an object into a data structure or variable name, Python always stores a reference to the object, not a copy of it (unless you request a copy explicitly).
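The difference between storing a reference and storing a copy can be observed directly. In this sketch, the names inner, outer, and copied are made up, and the slice inner[:] is one common way to request a copy explicitly.

```python
inner = [1, 2]

outer = ['a', inner]               # stores a reference to inner, not a copy
copied = ['a', inner[:]]           # the slice requests an explicit copy

inner.append(3)                    # change the shared list in place

print(outer[1] is inner)           # same object, so the change is visible here
print(outer[1])
print(copied[1])                   # the copy was made before the change
```

Because outer holds a reference, the in-place append shows up through it, while copied keeps the contents it had when the copy was made.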

Additional content appearing in this section has been removed.

Lists in Action


Perhaps the best way to understand lists is to see them at work. Let's once again turn to some simple interpreter interactions to illustrate the operations in Table 6-1.

Lists respond to the + and * operators much like strings; they mean concatenation and repetition here too, except that the result is a new list, not a string. In fact, lists respond to all of the general sequence operations used for strings.

% python
>>> len([1, 2, 3])                    # Length
3
>>> [1, 2, 3] + [4, 5, 6]             # Concatenation
[1, 2, 3, 4, 5, 6]
>>> ['Ni!'] * 4                       # Repetition
['Ni!', 'Ni!', 'Ni!', 'Ni!']
>>> 3 in [1, 2, 3]                    # Membership (1 means true)
1
>>> for x in [1, 2, 3]: print x,      # Iteration
...
1 2 3

We talk more about for iteration and the range built-ins in Chapter 10, because they are related to statement syntax; in short, for loops step through items in a sequence. The last entry in Table 6-1, list comprehensions, is covered in Chapter 14; they are a way to build lists by applying expressions to sequences, in a single step.

Although + works the same for lists and strings, it's important to know that it expects the same sort of sequence on both sides—otherwise you get a type error when the code runs. For instance, you cannot concatenate a list and a string, unless you first convert the list to a string using backquotes, str, or % formatting. You could also convert the string to a list; the list built-in function does the trick:

>>> `[1, 2]` + "34"         # Same as "[1, 2]" + "34"
'[1, 2]34'
>>> [1, 2] + list("34")     # Same as [1, 2] + ["3", "4"]
[1, 2, '3', '4']
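
The type error mentioned above can be seen by attempting the mixed concatenation and catching the failure. This sketch uses str for the list-to-string conversion rather than backquotes, on the assumption that it may be run on a newer interpreter where the backquote form is not available; the variable names are illustrative.

```python
# A list and a string cannot be concatenated directly.
try:
    mixed = [1, 2] + "34"
except TypeError:
    mixed = None

as_string = str([1, 2]) + "34"     # convert the list to its string form first
as_list = [1, 2] + list("34")      # or convert the string to a list of characters

print(mixed)
print(as_string)
print(as_list)
```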

Because lists are sequences, indexing and slicing work the same way, but the result of indexing a list is whatever type of object lives at the offset you specify, and slicing a list always returns a new list:

>>> L = ['spam', 'Spam', 'SPAM!']
>>> L[2]                               # Offsets start at zero.

Additional content appearing in this section has been removed.

Dictionaries


Besides lists, dictionaries are perhaps the most flexible built-in data type in Python. If you think of lists as ordered collections of objects, dictionaries are unordered collections; their chief distinction is that items are stored and fetched in dictionaries by key, instead of positional offset.

Being a built-in type, dictionaries can replace many of the searching algorithms and data structures you might have to implement manually in lower-level languages—indexing a dictionary is a very fast search operation. Dictionaries also sometimes do the work of records and symbol tables used in other languages, can represent sparse (mostly empty) data structures, and much more. In terms of their main properties, dictionaries are:

Accessed by key, not offset: Dictionaries are sometimes called associative arrays or hashes. They associate a set of values with keys, so that you can fetch an item out of a dictionary using the key that stores it. You use the same indexing operation to get components in a dictionary, but the index takes the form of a key, not a relative offset.
Unordered collections of arbitrary objects: Unlike lists, items stored in a dictionary aren't kept in any particular order; in fact, Python randomizes their order to provide quick lookup. Keys provide the symbolic (not physical) location of items in a dictionary.
Variable length, heterogeneous, arbitrarily nestable: Like lists, dictionaries can grow and shrink in place (without making a copy), they can contain objects of any type, and support nesting to any depth (they can contain lists, other dictionaries, and so on).
Of the category mutable mapping: Dictionaries can be changed in place by assigning to indexes, but don't support the sequence operations that work on strings and lists. Because dictionaries are unordered collections, operations that depend on a fixed order (e.g., concatenation, slicing) don't make sense. Instead, dictionaries are the only built-in representative of the mapping type category—objects that map keys to values.
Tables of object references (hash tables): If lists are arrays of object references, dictionaries are unordered tables of object references. Internally, dictionaries are implemented as hash tables (data structures that support very fast retrieval), which start small and grow on demand. Moreover, Python employs optimized hashing algorithms to find keys, so retrieval is very fast. Dictionaries store object references (not copies), just like lists.
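Several of the properties above can be exercised in a few lines. This is only a sketch with made-up keys and names; it shows access by key, growth and shrinkage in place, nesting, and the fact that an integer index is treated as a key rather than a position.

```python
table = {'spam': 2, 'ham': 1}

table['eggs'] = 3                         # grows in place by key assignment
del table['ham']                          # shrinks in place
table['nested'] = {'toast': [1, 2]}       # values may be other collections

value = table['nested']['toast'][1]       # key, key, then list offset

# An integer index is looked up as a key, not as a position.
try:
    table[0]
except KeyError:
    positional = False
else:
    positional = True

print(len(table))
print(value)
print(positional)
```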

Additional content appearing in this section has been removed.

Dictionaries in Action


As Table 6-2 suggests, dictionaries are indexed by key, and nested dictionary entries are referenced by a series of indexes (keys in square brackets). When Python creates a dictionary, it stores its items in any order it chooses; to fetch a value back, supply the key that it is associated with. Let's go back to the interpreter to get a feel for some of the dictionary operations in Table 6-2.

In normal operation, you create dictionaries and store and access items by key:

% python
>>> d2 = {'spam': 2, 'ham': 1, 'eggs': 3}    # Make a dictionary.
>>> d2['spam']                               # Fetch value by key.
2
>>> d2                                       # Order is scrambled.
{'eggs': 3, 'ham': 1, 'spam': 2}

Here, the dictionary is assigned to variable d2; the value of the key 'spam' is the integer 2. We use the same square bracket syntax to index dictionaries by key as we did to index lists by offsets, but here it means access by key, not position.

Notice the end of this example: the order of keys in a dictionary will almost always be different than what you originally typed. This is on purpose—to implement fast key lookup (a.k.a. hashing), keys need to be randomized in memory. That's why operations that assume a left-to-right order do not apply to dictionaries (e.g., slicing, concatenation); you can only fetch values by key, not position.

The built-in len function works on dictionaries too; it returns the number of items stored away in the dictionary, or equivalently, the length of its keys list. The dictionary has_key method allows you to test for key existence, and the keys method returns all the keys in the dictionary, collected in a list. The latter of these can be useful for processing dictionaries sequentially, but you shouldn't depend on the order of the keys list. Because the keys result is a normal list, however, it can always be sorted if order matters:

>>> len(d2)                    # Number of entries in dictionary
3
>>> d2.has_key('ham')          # Key membership test (1 means true)
1
>>> 'ham' in d2                # Key membership test alternative
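
The operations described in the paragraph above can also be sketched as a script. The example below sticks to len, the in test, and a sorted keys list; it leaves out has_key and wraps keys() in list() on the assumption that it may be run on a newer interpreter, where has_key is gone and keys() no longer returns a plain list.

```python
d2 = {'spam': 2, 'ham': 1, 'eggs': 3}

count = len(d2)                    # number of entries stored
has_ham = 'ham' in d2              # key membership test

keys = list(d2.keys())             # collect the keys in a list
keys.sort()                        # impose an order when one is needed

for key in keys:                   # process entries in sorted key order
    print(key, d2[key])
```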

Additional content appearing in this section has been removed.

Chapter 7: Tuples, Files, and Everything Else


This chapter rounds out our look at the core object types in Python, by introducing the tuple (a collection of other objects that cannot be changed), and the file (an interface to external files on your computer). As you'll see, the tuple is a relatively simple object that largely performs operations you've already learned about for strings and lists. The file object is a commonly-used and full-featured tool for processing files; further file examples appear in later chapters of this book.

This chapter also concludes this part of the book by looking at properties common to all the core datatypes we've met—the notions of equality, comparisons, object copies, and so on. We'll also briefly explore other object types in the Python toolbox; as we'll see, although we've met all the primary built-in types, the object story in Python is broader than we've implied thus far. Finally, we'll close this part with a set of common datatype pitfalls, and exercises that will allow you to experiment with the ideas you've learned.

The last collection type in our survey is the Python tuple. Tuples construct simple groups of objects. They work exactly like lists, except that tuples can't be changed in-place (they're immutable) and are usually written as a series of items in parentheses, not square brackets. Although they don't support any method calls, tuples share most of their properties with lists. Tuples are:

Ordered collections of arbitrary objects: Like strings and lists, tuples are a positionally-ordered collection of objects; like lists, they can embed any kind of object.
Accessed by offset: Like strings and lists, items in a tuple are accessed by offset (not key); they support all the offset-based access operations, such as indexing and slicing.
Of the category immutable sequence: Like strings, tuples are immutable; they don't support any of the in-place change operations applied to lists. Like strings and lists, tuples are sequences; they support many of the same operations.
Fixed length, heterogeneous, arbitrarily nestable: Because tuples are immutable, they cannot grow or shrink without making a new tuple; on the other hand, tuples can hold other compound objects (e.g., lists, dictionaries, other tuples) and so support arbitrary nesting.

Additional content appearing in this section has been removed.

Tuples


Arrays of object references: Like lists, tuples are best thought of as object reference arrays; tuples store access points to other objects (references), and indexing a tuple is relatively quick.

Table 7-1 highlights common tuple operations. Tuples are written as a series of objects (really, expressions that generate objects), separated by commas, and enclosed in parentheses. An empty tuple is just a parentheses pair with nothing inside.
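A brief sketch of these tuple basics follows. The names and values are made up for illustration; the last step shows that assigning to a tuple offset is rejected.

```python
empty = ()                         # a parentheses pair with nothing inside
t = (1, 'two', [3, 4])             # items separated by commas, in parentheses

first = t[0]                       # indexing by offset
tail = t[1:]                       # slicing returns a new tuple
longer = t + (5,)                  # concatenation also makes a new tuple

# Tuples do not support in-place change.
try:
    t[0] = 99
except TypeError:
    assignable = False
else:
    assignable = True

print(len(empty), first, tail, longer)
print(assignable)
```

The trailing comma in (5,) is what makes a one-item tuple; without it, the parentheses would simply enclose the integer 5.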

Table 7-1: Common tuple literals and operations
Operation	Interpretation
`( )`	An empty tuple

Additional content appearing in this section has been removed.

Files


Most readers are probably familiar with the notion of files—named storage compartments on your computer that are managed by your operating system. This last built-in object type provides a way to access those files inside Python programs. The built-in open function creates a Python file object, which serves as a link to a file residing on your machine. After calling open, you can read and write the associated external file, by calling file object methods. The built-in name file is a synonym for open, and files may be opened by calling either name.

Compared to the types you've seen so far, file objects are somewhat unusual. They're not numbers, sequences, or mappings; instead, they export methods only for common file processing tasks.

Table 7-2 summarizes common file operations. To open a file, a program calls the open function, with the external name first, followed by a processing mode ('r' to open for input—the default; 'w' to create and open for output; 'a' to open for appending to the end; and others we'll omit here). Both arguments must be Python strings. The external file name argument may include a platform-specific and absolute or relative directory path prefix; without a path, the file is assumed to exist in the current working directory (i.e., where the script runs).

Table 7-2: Common file operations
Operation	Interpretation
`output = open('/tmp/spam', 'w')`	Create output file ('`w`' means write).
`input = open('data', 'r')`	Create input file ('`r`' means read).
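
The open modes described above can be tried out with a short script. This is a sketch under stated assumptions: the file name spam.txt and the temporary directory are hypothetical, chosen so that nothing existing is overwritten, and the name infile is used instead of input to avoid hiding the built-in of that name.

```python
import os
import tempfile

# Hypothetical location: a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'spam.txt')

output = open(path, 'w')           # 'w' creates the file and opens it for output
output.write('hello file\n')
output.close()

infile = open(path, 'r')           # 'r' opens an existing file for input
text = infile.read()
infile.close()

appender = open(path, 'a')         # 'a' adds to the end of the file
appender.write('more\n')
appender.close()

final = open(path).read()          # the mode defaults to 'r' when omitted
print(final)
```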

Type Categories Revisited

Now that we've seen all of Python's core built-in types, let's take a look at some of the properties they share.

Table 7-3 classifies all the types we've seen, according to the type categories we introduced earlier. Objects share operations according to their category—for instance, strings, lists, and tuples all share sequence operations. Only mutable objects may be changed in-place. You can change lists and dictionaries in-place, but not numbers, strings, or tuples. Files only export methods, so mutability doesn't really apply (they may be changed when written, but this isn't the same as Python type constraints).

Table 7-3: Object classifications
Object type	Category	Mutable?
Numbers	Numeric	No
Strings	Sequence	No
Lists	Sequence	Yes
Dictionaries	Mapping	Yes
Tuples	Sequence	No
Files	Extension	n/a
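The "Mutable?" column can be checked with a small experiment. The sketch below assumes Python 3; the helper name can_change_in_place is hypothetical. It tries an in-place assignment to offset (or key) 0 of one object from each category and records whether Python allows it.

```python
def can_change_in_place(obj):
    """Return True if assigning to offset (or key) 0 of obj succeeds."""
    try:
        obj[0] = 'changed'
        return True
    except TypeError:           # Immutable objects reject item assignment
        return False

results = {
    'list': can_change_in_place([1, 2, 3]),
    'dict': can_change_in_place({0: 'a'}),
    'str': can_change_in_place('abc'),
    'tuple': can_change_in_place((1, 2, 3)),
}
print(results)
```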

Object Generality

We've seen a number of compound object types (collections with components). In general:

Lists, dictionaries, and tuples can hold any kind of object.
Lists, dictionaries, and tuples can be arbitrarily nested.
Lists and dictionaries can dynamically grow and shrink.

Because they support arbitrary structures, Python's compound object types are good at representing complex information in a program. For example, values in dictionaries may be lists, which may contain tuples, which may contain dictionaries, and so on—as deeply nested as needed to model the data to be processed.

Here's an example of nesting. The following interaction defines a tree of nested compound sequence objects, shown in Figure 7-1. To access its components, you may include as many index operations as required. Python evaluates the indexes from left to right, and fetches a reference to a more deeply nested object at each step. Figure 7-1 may be a pathologically complicated data structure, but it illustrates the syntax used to access nested objects in general:

>>> L = ['abc', [(1, 2), ([3], 4)], 5]
>>> L[1]
[(1, 2), ([3], 4)]
>>> L[1][1]
([3], 4)
>>> L[1][1][0]
[3]
>>> L[1][1][0][0]
3

Figure 7-1: A nested object tree
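The same left-to-right indexing applies when dictionaries are mixed into the nesting. The following is a minimal sketch, assuming Python 3; the record structure and its contents are made up for illustration.

```python
# A dictionary whose values are a tuple and a list; the list holds a dictionary.
record = {
    'name': ('Bob', 'Smith'),
    'jobs': ['dev', {'title': 'mgr', 'years': 3}],
}

first = record['name'][0]            # Key, then tuple offset
years = record['jobs'][1]['years']   # Key, list offset, then nested key
record['jobs'].append('writer')      # Nested lists can still grow in-place

print(first, years, len(record['jobs']))
```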

References Versus Copies

Section 4.6 in Chapter 4 mentioned that assignments always store references to objects, not copies. In practice, this is usually what you want. But because assignments can generate multiple references to the same object, you sometimes need to be aware that changing a mutable object in-place may affect other references to the same object elsewhere in your program. If you don't want such behavior, you'll need to tell Python to copy the object explicitly.

For instance, the following example creates a list assigned to X, and another assigned to L that embeds a reference back to list X. It also creates a dictionary D that contains another reference back to list X:

>>> X = [1, 2, 3]
>>> L = ['a', X, 'b']           # Embed references to X's object.
>>> D = {'x':X, 'y':2}

At this point, there are three references to the first list created: from name X, from inside the list assigned to L, and from inside the dictionary assigned to D. The situation is illustrated in Figure 7-2.

Figure 7-2: Shared object references

Since lists are mutable, changing the shared list object from any of the three references changes what the other two reference:

>>> X[1] = 'surprise'         # Changes all three references!
>>> L
['a', [1, 'surprise', 3], 'b']
>>> D
{'x': [1, 'surprise', 3], 'y': 2}

References are a higher-level analog of pointers in other languages. Although you can't grab hold of the reference itself, it's possible to store the same reference in more than one place: variables, lists, and so on. This is a feature—you can pass a large object around a program without generating copies of it along the way. If you really do want copies, you can request them:

Slice expressions with empty limits copy sequences.
The dictionary copy method copies a dictionary.
Some built-in functions such as list also make copies.
The copy standard library module makes full copies.
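The four copying techniques just listed can be sketched as follows, assuming Python 3. The variable names are illustrative. Note that the first three make top-level copies only, so nested objects remain shared; copy.deepcopy is used here for the "full copy" case.

```python
import copy

L = [1, [2, 3]]
D = {'a': 1, 'b': [4, 5]}

A = L[:]                 # Empty-limits slice: top-level copy of a list
B = D.copy()             # Dictionary copy method: top-level copy
C = list(L)              # list built-in: another top-level copy
E = copy.deepcopy(L)     # copy module: copies nested objects too

L[0] = 'changed'         # Changes only L, not its copies
L[1][0] = 'shared'       # Nested list is still shared by A and C, not by E

print(A, C, E)
```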

For example, if you have a list and a dictionary, and don't want their values to be changed through other variables:

Comparisons, Equality, and Truth

All Python objects also respond to comparisons: tests for equality, relative magnitude, and so on. Python comparisons always inspect all parts of compound objects, until a result can be determined. In fact, when nested objects are present, Python automatically traverses data structures to apply comparisons recursively, from left to right and as deep as needed.

For instance, a comparison of list objects compares all their components automatically:

>>> L1 = [1, ('a', 3)]         # Same value, unique objects
>>> L2 = [1, ('a', 3)]
>>> L1 == L2, L1 is L2         # Equivalent? Same object?
(1, 0)

Here, L1 and L2 are assigned lists that are equivalent, but distinct objects. Because of the nature of Python references (studied in Chapter 4), there are two ways to test for equality:

The == operator tests value equivalence. Python performs an equivalence test, comparing all nested objects recursively.
The is operator tests object identity. Python tests whether the two are really the same object (i.e., live at the same address in memory).

In the example, L1 and L2 pass the == test (they have equivalent values because all their components are equivalent), but fail the is check (they are two different objects, and hence two different pieces of memory). Notice what happens for short strings:

>>> S1 = 'spam'
>>> S2 = 'spam'
>>> S1 == S2, S1 is S2
(1, 1)

Here, we should have two distinct objects that happen to have the same value: == should be true, and is should be false. Because Python internally caches and reuses short strings as an optimization, there really is just a single string, 'spam', in memory, shared by S1 and S2; hence, the is identity test reports a true result. To trigger the normal behavior, we need to use longer strings that fall outside the cache mechanism:

>>> S1 = 'a longer string'
>>> S2 = 'a longer string'
>>> S1 == S2, S1 is S2
(1, 0)

Because strings are immutable, the object caching mechanism is irrelevant to your code: strings can't be changed in-place, regardless of how many variables refer to them. If identity tests seem confusing, see Section 4.6 in Chapter 4 for a refresher on object reference concepts.
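For mutable objects, the difference between the two tests is dependable. The sketch below assumes Python 3, where comparisons yield True and False rather than the 1 and 0 shown in the sessions above; the name L3 is added for illustration. String identity is deliberately left out, since caching behavior depends on the interpreter.

```python
L1 = [1, ('a', 3)]
L2 = [1, ('a', 3)]       # Same value, but a separately built object
L3 = L1                  # A second reference to the same object

same_value = (L1 == L2)      # Value equivalence, applied recursively
same_object = (L1 is L2)     # Identity: two distinct lists
alias_identity = (L3 is L1)  # Identity: both names reference one list

print(same_value, same_object, alias_identity)
```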

Python's Type Hierarchies

Figure 7-3 summarizes all the built-in object types available in Python and their relationships. We've looked at the most prominent of these; most other kinds of objects in Figure 7-3 correspond either to program units (e.g., functions and modules) or to exposed interpreter internals (e.g., stack frames and compiled code).

Figure 7-3: Built-in type hierarchies

The main point to notice here is that everything is an object type in a Python system and may be processed by your Python programs. For instance, you can pass a class to a function, assign it to a variable, stuff it in a list or dictionary, and so on.

Even types are an object type in Python: a call to the built-in function type(X) returns the type object of object X. Type objects can be used for manual type comparisons in Python if statements. However, for reasons to be explained in Part IV, manual type testing is usually not the right thing to do in Python.

A note on type names: as of Python 2.2, each core type has a new built-in name added to support type subclassing: dict, list, str, tuple, int, long, float, complex, unicode, type, and file (file is a synonym for open). Calls to these names are really object constructor calls, not simply conversion functions.

The types module provides additional type names (now largely synonyms for the built-in type names), and it is possible to do type tests with the isinstance function. For example, in Version 2.2, all of the following type tests are true:

isinstance([1],list)
type([1])==list
type([1])==type([  ])
type([1])==types.ListType

Because types can be subclassed in 2.2, the isinstance technique is generally recommended. See Chapter 23 for more on subclassing built-in types in 2.2 and later.
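The reason isinstance is preferred shows up as soon as a built-in type is subclassed. This is a minimal sketch, assuming Python 3; the class name MyList is hypothetical.

```python
class MyList(list):          # Hypothetical subclass of the built-in list type
    pass

x = MyList([1])

by_isinstance = isinstance(x, list)   # Accepts instances of subclasses
by_type = (type(x) == list)           # Exact type comparison rejects them

print(by_isinstance, by_type)
```

An exact type comparison treats x as something other than a list, even though it supports every list operation.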

Other Types in Python

Besides the core objects studied in this chapter, a typical Python installation has dozens of other object types available as linked-in C extensions or Python classes. You'll see examples of a few later in the book—regular expression objects, DBM files, GUI widgets, and so on. The main difference between these extra tools and the built-in types just seen is that the built-ins provide special language creation syntax for their objects (e.g., 4 for an integer, [1,2] for a list, the open function for files). Other tools are generally exported in a built-in module that you must first import to use. See Python's library reference for a comprehensive guide to all the tools available to Python programs.

Built-in Type Gotchas

Part II concludes with a discussion of common problems that seem to bite new users (and the occasional expert), along with their solutions.

Because this is such a central concept, it is mentioned again: you need to understand what's going on with shared references in your program. For instance, in the following example, the list object assigned to name L is referenced both from L and from inside the list assigned to name M. Changing L in-place changes what M references too:

>>> L = [1, 2, 3]
>>> M = ['X', L, 'Y']       # Embed a reference to L.
>>> M
['X', [1, 2, 3], 'Y']
>>> L[1] = 0                # Changes M too
>>> M
['X', [1, 0, 3], 'Y']

This effect usually becomes important only in larger programs, and shared references are often exactly what you want. If they're not, you can avoid sharing objects by copying them explicitly; for lists, you can always make a top-level copy by using an empty-limits slice:

>>> L = [1, 2, 3]
>>> M = ['X', L[:], 'Y']       # Embed a copy of L.
>>> L[1] = 0                   # Changes only L, not M 
>>> L
[1, 0, 3]
>>> M
['X', [1, 2, 3], 'Y']

Remember, slice limits default to 0 and the length of the sequence being sliced; if both are omitted, the slice extracts every item in the sequence, and so makes a top-level copy (a new, unshared object).

Sequence repetition is like adding a sequence to itself a number of times. When mutable sequences are nested, however, the effect might not always be what you expect. For instance, in the following, X is assigned to L repeated four times, whereas Y is assigned to a list containing L repeated four times:

>>> L = [4, 5, 6]
>>> X = L * 4           # Like [4, 5, 6] + [4, 5, 6] + ...
>>> Y = [L] * 4         # [L] + [L] + ... = [L, L,...]
>>> X
[4, 5, 6, 4, 5, 6, 4, 5, 6, 4, 5, 6]
>>> Y
[[4, 5, 6], [4, 5, 6], [4, 5, 6], [4, 5, 6]]

Because L was nested in the second repetition, Y winds up embedding references back to the original list assigned to L, and is open to the same sorts of side effects noted in the last section:
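A minimal sketch of that side effect, assuming Python 3. The name Z is added here for illustration; it shows one way to avoid the sharing by embedding a copy of L instead of L itself.

```python
L = [4, 5, 6]
X = L * 4            # Items of L are copied into a new flat list
Y = [L] * 4          # Four references to the one list L
Z = [L[:]] * 4       # Four references to a copy of L, independent of L

L[1] = 0             # Change L in-place

print(X)
print(Y)
print(Z)
```

X and Z keep the original values, while every element of Y reflects the change because each one is the list L itself.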

Part II Exercises

This session asks you to get your feet wet with built-in object fundamentals. As before, a few new ideas may pop up along the way, so be sure to flip to Section B.2 when you're done (and even when you're not). If you have limited time, we suggest starting with exercise 11 (the most practical of the bunch), and then working from first to last as time allows. This is all fundamental material, though, so try to do as many of these as you can.

The basics. Experiment interactively with the common type operations found in the tables in Part II. To get started, bring up the Python interactive interpreter, type each of the expressions below, and try to explain what's happening in each case:

2 ** 16
2 / 5, 2 / 5.0
"spam" + "eggs"
S = "ham"
"eggs " + S
S * 5
S[:0]
"green %s and %s" % ("eggs", S)
('x',)[0]
('x', 'y')[1]
L = [1,2,3] + [4,5,6]
L, L[:], L[:0], L[-2], L[-2:]
([1,2,3] + [4,5,6])[2:4]
[L[2], L[3]]
L.reverse(  ); L
L.sort(  ); L
L.index(4)
{'a':1, 'b':2}['b']
D = {'x':1, 'y':2, 'z':3}
D['w'] = 0
D['x'] + D['w']
D[(1,2,3)] = 4
D.keys(  ), D.values(  ), D.has_key((1,2,3))
[[  ]], ["",[  ],(  ),{  },None]

Indexing and slicing. At the interactive prompt, define a list named L that contains four strings or numbers (e.g., L=[0,1,2,3]). Then, experiment with some boundary cases.
1. What happens when you try to index out of bounds (e.g., L[4])?
2. What about slicing out of bounds (e.g., L[-1000:100])?
3. Finally, how does Python handle it if you try to extract a sequence in reverse—with the lower bound greater than the higher bound (e.g., L[3:1])? Hint: try assigning to this slice (L[3:1]=['?']) and see where the value is put. Do you think this may be the same phenomenon you saw when slicing out of bounds?
Indexing, slicing, and del. Define another list L with four items again, and assign an empty list to one of its offsets (e.g., L[2]=[ ]). What happens? Then assign an empty list to a slice (L[2:3]=[ ]). What happens now? Recall that slice assignment deletes the slice and inserts the new value where it used to be. The del statement deletes offsets, keys, attributes, and names. Use it on your list to delete an item (e.g.,

Chapter 8: Assignment, Expressions, and Print

Now that we've seen Python's core built-in object types, this chapter explores its fundamental statement forms. In simple terms, statements are the things you write to tell Python what your programs should do. If programs do things with stuff, statements are the way you specify what sort of things a program does. Python is a procedural, statement-based language; by combining statements, you specify a procedure that Python performs to satisfy a program's goals.

Another way to understand the role of statements is to revisit the concept hierarchy introduced in Chapter 4, which talked about built-in objects and the expressions used to manipulate them. This chapter climbs the hierarchy to the next level:

Programs are composed of modules.
Modules contain statements.
Statements contain expressions.
Expressions create and process objects.

At its core, Python syntax is composed of statements and expressions. Expressions process objects, and are embedded in statements. Statements code the larger logic of a program's operation—they use and direct expressions to process the objects we've already seen. Moreover, statements are where objects spring into existence (e.g., in expressions within assignment statements), and some statements create entirely new kinds of objects (functions, classes, and so on). Statements always exist in modules, which themselves are managed with statements.

Table 8-1 summarizes Python's statement set. Part III deals with entries in the table through break and continue. You've informally been introduced to a few of the statements in Table 8-1. Part III will fill in details that were skipped earlier, introduce the rest of Python's procedural statement set, and cover the overall syntax model.

Table 8-1: Python statements
Statement	Role	Example
Assignment	Creating references

Assignment Statements

We've been using the Python assignment statement already, to assign objects to names. In its basic form, you write a target of an assignment on the left of an equals sign and an object to be assigned on the right. The target on the left may be a name or object component, and the object on the right can be an arbitrary expression that computes an object. For the most part, assignment is straightforward to use, but here are a few properties to keep in mind:

Python assignment stores references to objects in names or data structure slots. It always creates references to objects, instead of copying objects. Because of that, Python variables are much more like pointers than data storage areas.
Names are created when first assigned. Python creates variable names the first time you assign them a value (an object reference). There's no need to predeclare names ahead of time. Some (but not all) data structure slots are created when assigned too (e.g., dictionary entries, some object attributes). Once assigned, a name is replaced by the value it references whenever it appears in an expression.
Names must be assigned before being referenced. Conversely, it's an error to use a name you haven't assigned a value to yet. Python raises an exception if you try, rather than returning some sort of ambiguous (and hard to notice) default value.
Implicit assignments: import, from, def, class, for, function arguments. In this section, we're concerned with the = statement, but assignment occurs in many contexts in Python. For instance, we'll see later that module imports, function and class definitions, for loop variables, and function arguments, are all implicit assignments. Since assignment works the same everywhere it pops up, all these contexts simply bind names to object references at runtime.
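The first three properties above can be observed directly. The following is a rough sketch, assuming Python 3; all names in it, including undefined_total, are made up for the illustration.

```python
# Names must be assigned before being referenced.
try:
    undefined_total              # Hypothetical name that was never assigned
    name_error_raised = False
except NameError:
    name_error_raised = True

# Assignment stores references, not copies.
a = [1, 2, 3]
b = a                            # b references the same list object as a
b.append(4)                      # The change is visible through both names

# Some data structure slots are created when assigned.
d = {}
d['key'] = 'value'               # Dictionary entry springs into existence

print(name_error_raised, a, d)
```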

Table 8-2 illustrates the different assignment statements in Python. In addition to this table, Python includes a set of assignment statement forms known as augmented assignment

Expression Statements

In Python, you can use expressions as statements too. But since the result of the expression won't be saved, it makes sense to do so only if the expression does something useful as a side effect. Expressions are commonly used as statements in two situations:

For calls to functions and methods: Some functions and methods do lots of work without returning a value. Since you're not interested in retaining the value they return, you can call such functions with an expression statement. Such functions are sometimes called procedures in other languages; in Python, they take the form of functions that don't return a value.
For printing values at the interactive prompt: Python echoes back the results of expressions typed at the interactive command line. Technically, these are expression statements too; they serve as a shorthand for typing print statements.

Table 8-5 lists some common expression statement forms in Python. Calls to functions and methods are coded with zero or more argument objects (really, expressions that evaluate to objects) in parentheses, after the function or method.

Table 8-5: Common Python expression statements
Operation	Interpretation
`spam(eggs, ham)`	Function calls
`spam.ham(eggs)`	Method calls
`spam`	Interactive print
`spam < ham and ham != eggs`	Compound expressions
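A method call used as an expression statement can be sketched as follows, assuming Python 3. The list append method changes its list in-place and returns None, so it is run for its side effect only; the name result is illustrative.

```python
L = [1, 2]
L.append(3)              # Expression statement: run for its side effect

result = L.append(4)     # The call's value is None; the list still changes

print(L, result)
```

Assigning the call's result back to L would therefore replace the list with None, which is why such calls are normally written as bare expression statements.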

Print Statements

The print statement simply prints objects. Technically, it writes the textual representation of objects to the standard output stream. The standard output stream is the same as the C language's stdout; it is usually mapped to the window where you started your Python program (unless redirected to a file in your system's shell).

In Chapter 7, we also saw file methods that write text. The print statement is similar, but more focused: print writes objects to the stdout stream (with some default formatting), but file write methods write strings to files. Since the standard output stream is available in Python as the stdout object in the built-in sys module (i.e., sys.stdout), it's possible to emulate print with file writes, but print is easier to use.
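Emulating the basic print with file writes can be sketched as follows. This assumes Python 3, where print is a function rather than a statement; the helper name emulate_print and its stream parameter are hypothetical.

```python
import sys

def emulate_print(*objects, stream=None):
    """Write objects like the basic print: spaces between, newline at end."""
    if stream is None:
        stream = sys.stdout                 # Default to standard output
    text = ' '.join(str(obj) for obj in objects)
    stream.write(text + '\n')               # File write methods take strings

emulate_print('a', 'b')                     # Writes "a b" and a newline
```

Any object with a write method can be passed as stream, which is the same idea as redirecting print to a file.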

Table 8-6 lists the print statement's forms. We've seen the basic print statement in action already. By default, it adds a space between the items separated by commas, and adds a linefeed at the end of the current output line:

>>> x = 'a'
>>> y = 'b'
>>> print x, y
a b

Table 8-6: Print statement forms
Operation	Interpretation
`print spam, ham`	Print objects to `sys.stdout`; add a space between.
`print spam, ham,`	Same, but don't add newline at end of text.
`print >> myfile, spam, ham`	Send text to `myfile.write`, not to `sys.stdout.write`.

This formatting is just a default; you can choose to use it or not. To suppress the linefeed (so you can add more text to the current line later), end your

Chapter 9: if Tests

This chapter presents the Python if statement—the main statement used for selecting from alternative actions based on test results. Because this is our first exposure to compound statements—statements which embed other statements—we will also explore the general concepts behind the Python statement syntax model here. And because the if statement introduces the notion of tests, we'll also use this chapter to study the concepts of truth tests and Boolean expressions in general.

In simple terms, the Python if statement selects actions to perform. It's the primary selection tool in Python and represents much of the logic a Python program possesses. It's also our first compound statement; like all compound Python statements, the if may contain other statements, including other ifs. In fact, Python lets you combine statements in a program both sequentially (so that they execute one after another), and arbitrarily nested (so that they execute only under certain conditions).

The Python if statement is typical of most procedural languages. It takes the form of an if test, followed by one or more optional elif tests (meaning "else if"), and ends with an optional else block. Each test and the else have an associated block of nested statements indented under a header line. When the statement runs, Python executes the block of code associated with the first test that evaluates to true, or the else block if all tests prove false. The general form of an if looks like this:

if <test1>:               # if test 
    <statements1>         # Associated block
elif <test2>:             # Optional elifs 
    <statements2>
else:                     # Optional else
    <statements3>

Let's look at a few simple examples of the if statement at work. All parts are optional except the initial if test and its associated statements; in the simplest case, the other parts are omitted:

>>> if 1:
...     print 'true'
...
true

Notice how the prompt changes to "..." for continuation lines in the basic interface used here (in IDLE, you'll simply drop down to an indented line instead; hit Backspace to back up); a blank line terminates and runs the entire statement. Remember that 1 is Boolean true, so this statement's test always succeeds; to handle a false result, code the

if Statements

>>> if not 1:
...     print 'true'
... else:
...     print 'false'
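A multiway branch that uses all three parts of the general form might look like the sketch below. It assumes Python 3, and the function name classify is hypothetical.

```python
def classify(n):
    """Return a label for n using an if/elif/else selection."""
    if n < 0:                # First test
        return 'negative'
    elif n == 0:             # Tried only if the first test is false
        return 'zero'
    else:                    # Runs if all tests prove false
        return 'positive'

print(classify(-5), classify(0), classify(7))
```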

Python Syntax Rules

In general, Python has a simple, statement-based syntax. But there are a few properties you need to know:

Statements execute one after another, until you say otherwise. Python normally runs statements in a file or nested block from first to last, but statements like the if (and, as you'll see, loops) cause the interpreter to jump around in your code. Because Python's path through a program is called the control flow, things like the if that affect it are called control-flow statements.
Block and statement boundaries are detected automatically. There are no braces or "begin/end" delimiters around blocks of code; instead, Python uses the indentation of statements under a header to group the statements in a nested block. Similarly, Python statements are not normally terminated with a semicolon; rather, the end of a line usually marks the end of the statements coded on that line.
Compound statements = header, ":", indented statements. All compound statements in Python follow the same pattern: a header line terminated with a colon, followed by one or more nested statements usually indented under the header. The indented statements are called a block (or sometimes, a suite). In the if statement, the elif and else clauses are part of the if, but are header lines with nested blocks of their own.
Blank lines, spaces, and comments are usually ignored. Blank lines are ignored in files (but not at the interactive prompt). Spaces inside statements and expressions are almost always ignored (except in string literals and indentation). Comments are always ignored: they start with a # character (not inside a string literal) and extend to the end of the current line.
Docstrings are ignored but saved, and displayed by tools. Python supports an additional comment form called documentation strings (docstrings for short), which, unlike

Truth Tests

We introduced the notions of comparison, equality, and truth values in Chapter 7. Since if statements are the first statement that actually uses test results, we'll expand on some of these ideas here. In particular, Python's Boolean operators are a bit different from their counterparts in languages like C. In Python:

True means any nonzero number or nonempty object.
False means not true: a zero number, empty object, or None.
Comparisons and equality tests are applied recursively to data structures.
Comparisons and equality tests return 1 or 0 (true or false).
Boolean and and or operators return a true or false operand object.

In short, Boolean operators are used to combine the results of other tests. There are three Boolean expression operators in Python:

X and Y: Is true if both X and Y are true
X or Y: Is true if either X or Y is true
not X: Is true if X is false (the expression returns 1 or 0)

Here, X and Y may be any truth value or an expression that returns a truth value (e.g., an equality test, range comparison, and so on). Boolean operators are typed out as words in Python (instead of C's &&, ||, and !). Boolean and and or operators return a true or false object in Python, not an integer 1 or 0. Let's look at a few examples to see how this works:

>>> 2 < 3, 3 < 2        # Less-than: return 1 or 0 
(1, 0)

Magnitude comparisons like these return an integer 1 or 0 as their truth value result. But and and or operators always return an object instead. For or tests, Python evaluates the operand objects from left to right, and returns the first one that is true. Moreover, Python stops at the first true operand it finds; this is usually called short-circuit evaluation, since determining a result short-circuits (terminates) the rest of the expression:

>>> 2 or 3, 3 or 2      # Return left operand if true.
(2, 3)
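Short-circuit evaluation can be made visible by recording which operands are actually evaluated. This is a minimal sketch, assuming Python 3; the helper check and the list calls are hypothetical names. It also shows and returning one of its operand objects rather than 1 or 0.

```python
calls = []

def check(value):
    """Record that value was evaluated, then return it unchanged."""
    calls.append(value)
    return value

# or returns the first true operand and stops there: check(3) never runs.
first_true = check(0) or check(2) or check(3)

# and returns the first false operand, or the last operand if all are true.
all_true = 2 and 3
stops_at_false = [] and 3

print(first_true, calls, all_true, stops_at_false)
```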

Chapter 10: while and for Loops

In this chapter, we meet Python's two main looping constructs—statements that repeat an action over and over. The first of these, the while loop, provides a way to code general loops; the second, the for statement, is designed for stepping through the items in a sequence object and running a block of code for each item.

There are other kinds of looping operations in Python, but the two statements covered here are the primary syntax provided for coding repeated actions. We'll also study a few unusual statements such as break and continue here, because they are used within loops.

Python's while statement is its most general iteration construct. In simple terms, it repeatedly executes a block of indented statements, as long as a test at the top keeps evaluating to a true value. When the test becomes false, control continues after all the statements in the while block; the body never runs if the test is false to begin with.

The while statement is one of two looping statements (along with the for). It is called a loop because control keeps looping back to the start of the statement, until the test becomes false. The net effect is that the loop's body is executed repeatedly while the test at the top is true. Besides statements, Python also provides a handful of tools that implicitly loop (iterate): the map, reduce, and filter functions; the in membership test; list comprehensions; and more. We'll explore most of these in Chapter 14.

In its most complex form, the while statement consists of a header line with a test expression, a body of one or more indented statements, and an optional else part that is executed if control exits the loop without running into a break statement. Python keeps evaluating the test at the top, and executing the statements nested in the while part, until the test returns a false value:

while <test>:             # Loop test
    <statements1>         # Loop body
else:                     # Optional else
    <statements2>         # Run if didn't exit loop with break
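The role of the optional else can be sketched with a simple search loop. The example assumes Python 3; the function name find_first_negative is hypothetical. The else block runs only when the loop ends because its test became false, not when a break exits it.

```python
def find_first_negative(numbers):
    """Return the first negative item in numbers, or None if there is none."""
    i = 0
    while i < len(numbers):          # Loop test
        if numbers[i] < 0:
            found = numbers[i]
            break                    # Exit now and skip the else
        i += 1
    else:
        found = None                 # Test became false: nothing was found
    return found

print(find_first_negative([3, -1, -5]), find_first_negative([1, 2]))
```

If the test is false to begin with, as for an empty list, the body never runs and control goes straight to the else block.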

while Loops

To illustrate, here are a few simple while loops in action. The first just prints a message forever, by nesting a print statement in a while loop. Recall that an integer 1 means true; since the test is always true, Python keeps executing the body forever or until you stop its execution. This sort of behavior is usually called an infinite loop:

>>> while 1:
...    print 'Type Ctrl-C to stop me!'

The next example keeps slicing off the first character of a string, until the string is empty and hence false. It's typical to test an object directly like this, instead of using the more verbose equivalent:
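A minimal sketch of that idea, assuming a hypothetical string x and a list named seen that exists only to record each pass. The print call uses the single-argument function form so the same lines behave alike under Python 2 and Python 3:

```python
x = 'spam'
seen = []            # hypothetical helper: records the string on each pass

while x:             # an empty string is false, so the loop stops at ''
    seen.append(x)
    print(x)
    x = x[1:]        # slice off the first character
```

The more verbose test would be something like while x != '':, which is not needed because an empty string already counts as false.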

Additional content appearing in this section has been removed.

break, continue, pass, and the Loop else

Now that we've seen our first Python loop, we should introduce two simple statements that have a purpose only when nested inside loops—the break and continue statements. We will also study the loop else clause here because it is intertwined with break, and Python's empty placeholder statement, the pass. In Python:

break: Jumps out of the closest enclosing loop (past the entire loop statement)
continue: Jumps to the top of the closest enclosing loop (to the loop's header line)
pass: Does nothing at all: it's an empty statement placeholder
Loop else block: Runs if and only if the loop is exited normally—without hitting a break

Factoring in break and continue statements, the general format of the while loop looks like this:

while <test1>:
    <statements1>
    if <test2>: break         # Exit loop now, skip else.
    if <test3>: continue      # Go to top of loop now.
else:
    <statements2>             # If we didn't hit a 'break'

break and continue statements can appear anywhere inside the while (and for) loop's body, but they are usually coded further nested in an if test, to take action in response to some sort of condition.
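As a rough illustration of the general format above, here is a sketch in which continue, break, and the loop else each play their part. The function name first_even_over and its arguments are hypothetical, chosen only for this example:

```python
def first_even_over(limit, values):
    """Return the first even item greater than limit, or None if there is none."""
    i = 0
    while i < len(values):
        item = values[i]
        i += 1
        if item % 2 != 0:
            continue          # odd item: jump back to the test at the top
        if item > limit:
            found = item
            break             # exit the loop now; the else block is skipped
    else:
        found = None          # runs only if the loop ended without a break
    return found

print(first_even_over(10, [3, 8, 11, 14, 20]))
print(first_even_over(50, [3, 8, 11, 14, 20]))
```

The first call stops at 14 by way of the break; the second runs off the end of the list, so the else block supplies None.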

Let's turn to a few simple examples to see how these statements come together in practice. The pass statement is used when the syntax requires a statement, but you have nothing useful to say. It is often used to code an empty body for a compound statement. For instance, if you want to code an infinite loop that does nothing each time through, do it with a pass:

while 1: pass   # Type Ctrl-C to stop me!

Since the body is just an empty statement, Python gets stuck in this loop. pass is roughly to statements as None is to objects—an explicit nothing. Notice that the while

Additional content appearing in this section has been removed.

for Loops

The for loop is a generic sequence iterator in Python: it can step through the items in any ordered sequence object. The for works on strings, lists, tuples, and new objects we'll create later with classes.

The Python for loop begins with a header line that specifies an assignment target (or targets), along with an object you want to step through. The header is followed by a block of indented statements, which you want to repeat:

for <target> in <object>:   # Assign object items to target.
    <statements>            # Repeated loop body: use target
else:
    <statements>            # If we didn't hit a 'break'

When Python runs a for loop, it assigns items in the sequence object to the target, one by one, and executes the loop body for each. The loop body typically uses the assignment target to refer to the current item in the sequence, as though it were a cursor stepping through the sequence.

The name used as the assignment target in a for header line is usually a (possibly new) variable in the scope where the for statement is coded. There's not much special about it; it can even be changed inside the loop's body, but will be automatically set to the next item in the sequence when control returns to the top of the loop again. After the loop, this variable normally still refers to the last item visited, which is the last item in the sequence unless the loop exits with a break statement.
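The behavior of the assignment target can be checked with a short sketch. All the names here (seen, after_full, after_break) are hypothetical and exist only to record what happens:

```python
seen = []
for x in [1, 2, 3]:
    seen.append(x)       # x refers to the current item
    x = 0                # rebinding x in the body is allowed...
print(seen)              # ...but x is reset to the next item on each pass

for item in ['a', 'b', 'c']:
    pass
after_full = item        # the loop ran to completion: last item in the sequence

for item in ['a', 'b', 'c']:
    if item == 'b':
        break
after_break = item       # the loop exited early: last item visited
print(after_full)
print(after_break)
```

After a complete pass the target is left at 'c'; after the break it is left at 'b', the last item the loop actually reached.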

The for also supports an optional else block, which works exactly as it does in while loops; it's executed if the loop exits without running into a break statement (i.e., if all items in the sequence were visited). In fact, the break and continue statements introduced above work the same in the for loop as they do in the while. The for loop's complete format can be described this way:

for <target> in <object>:   # Assign object items to target.
    <statements>
    if <test>: break        # Exit loop now, skip else.
    if <test>: continue     # Go to top of loop now.
else:
    <statements>            # If we didn't hit a 'break'

Additional content appearing in this section has been removed.

Loop Variations

The for loop subsumes most counter-style loops. It's generally simpler to code and quicker to run than a while, so it's the first tool you should reach for whenever you need to step through a sequence. But there are also situations where you will need to iterate in a more specialized way. For example, what if you need to visit every second or third item in a list, or change the list along the way? How about traversing more than one sequence in parallel, in the same for loop?

You can always code such unique iterations with a while loop and manual indexing, but Python provides two built-ins that allow you to specialize the iteration in a for:

The built-in range function returns a list of successively higher integers, which can be used as indexes in a for.
The built-in zip function returns a list of parallel-item tuples, which can be used to traverse multiple sequences in a for.
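As a quick sketch of the three situations raised above (skipping items, changing a list along the way, and parallel traversal), the following uses hypothetical lists named L, nums, and pairs. Note that in Python 3, range and zip produce lazy iterables rather than lists, but a for loop steps through them the same way:

```python
L = ['a', 'b', 'c', 'd', 'e']

# Visit every second item by generating the indexes 0, 2, 4.
every_second = []
for i in range(0, len(L), 2):
    every_second.append(L[i])
print(every_second)

# Change a list along the way: indexes let us assign back into it.
nums = [1, 2, 3]
for i in range(len(nums)):
    nums[i] += 10
print(nums)

# Traverse two sequences in parallel: zip pairs up their items.
pairs = []
for (n, ch) in zip([1, 2, 3], 'abc'):
    pairs.append((n, ch))
print(pairs)
```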

Let's look at each of these built-ins in turn.

The range function is really independent of for loops; although it's used most often to generate indexes in a for, you can use it anywhere you need a list of integers:

>>> range(5), range(2, 5), range(0, 10, 2)
([0, 1, 2, 3, 4], [2, 3, 4], [0, 2, 4, 6, 8])

With one argument, range generates a list with integers from zero up to but not including the argument's value. If you pass in two arguments, the first is taken as the lower bound. An optional third argument can give a step; if used, Python adds the step to each successive integer in the result (steps default to one). Ranges can also be nonpositive, and nonascending, if you want them to be:

>>> range(-5, 5)
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4]
>>> range(5, -5, -1)
[5, 4, 3, 2, 1, 0, -1, -2, -3, -4]

Although such range results may be useful all by themselves, they tend to come in most handy within for loops. For one thing, they provide a simple way to repeat an action a specific number of times. To print three lines, for example, use a range to generate the appropriate number of integers:

>>> for i in range(3):
...     print i, 'Pythons'

Additional content appearing in this section has been removed.

Chapter 11: Documenting Python Code

This chapter concludes Part III with a look at techniques and tools used for documenting Python code. Although Python code is designed to be readable in general, a few well-placed human-readable comments can do much to help others understand the workings of your programs. To support comments, Python includes both syntax and tools to make documentation easier. Although this is something of a tools-related concept, the topic is presented here, partly because it involves Python's syntax model, and partly as a resource for readers struggling to understand Python's toolset. As usual, this chapter ends with pitfalls and exercises.

By this point in the book you're probably starting to realize that Python comes with an awful lot of prebuilt functionality: built-in functions, exceptions, predefined object attributes, standard library modules, and more. Moreover, we've really only scratched the surface of each of these categories.

One of the first questions that bewildered beginners often ask is: how do I find information on all the built-in tools? This section provides hints on the various documentation sources available in Python. It also presents documentation strings and the PyDoc system that makes use of them. These topics are somewhat peripheral to the core language itself, but become essential knowledge as soon as your code reaches the level of the examples and exercises in this chapter.

As summarized in Table 11-1, there are a variety of places to look for information in Python, with generally increasing verbosity. Since documentation is such a crucial tool in practical programming, let's look at each of these categories.

Additional content appearing in this section has been removed.

The Python Documentation Interlude

Table 11-1: Python documentation sources
Form	Role
`#` comments	In-file documentation
The dir function	Lists of attributes available on objects
Docstrings: `__doc__`	In-file documentation attached to objects
PyDoc: The help function	Interactive help for objects
PyDoc: HTML reports	Module documentation in a browser
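To give a feel for the first few rows of the table, here is a small sketch. The function square is hypothetical; `#` comments, `__doc__`, dir, and help are the tools the table names:

```python
def square(x):
    """Return x multiplied by itself."""
    # A '#' comment like this one documents the code in the file only;
    # the string above is a docstring and is saved on the function object.
    return x * x

print(square.__doc__)            # the docstring attached to the object
print('append' in dir(list))     # dir returns attribute names as strings

# At the interactive prompt, PyDoc's help function formats the same docstring:
# help(square)
```

The help call is left as a comment because its report is meant to be read interactively.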

Additional content appearing in this section has been removed.

Common Coding Gotchas

Before the programming exercises for this part of the book, here are some of the most common mistakes beginners make when coding Python statements and programs. You'll learn to avoid these once you've gained a bit of Python coding experience; but a few words might help you avoid falling into some of these traps initially.

Don't forget the colons. Don't forget to type a : at the end of compound statement headers (the first line of an if, while, for, etc.). You probably will at first anyhow (we did too), but you can take some comfort in the fact that it will soon become an unconscious habit.
Start in column 1. Be sure to start top-level (unnested) code in column 1. That includes unnested code typed into module files, as well as unnested code typed at the interactive prompt.
Blank lines matter at the interactive prompt. Blank lines in compound statements are always ignored in module files, but when you are typing code at the interactive prompt, they end the statement. In other words, a blank line tells the interactive command line that you've finished a compound statement; if you want to continue, don't hit the Enter key at the ... prompt until you're really done.
Indent consistently. Avoid mixing tabs and spaces in the indentation of a block, unless you know what your text editor system does with tabs. Otherwise, what you see in your editor may not be what Python sees when it counts tabs as a number of spaces. It's safer to use all tabs or all spaces for each block.
Don't code C in Python. A note to C/C++ programmers: you don't need to type parentheses around tests in if and while headers (e.g., if (X==1):); you can if you like (any expression can be enclosed in parentheses), but they are fully superfluous in this context. Also, do not terminate all your statements with a semicolon; it's technically legal to do this in Python as well, but is totally useless, unless you're placing more than one statement on a single line (the end of a line normally terminates a statement). And remember, don't embed assignment statements in

Additional content appearing in this section has been removed.

Part III Exercises

Now that you know how to code basic program logic, the exercises ask you to implement some simple tasks with statements. Most of the work is in exercise 4, which lets you explore coding alternatives. There are always many ways to arrange statements, and part of learning Python is learning which arrangements work better than others.

See Section B.3 for the solutions.

1. Coding basic loops.
   a. Write a for loop that prints the ASCII code of each character in a string named S. Use the built-in function ord(character) to convert each character to an ASCII integer. (Test it interactively to see how it works.)
   b. Next, change your loop to compute the sum of the ASCII codes of all characters in a string.
   c. Finally, modify your code again to return a new list that contains the ASCII codes of each character in the string. Does the expression map(ord, S) have a similar effect? (Hint: see Part IV.)
2. Backslash characters. What happens on your machine when you type the following code interactively?
```
for i in range(50):
    print 'hello %d\n\a' % i
```
Beware that if run outside of the IDLE interface, this example may beep at you, so you may not want to run it in a crowded lab. IDLE prints odd characters instead (see the backslash escape characters in Table 5-2).
3. Sorting dictionaries. In Chapter 6, we saw that dictionaries are unordered collections. Write a for loop that prints a dictionary's items in sorted (ascending) order. Hint: use the dictionary keys and list sort methods.
4. Program logic alternatives. Consider the following code, which uses a while loop and found flag to search a list of powers of 2, for the value of 2 raised to the 5th power (32). It's stored in a module file called power.py.
```
L = [1, 2, 4, 8, 16, 32, 64]
X = 5
found = i = 0
while not found and i < len(L):
    if 2 ** X == L[i]:
        found = 1
    else:
        i = i+1
if found:
    print 'at index', i
else:
    print X, 'not found'
C:\book\tests> python power.py
at index 5
```
As is, the example doesn't follow normal Python coding techniques. Follow the steps below to improve it. For all the transformations, you may type your code interactively or store it in a script file run from the system command line (using a file makes this exercise much easier).

Additional content appearing in this section has been removed.

Chapter 12: Function Basics

In Part III, we looked at basic procedural statements in Python. Here, we'll move on to explore a set of additional statements that create functions of our own. In simple terms, a function is a device that groups a set of statements, so they can be run more than once in a program. Functions also let us specify parameters that serve as function inputs, and may differ each time a function's code is run. Table 12-1 summarizes the primary function-related tools we'll study in this part of the book.

Table 12-1: Function-related statements and expressions
Statement	Examples
Calls	`myfunc("spam", ham, "toast")`
`def, return, yield`	`def adder(a, b=1, *c): return a+b+c[0]`
global	`def function(): global x; x = 'new'`
lambda	`funcs = [lambda x: x**2, lambda x: x*3]`

Before going into the details, let's get a clear picture of what functions are about. Functions are a nearly universal program-structuring device. Most of you have probably come across them before in other languages, where they may have been called subroutines or procedures. But as a brief introduction, functions serve two primary development roles:

Additional content appearing in this section has been removed.

Why Use Functions?

Code reuse: As in most programming languages, Python functions are the simplest way to package logic you may wish to use in more than one place and more than one time. Up until now, all the code we've been writing runs immediately; functions allow us to group and generalize code to be used arbitrarily many times later.
Procedural decomposition: Functions also provide a tool for splitting systems into pieces that have a well-defined role. For instance, to make a pizza from scratch, you would start by mixing the dough, rolling it out, adding toppings, baking, and so on. If you were programming a pizza-making robot, functions would help you divide the overall "make pizza" task into chunks, with one function for each subtask in the process. It's easier to implement the smaller tasks in isolation than it is to implement the entire process at once. In general, functions are about procedure: how to do something, rather than what you're doing it to. We'll see why this distinction matters in Part VI.

In this part of the book, we explore the tools used to code functions in Python: function basics, scope rules, and argument passing, along with a few related concepts. As we'll see, functions don't imply much new syntax, but they do lead us to some bigger programming ideas.

Additional content appearing in this section has been removed.

Coding Functions

Although it wasn't made very formal, we've already been using functions in earlier chapters. For instance, to make a file object, we call the built-in open function. Similarly, we use the len built-in function to ask for the number of items in a collection object.

In this chapter, we will learn how to write new functions in Python. Functions we write behave the same way as the built-ins already seen: they are called in expressions, are passed values, and return results. But writing new functions requires a few additional ideas that haven't yet been applied. Moreover, functions behave very differently in Python than they do in compiled languages like C. Here is a brief introduction to the main concepts behind Python functions, which we will study in this chapter:

def is executable code. Python functions are written with a new statement, the def. Unlike functions in compiled languages such as C, def is an executable statement—your function does not exist until Python reaches and runs the def. In fact, it's legal (and even occasionally useful) to nest def statements inside if, loops, and even other defs. In typical operation, def statements are coded in module files, and are naturally run to generate functions when the module file is first imported.
def creates an object and assigns it to a name. When Python reaches and runs a def statement, it generates a new function object and assigns it to the function's name. As with all assignments, the function name becomes a reference to the function object. There's nothing magic about the name of a function—as we'll see, the function object can be assigned to other names, stored in a list, and so on. Functions may also be created with the lambda expression—a more advanced concept deferred until later in this chapter.
return sends a result object back to the caller. When a function is called, the caller stops until the function finishes its work and returns control to the caller. Functions that compute a value send it back to the caller with a

Additional content appearing in this section has been removed.

A First Example: Definitions and Calls

Apart from such runtime concepts (which tend to seem most unique to programmers with backgrounds in traditional compiled languages), Python functions are straightforward to use. Let's code a first real example to demonstrate the basics. Really, there are two sides to the function picture: a definition—the def that creates a function, and a call—an expression that tells Python to run the function's body.

Here's a definition typed interactively that defines a function called times, which returns the product of its two arguments:

>>> def times(x, y):      # Create and assign function.
...     return x * y      # Body executed when called.
...

When Python reaches and runs this def, it creates a new function object that packages the function's code, and assigns the object to the name times. Typically, this statement is coded in a module file, and it would run when the enclosing file is imported; for something this small, though, the interactive prompt suffices.
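Because def is an ordinary executable statement that assigns a name, a function can be created conditionally and then treated like any other object. A small sketch, in which the names verbose, describe, show, and tools are all hypothetical:

```python
verbose = True                   # hypothetical flag chosen for this sketch

if verbose:
    def describe(x):             # this def runs only because the test is true
        return 'value: ' + repr(x)
else:
    def describe(x):
        return repr(x)

show = describe                  # bind the same function object to another name
tools = [describe, len]          # or store it in a list with other functions

print(show(42))
print(tools[1]('spam'))
```

Only one of the two def statements ever runs, so only one function object is created and bound to describe.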

After the def has run, the program can call (run) the function by adding parentheses after the function's name; the parentheses may optionally contain one or more object arguments, to be passed (assigned) to the names in the function's header:

>>> times(2, 4)           # Arguments in parentheses
8

This expression passes two arguments to times: the name x in the function header is assigned the value 2, y is assigned 4, and the function's body is run. In this case, the body is just a return statement, which sends back the result as the value of the call expression. The returned object is printed here interactively (as in most languages, 2*4 is 8 in Python); it could also be assigned to a variable if we need to use it later:

>>> x = times(3.14, 4)    # Save the result object.
>>> x
12.56

Now, watch what happens when the function is called a third time, with very different kinds of objects passed in:

>>> times('Ni', 4)        # Functions are "typeless."
'NiNiNiNi'

In this third call, a string and an integer are passed to x and y, instead of two numbers. Recall that * works on both numbers and sequences; because you never declare the types of variables, arguments, or return values, you can use

Additional content appearing in this section has been removed.

A Second Example: Intersecting Sequences

Let's look at a second function example that does something a bit more useful than multiplying arguments, and further illustrates function basics.

In Chapter 10, we saw a for loop that collected items in common in two strings. We noted there that the code wasn't as useful as it could be because it was set up to work only on specific variables and could not be rerun later. Of course, you could cut and paste the code to each place it needs to be run, but this solution is neither good nor general—you'd still have to edit each copy to support different sequence names, and changing the algorithm then requires changing multiple copies.

By now, you can probably guess that the solution to this dilemma is to package the for loop inside a function. Functions offer a number of advantages over simple top-level code:

By putting the code in a function, it becomes a tool that can be run as many times as you like.
By allowing callers to pass in arbitrary arguments, you make it general enough to work on any two sequences you wish to intersect.
By packaging the logic in a function, you only have to change code in one place if you ever need to change the way intersection works.
By coding the function in a module file, it can be imported and reused by any program run on your machine.

In effect, wrapping the code in a function makes it a general intersection utility:

def intersect(seq1, seq2):
    res = []                     # Start empty.
    for x in seq1:               # Scan seq1.
        if x in seq2:            # Common item?
            res.append(x)        # Add to end.
    return res

The transformation from the simple code of Chapter 10 to this function is straightforward; we've just nested the original logic under a def header and made the objects on which it operates passed-in parameter names. Since this function computes a result, we've also added a return statement to send a result object back to the caller.
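To see the function at work, here is a sketch that repeats the definition so the block is self-contained and then calls it on two kinds of sequences. The argument values are picked only for illustration:

```python
def intersect(seq1, seq2):
    res = []                     # Start empty.
    for x in seq1:               # Scan seq1.
        if x in seq2:            # Common item?
            res.append(x)        # Add to end.
    return res

print(intersect('SPAM', 'SCAM'))        # two strings
print(intersect([1, 2, 3], (1, 4)))     # a list and a tuple
```

Both calls work because the body relies only on for and in, which apply to any sequence; the result is always a list.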

Before you can call the function, you have to make the function. Run its def statement by typing it interactively, or by coding it in a module file and importing the file. Once you've run the

Additional content appearing in this section has been removed.

Chapter 13: Scopes and Arguments

Chapter 12 looked at basic function definition and calls. As we've seen, the basic function model is simple to use in Python. This chapter presents the details behind Python's scopes—the places where variables are defined, as well as argument passing—the way that objects are sent to functions as inputs.

Now that you will begin to write your own functions, we need to get more formal about what names mean in Python. When you use a name in a program, Python creates, changes, or looks up the name in what is known as a namespace—a place where names live. When we talk about the search for a name's value in relation to code, the term scope refers to a namespace—the location of a name's assignment in your code determines the scope of the name's visibility to your code.

Just about everything related to names happens at assignment in Python—even scope classification. As we've seen, names in Python spring into existence when they are first assigned a value, and must be assigned before they are used. Because names are not declared ahead of time, Python uses the location of the assignment of a name to associate it with (i.e., bind it to) a particular namespace. That is, the place where you assign a name determines the namespace it will live in, and hence its scope of visibility.

Besides packaging code, functions add an extra namespace layer to your programs—by default, all names assigned inside a function are associated with that function's namespace, and no other. This means that:

Names defined inside a def can only be seen by the code within that def. You cannot even refer to such names from outside the function.
Names defined inside a def do not clash with variables outside the def, even if the same name is used elsewhere. A name X assigned outside a def is a completely different variable than a name X assigned inside the def.

The net effect is that function scopes help avoid name clashes in your programs, and help to make functions more self-contained program units.
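A short sketch of both points, using the hypothetical names X, func, and inner. It is meant to be run as top-level module code:

```python
X = 99                  # assigned at the top level of the module

def func():
    X = 88              # a different X, local to func
    inner = 'local'     # also local: exists only while func runs
    return X

print(func())           # the local X
print(X)                # the module's X is unchanged

try:
    inner               # names assigned inside the def are not visible here
except NameError:
    print('inner is not defined outside func')
```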

Additional content appearing in this section has been removed.

Scope Rules

Before you started writing functions, all code was written at the top-level of a module (i.e., not nested in a def), so the names either lived in the module itself, or were built-ins that Python predefines (e.g., open). Functions provide a nested namespace (i.e., a scope), which localizes the names they use, such that names inside the function won't clash with those outside (in a module or other function). Functions define a local scope, and modules define a global scope. The two scopes are related as follows:

Additional content appearing in this section has been removed.

The global Statement

The global statement is the only thing that's remotely like a declaration statement in Python. It's not a type or size declaration, though; it's a namespace declaration. It tells Python that a function plans to change global names, that is, names that live in the enclosing module's scope (namespace). We've talked about global in passing already; as a summary:

global means "a name at the top-level of the enclosing module file."
Global names must be declared only if they are assigned in a function.
Global names may be referenced in a function without being declared.

The global statement is just the keyword global, followed by one or more names separated by commas. All the listed names will be mapped to the enclosing module's scope when assigned or referenced within the function body. For instance:

X = 88          # Global X
def func():
    global X
    X = 99      # Global X: outside def
func()
print X         # Prints 99

We've added a global declaration to the example here, such that the X inside the def now refers to the X outside the def; they are the same variable this time. Here is a slightly more involved example of global at work:

y, z = 1, 2         # Global variables in module
def all_global():
    global x        # Declare globals assigned.
    x = y + z       # No need to declare y,z: LEGB rule

Here, x, y, and z are all globals inside the function all_global. y and z are global because they aren't assigned in the function; x is global because it was listed in a global statement to map it to the module's scope explicitly. Without the global here, x would be considered local by virtue of the assignment.

Notice that y and z are not declared global; Python's LEGB lookup rule finds them in the module automatically. Also notice that x might not exist in the enclosing module before the function runs; if not, the assignment in the function creates x in the module.
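The last point, that the assignment can create x in the module, can be checked with a sketch like the following. It assumes the code runs as top-level module code, and the name existed_before is hypothetical:

```python
y, z = 1, 2                          # global variables in the module

def all_global():
    global x                         # x is mapped to the module's scope
    x = y + z                        # y and z are found by normal lookup

existed_before = 'x' in globals()    # checked before the function has run
all_global()
print(existed_before)
print('x' in globals())              # the call created x in the module
print(x)
```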

If you want to change names outside functions, you have to write extra code (global statements); by default, names assigned in functions are locals. This is by design—as is common in Python, you have to say more to do the "wrong" thing. Although there are exceptions, changing globals can lead to well-known software engineering problems: because the values of variables are dependent on the order of calls to arbitrarily distant functions, programs can be difficult to debug. Try to minimize use of globals in your code.

Additional content appearing in this section has been removed.

Scopes and Nested Functions

It's time to take a deeper look at the letter "E" in the LEGB lookup rule. The "E" layer takes the form of the local scopes of any and all enclosing function defs. This layer is a relatively new addition to Python (added in Python 2.2), and is sometimes called statically nested scopes. Really, the nesting is a lexical one—nested scopes correspond to physically nested code structures in your program's source code.

With the addition of nested function scopes, variable lookup rules become slightly more complex. Within a function:

Assignment: X=value: Creates or changes name X in the current local scope by default. If X is declared global within the function, it creates or changes name X in the enclosing module's scope instead.
Reference: X: Looks for name X in the current local scope (function), then in the local scopes of all lexically enclosing functions from inner to outer (if any), then in the current global scope (the module file), and finally in the built-in scope (module __builtin__). global declarations make the search begin in the global scope instead.
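The two rules above can be traced in a small sketch. All the function names here are hypothetical, and the code is assumed to run at the top level of a module:

```python
x = 'module x'                  # global scope (G)

def outer():
    x = 'outer x'               # local to outer; enclosing scope (E) for inner
    def inner():
        return x                # no local x in inner, so outer's x is found
    return inner()

def reads_global():
    return x                    # no local or enclosing x: the module's x

def assigns_local():
    x = 'local x'               # assignment creates a new local x by default
    return x

def reads_builtin():
    return len('spam')          # len is found last, in the built-in scope (B)

print(outer())
print(reads_global())
print(assigns_local())
print(x)                        # the module's x was never changed
print(reads_builtin())
```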

Notice that the global declaration still maps variables to the enclosing module. When nested functions are present, variables in enclosing functions may only be referenced, not changed. Let's illustrate all this with some real code.

Here is an example of a nested scope:

def f1():
    x = 88
    def f2():
        print x
    f2()
f1()                  # Prints 88

First off, this is legal Python code: the def is simply an executable statement that can appear anywhere any other statement can, including nested in another def. Here, the nested def runs while a call to function f1 is running; it generates a function and assigns it to name f2, a local variable within f1's local scope. In a sense, f2 is a temporary function that lives only during the execution of (and is visible only to code in) the enclosing f1.
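
As a rough sketch of the lookup order described above, the following restates the idea in Python 3 syntax (where print is a function, unlike the 2.x print statement used in this book). The names label, outer, inner, uses_global, and uses_builtin are made up for illustration.

```python
# Hypothetical names; Python 3 syntax (the book's own examples use Python 2.x).
label = "global"                 # G: a module-level name

def outer():
    label = "enclosing"          # E: local to outer, visible to nested defs
    def inner():
        # label is not assigned in inner, so the reference is resolved by
        # searching local, then enclosing, then global, then built-in scopes.
        return label
    return inner()

def uses_global():
    return label                 # no enclosing def here: the global is found

def uses_builtin():
    return len("spam")           # len is found last, in the built-in scope

results = (outer(), uses_global(), uses_builtin())
print(results)                   # ('enclosing', 'global', 4)
```

Each function only references a name; none of them assigns to it, which is why the search moves outward instead of creating a new local.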

But notice what happens inside f2: when it prints variable x, it refers to the x that lives in the enclosing f1 function's local scope. Because functions can access names in all physically enclosing

Passing Arguments

Let's expand on the notion of argument passing in Python. Earlier, we noted that arguments are passed by assignment; this has a few ramifications that aren't always obvious to beginners:

Arguments are passed by automatically assigning objects to local names. Function arguments are just another instance of Python assignment at work. Function arguments are references to (possibly) shared objects referenced by the caller.
Assigning to argument names inside a function doesn't affect the caller. Argument names in the function header become new, local names when the function runs, in the scope of the function. There is no aliasing between function argument names and names in the caller.
Changing a mutable object argument in a function may impact the caller. On the other hand, since arguments are simply assigned to passed-in objects, functions can change passed-in mutable objects, and the result may affect the caller.
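
The three points above can be sketched as follows. This is only an illustration in Python 3 syntax; the function modify and the variables count, shared, and protected are hypothetical names, not taken from the book.

```python
# Hypothetical example; names are illustrative only.
def modify(number, items):
    number = 2          # rebinds the local name only; the caller is unaffected
    items[0] = "spam"   # changes the shared list object in place

count = 1
shared = [1, 2]
modify(count, shared)
print(count, shared)    # 1 ['spam', 2]

# Passing a copy keeps the caller's list safe from in-place changes.
protected = [1, 2]
modify(count, protected[:])
print(protected)        # [1, 2]
```

The slice protected[:] makes a new top-level list, so the in-place change inside modify lands on the copy rather than on the caller's object.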

Python's pass-by-assignment scheme isn't the same as C++'s reference parameters, but it turns out to be very similar to C's arguments in practice:

Immutable arguments act like C's "by value" mode. Objects such as integers and strings are passed by object reference (assignment), but since you can't change immutable objects in place anyhow, the effect is much like making a copy.
Mutable arguments act like C's "by pointer" mode. Objects such as lists and dictionaries are passed by object reference, which is similar to the way C passes arrays as pointers—mutable objects can be changed in place in the function, much like C arrays.

Of course, if you've never used C, Python's argument-passing mode will be simpler still—it's just an assignment of objects to names, which works the same whether the objects are mutable or not.

Here's an example that illustrates some of these properties at work:

>>> def changer(x, y):     # Function
...    x = 2               # Changes local name's value only

Special Argument Matching Modes

Arguments are always passed by assignment in Python; names in the def header are assigned to passed-in objects. On top of this model, though, Python provides additional tools that alter the way the argument objects in the call are matched with argument names in the header prior to assignment. These tools are all optional, but allow you to write functions that support more flexible calling patterns.

By default, arguments are matched by position, from left to right, and you must pass exactly as many arguments as there are argument names in the function header. But you can also specify a match by name, default values, and collectors for extra arguments.

Some of this section gets complicated, and before going into syntactic details, we'd like to stress that these special modes are optional and only have to do with matching objects to names; the underlying passing mechanism is still assignment, after the matching takes place. But here's a synopsis of the available matching modes:

Positionals: matched left to right: The normal case used so far is to match arguments by position.
Keywords: matched by argument name: Callers can specify which argument in the function is to receive a value by using the argument's name in the call, with a name=value syntax.
Varargs: catch unmatched positional or keyword arguments: Functions can use special arguments preceded with * characters to collect arbitrarily many extra arguments (much like, and often named for, the varargs feature in the C language, which supports variable-length argument lists).
Defaults: specify values for arguments that aren't passed: Functions may also specify default values for arguments to receive if the call passes too few values, using a name=value syntax.
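
A minimal sketch of the four modes in one place may help. The function report and its parameter names are hypothetical, and the code uses Python 3 syntax; it returns what it received so the matching is easy to inspect.

```python
# Hypothetical function; Python 3 syntax.
def report(first, second=2, *extras, **options):
    # first: required; second: has a default value;
    # extras: tuple of unmatched positional arguments;
    # options: dictionary of unmatched keyword arguments.
    return first, second, extras, options

by_position = report(1, 5)                 # positionals, matched left to right
by_keyword = report(second=7, first=1)     # keywords, matched by name
with_default = report(1)                   # second falls back to its default
collected = report(1, 2, 3, 4, mode="x")   # extras and options catch the rest

for result in (by_position, by_keyword, with_default, collected):
    print(result)
```

In every call the passing mechanism is still plain assignment; only the pairing of objects with names differs.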

Table 13-1 summarizes the syntax that invokes the special matching modes.

Table 13-1: Function argument-matching forms
Syntax	Location	Interpretation

Chapter 14: Advanced Function Topics

This chapter introduces a collection of more advanced function-related topics: the lambda expression, functional programming tools such as map and list comprehensions, generators, and more. Part of the art of using functions lies in the interfaces between them, so we will also explore some general function design principles here. Because this is the last chapter in Part IV, we'll close with the usual sets of gotchas and exercises to help you start coding the ideas you've read about.

So far, we've seen what it takes to write our own functions in Python. The next sections turn to a few more advanced function-related ideas. Most of these are optional features, but can simplify your coding tasks when used well.

Besides the def statement, Python also provides an expression form that generates function objects. Because of its similarity to a tool in the LISP language, it's called lambda. Like def, this expression creates a function to be called later, but returns it instead of assigning it to a name. This is why lambdas are sometimes known as anonymous (i.e., unnamed) functions. In practice, they are often used as a way to inline a function definition, or defer execution of a piece of code.

The lambda's general form is the keyword lambda, followed by zero or more arguments (exactly like the arguments list you enclose in parentheses in a def header), followed by an expression after a colon:

lambda argument1, argument2,... argumentN : expression using arguments
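
To make the general form above concrete, here is a small sketch that pairs a def with an equivalent lambda. The names add_def, add_lambda, and powers are illustrative only, and the code is written in Python 3 syntax.

```python
# Illustrative names only; Python 3 syntax.
def add_def(x, y, z):
    return x + y + z

add_lambda = lambda x, y, z: x + y + z   # the same logic as an expression

# Because lambda is an expression, it can be written inside a list literal.
powers = [lambda x: x ** 2, lambda x: x ** 3]

print(add_def(2, 3, 4), add_lambda(2, 3, 4))   # 9 9
print([f(2) for f in powers])                  # [4, 8]
```

Assigning a lambda to a name, as done here for comparison, is optional; the function object works the same either way.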

Function objects returned by running lambda expressions work exactly the same as those created and assigned by def. But the lambda has a few differences that make it useful in specialized roles:

Anonymous Functions: lambda

lambda is an expression, not a statement. Because of this, a lambda can appear in places a def is not allowed by Python's syntax—inside a list literal or function call, for example. As an expression, the lambda returns a value (a new function), which can be assigned a name optionally; the def statement always assigns the new function to the name in the header, instead of returning it as a result.
lambda bodies are a single expression, not a block of statements. The lambda's body is similar to what you'd put in a def body's return statement; simply type the result as a naked expression, instead of explicitly returning it. Because it is limited to an expression, lambda is less general than a def; you can only squeeze so much logic into a lambda body without using statements such as

Applying Functions to Arguments

Some programs need to call arbitrary functions in a generic fashion, without knowing their names or arguments ahead of time. We'll see examples of where this can be useful later, but by way of introduction, both the apply built-in function and the special call syntax do the job.

You can call generated functions by passing them as arguments to apply, along with a tuple of arguments:

>>> def func(x, y, z): return x + y + z
...
>>> apply(func, (2, 3, 4))
9
>>> f = lambda x, y, z: x + y + z
>>> apply(f, (2, 3, 4))
9

The function apply simply calls the passed-in function in the first argument, matching the passed-in arguments tuple to the function's expected arguments. Since the arguments list is passed in as a tuple (i.e., a data structure), it can be built at runtime by a program.

The real power of apply is that it doesn't need to know how many arguments a function is being called with; for example, you can use if logic to select from a set of functions and argument lists, and use apply to call any:

if <test>:
    action, args = func1, (1,)
else:
    action, args = func2, (1, 2, 3)
. . . 
apply(action, args)

More generally, apply is useful any time you cannot predict the arguments list ahead of time. If your user selects an arbitrary function via a user interface, for instance, you may be unable to hardcode a function call when writing your script. Simply build up the arguments list with tuple operations and call indirectly through apply:

>>> args = (2,3) + (4,)
>>> args
(2, 3, 4)
>>> apply(func, args)
9

Section 14.2.1.1: Passing keyword arguments

The apply call also supports an optional third argument, where you can pass in a dictionary that represents keyword arguments to be passed to the function:

>>> def echo(*args, **kwargs): print args, kwargs
...
>>> echo(1, 2, a=3, b=4)
(1, 2) {'a': 3, 'b': 4}

This allows us to construct both positional and keyword arguments, at runtime:

>>> pargs = (1, 2)
>>> kargs = {'a':3, 'b':4}
>>> apply(echo, pargs, kargs)
(1, 2) {'a': 3, 'b': 4}
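
The apply built-in belongs to Python 2.x and is not available in Python 3. As a sketch of the special call syntax mentioned at the start of this section, which unpacks a tuple with * and a dictionary with **, the sessions above can be approximated in Python 3 as follows. Here echo returns its arguments instead of printing them, and add3 is a hypothetical stand-in for func; both changes are made only for illustration.

```python
# Python 3 sketch; echo returns rather than prints (an illustrative change).
def echo(*args, **kwargs):
    return args, kwargs

def add3(x, y, z):               # hypothetical stand-in for func
    return x + y + z

pargs = (1, 2)
kargs = {"a": 3, "b": 4}

# echo(*pargs, **kargs) plays the role of apply(echo, pargs, kargs).
result = echo(*pargs, **kargs)
print(result)                    # ((1, 2), {'a': 3, 'b': 4})

# The arguments tuple can still be built at runtime before the call.
args = (2, 3) + (4,)
total = add3(*args)
print(total)                     # 9
```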

Mapping Functions Over Sequences

One of the more common things programs do with lists and other sequences is to apply an operation to each item, and collect the results. For instance, updating all the counters in a list can be done easily with a for loop:

>>> counters = [1, 2, 3, 4]
>>>
>>> updated = []
>>> for x in counters:
...     updated.append(x + 10)              # Add 10 to each item.
...
>>> updated
[11, 12, 13, 14]

Because this is such a common operation, Python provides a built-in that does most of the work for you. The map function applies a passed-in function to each item in a sequence object, and returns a list containing all the function call results. For example:

>>> def inc(x): return x + 10               # function to be run
...
>>> map(inc, counters)                      # Collect results.
[11, 12, 13, 14]

We introduced map as a parallel loop traversal tool in Chapter 10, where we passed in None for the function argument to pair items up. Here, we make better use of it by passing in a real function to be applied to each item in the list—map calls inc on each list item, and collects all the return values into a list.

Since map expects a function to be passed in, it also happens to be one of the places where lambdas commonly appear:

>>> map((lambda x: x + 3), counters)        # Function expression
[4, 5, 6, 7]

Here, the function adds 3 to each item in the counters list; since this function isn't needed elsewhere, it was written inline as a lambda. Because such uses of map are equivalent to for loops, with a little extra code, you could always code a general mapping utility yourself:

>>> def mymap(func, seq):
...     res = []
...     for x in seq: res.append(func(x))
...     return res
...
>>> map(inc, [1, 2, 3])
[11, 12, 13]
>>> mymap(inc, [1, 2, 3])
[11, 12, 13]

However, since map is a built-in, it's always available, always works the same way, and has some performance benefits (in short, it usually runs faster than an equivalent for loop). Moreover, map can be used in more advanced ways than shown; for instance, given multiple sequence arguments, it sends items taken from sequences in parallel as distinct arguments to the function:
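
A minimal sketch of that multiple-sequence behavior, written for Python 3, where map returns an iterator instead of a list, so list() is used here to collect the results. The variable names single and paired are illustrative.

```python
# Python 3 sketch: map is lazy there, so list() collects the results.
counters = [1, 2, 3, 4]

def inc(x):
    return x + 10

single = list(map(inc, counters))

# With two sequences, items are taken in parallel and passed as two arguments.
paired = list(map(pow, [1, 2, 3], [2, 3, 4]))   # 1**2, 2**3, 3**4

print(single)   # [11, 12, 13, 14]
print(paired)   # [1, 8, 81]
```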

Functional Programming Tools

The map function is the simplest representative of a class of Python built-ins used for functional programming—which mostly just means tools that apply functions to sequences. Its relatives filter out items based on a test function (filter), and apply functions to pairs of items and running results (reduce). For example, the following filter call picks out items in a sequence greater than zero:

>>> range(-5, 5)
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4]
>>> filter((lambda x: x > 0), range(-5, 5))
[1, 2, 3, 4]

Items in the sequence for which the function returns true are added to the result list. Like map, filter is roughly equivalent to a for loop, but it is built in and fast:

>>> res = []
>>> for x in range(-5, 5):
...     if x > 0:
...         res.append(x)
...
>>> res
[1, 2, 3, 4]

Here are two reduce calls computing the sum and product of items in a list:

>>> reduce((lambda x, y: x + y), [1, 2, 3, 4])
10
>>> reduce((lambda x, y: x * y), [1, 2, 3, 4])
24

At each step, reduce passes the current sum or product, along with the next item from the list, to the passed-in lambda function. By default, the first item in the sequence initializes the starting value. Here's the for loop equivalent to the first of these, with the addition hardcoded inside the loop:

>>> L = [1,2,3,4]
>>> res = L[0]
>>> for x in L[1:]:
...     res = res + x
...
>>> res
10
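
The default starting value can also be overridden with an optional third argument. The sketch below assumes Python 3, where reduce must be imported from functools and filter returns an iterator; in the 2.x releases this book covers, both are plain built-ins that return their results directly. The variable names are illustrative.

```python
from functools import reduce   # a built-in in Python 2.x; in functools in Python 3

numbers = [1, 2, 3, 4]

total = reduce(lambda x, y: x + y, numbers)              # starts from numbers[0]
total_from_100 = reduce(lambda x, y: x + y, numbers, 100)  # explicit start value
empty_total = reduce(lambda x, y: x + y, [], 0)          # start value covers an empty list

positives = list(filter(lambda x: x > 0, range(-5, 5)))  # list() collects the lazy result

print(total, total_from_100, empty_total)   # 10 110 0
print(positives)                            # [1, 2, 3, 4]
```

Without a starting value, calling reduce on an empty sequence raises a TypeError, which is one practical reason to pass one.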

If this has sparked your interest, also see the built-in operator module, which provides functions that correspond to built-in expressions, and so comes in handy for some uses of functional tools:

>>> import operator
>>> reduce(operator.add, [2, 4, 6])      # function-based +
12
>>> reduce((lambda x, y: x + y), [2, 4, 6])
12

Some observers might also extend the functional programming toolset in Python to include lambda and apply, and list comprehensions (discussed in the next section).

List Comprehensions

Because mapping operations over sequences and collecting results is such a common task in Python coding, Python 2.0 sprouted a new feature—the list comprehension expression—that can make this even simpler than using map and filter. Technically, this feature is not tied to functions, but we've saved it for this point in the book, because it is usually best understood by analogy to function-based alternatives.

Let's work through an example that demonstrates the basics. Python's built-in ord function returns the integer ASCII code of a single character:

>>> ord('s')
115

The chr built-in is the converse—it returns the character for an ASCII code integer. Now, suppose we wish to collect the ASCII codes of all characters in an entire string. Perhaps the most straightforward approach is to use a simple for loop, and append results to a list:

>>> res = []
>>> for x in 'spam': 
...     res.append(ord(x))
...
>>> res
[115, 112, 97, 109]

Now that we know about map, we can achieve similar results with a single function call without having to manage list construction in the code:

>>> res = map(ord, 'spam')            # Apply func to seq.
>>> res
[115, 112, 97, 109]

But as of Python 2.0, we get the same results from a list comprehension expression:

>>> res = [ord(x) for x in 'spam']    # Apply expr to seq.
>>> res
[115, 112, 97, 109]

List comprehensions collect the results of applying an arbitrary expression to a sequence of values, and return them in a new list. Syntactically, list comprehensions are enclosed in square brackets (to remind you that they construct a list). In their simple form, within the brackets, you code an expression that names a variable, followed by what looks like a for loop header that names the same variable. Python collects the expression's results, for each iteration of the implied loop.

The effect of the example so far is similar to both the manual for loop, and the map call. List comprehensions become more handy, though, when we wish to apply an arbitrary expression to a sequence:

>>> [x ** 2 for x in range(10)]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
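
For comparison, here is a sketch of the same squares example next to its function-based equivalent, in Python 3 syntax (where map and filter results are wrapped in list()). The if clause in the last pair is a standard part of list comprehension syntax, shown here as an illustration of the filter analogy; the variable names are made up.

```python
# Python 3 sketch; variable names are illustrative.
squares_comp = [x ** 2 for x in range(10)]
squares_map = list(map(lambda x: x ** 2, range(10)))   # needs a lambda for the expression

# An optional if clause selects items, much like filter.
positives_comp = [x for x in range(-5, 5) if x > 0]
positives_filter = list(filter(lambda x: x > 0, range(-5, 5)))

print(squares_comp == squares_map)            # True
print(positives_comp == positives_filter)     # True
```

The comprehension states the expression directly, while map needs a function object to carry it.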

Generators and Iterators

It is possible to write functions that may be resumed after they send a value back. Such functions are known as generators because they generate a sequence of values over time. Unlike normal functions that return a value and exit, generator functions automatically suspend and resume their execution and state around the point of value generation. Because of that, they are often a useful alternative to both computing an entire series of values up front, and manually saving and restoring state in classes.

The chief code difference between generator and normal functions is that generators yield a value, rather than returning one—the yield statement suspends the function and sends a value back to the caller, but retains enough state to allow the function to resume from where it left off. This allows functions to produce a series of values over time, rather than computing them all at once, and sending them back in something like a list.

Generator functions are bound up with the notion of iterator protocols in Python. In short, functions containing a yield statement are compiled specially as generators; when called, they return a generator object that supports the iterator object interface.

Iterator objects, in turn, define a next method, which returns the next item in the iteration, or raises a special exception (StopIteration) to end the iteration. Iterators are fetched with the iter built-in function. Python for loops use this iteration interface protocol to step through a sequence (or sequence generator), if the protocol is supported; if not, for falls back on repeatedly indexing sequences instead.
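
The protocol just described can be stepped through by hand. The sketch below assumes Python 3, where the method is spelled __next__ and is normally reached through the next built-in function (in the 2.x releases covered here, the method is named next). The generator countdown is a hypothetical example.

```python
# Python 3 sketch; countdown is a hypothetical generator function.
def countdown(n):
    while n > 0:
        yield n               # suspend here and send n back to the caller
        n -= 1

gen = countdown(3)
iterator = iter(gen)          # fetch the iterator
same_object = iterator is gen # a generator object is its own iterator

manual = []
while True:
    try:
        manual.append(next(iterator))   # ask for the next item
    except StopIteration:               # raised when the values run out
        break

looped = []
for value in countdown(3):    # a for loop performs the same steps implicitly
    looped.append(value)

print(same_object, manual, looped)   # True [3, 2, 1] [3, 2, 1]
```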

Generators and iterators are an advanced language feature, so please see the Python library manuals for the full story on generators.

To illustrate the basics, though, the following code defines a generator function that can be used to generate the squares of a series of numbers over time:

>>> def gensquares(N):
...     for i in range(N):
...         yield i ** 2               # Resume here later.

This function yields a value, and so returns to its caller, each time through the loop; when it is resumed, its prior state is restored, and control picks up again immediately after the

Function Design Concepts

When you start using functions, you're faced with choices about how to glue components together: for instance, how to decompose a task into functions (cohesion), how functions should communicate (coupling), and so on. Some of this falls into the category of structured analysis and design. Here are a few general hints for Python beginners:

Coupling: use arguments for inputs and return for outputs. Generally, you should strive to make a function independent of things outside of it. Arguments and return statements are often the best ways to isolate external dependencies.
Coupling: use global variables only when truly necessary. Global variables (i.e., names in the enclosing module) are usually a poor way for functions to communicate. They can create dependencies and timing issues that make programs difficult to debug and change.
Coupling: don't change mutable arguments unless the caller expects it. Functions can also change parts of mutable objects passed in. But as with global variables, this implies lots of coupling between the caller and callee, which can make a function too specific and brittle.
Cohesion: each function should have a single, unified purpose. When designed well, each of your functions should do one thing—something you can summarize in a simple declarative sentence. If that sentence is very broad (e.g., "this function implements my whole program"), or contains lots of conjunctions (e.g., "this function gives employee raises and submits a pizza order"), you might want to think about splitting it into separate and simpler functions. Otherwise, there is no way to reuse the code behind the steps mixed together in such a function.
Size: each function should be relatively small. This naturally follows from the cohesion goal, but if your functions start spanning multiple pages on your display, it's probably time to split. Especially given that Python code is so concise to begin with, a function that grows long or deeply nested is often a symptom of design problems. Keep it simple, and keep it short.
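
As one possible illustration of the coupling hints above, compare a function that changes its mutable argument with one that takes inputs as arguments and hands back a result. Both functions and all variable names here are hypothetical; the code runs under Python 3.

```python
# Hypothetical example; names are illustrative only.
def add_item_in_place(items, item):
    items.append(item)          # changes the caller's list: tighter coupling

def add_item(items, item):
    return items + [item]       # builds a new list; the caller's list is untouched

original = [1, 2]
result = add_item(original, 3)
print(original, result)         # [1, 2] [1, 2, 3]

add_item_in_place(original, 3)
print(original)                 # [1, 2, 3]
```

Neither style is wrong in itself; the in-place version is reasonable when the caller expects the change, as the third hint says.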

Function Gotchas

Here are some of the more jagged edges of functions you might not expect. They're all obscure, and a few have started to fall away from the language completely in recent releases, but most have been known to trip up a new user.

Python classifies names assigned in a function as locals by default; they live in the function's scope and exist only while the function is running. What we didn't tell you is that Python detects locals statically, when it compiles the def's code, rather than by noticing assignments as they happen at runtime. This leads to one of the most common oddities posted on the Python newsgroup by beginners.

Normally, a name that isn't assigned in a function is looked up in the enclosing module:

>>> X = 99
>>> def selector():          # X used but not assigned
...     print X              # X found in global scope
...
>>> selector()
99

Here, the X in the function resolves to the X in the module outside. But watch what happens if you add an assignment to X after the reference:

>>> def selector():
...     print X              # Does not yet exist!
...     X = 88               # X classified as a local name (everywhere)
...                          # Can also happen if "import X", "def X",...
>>> selector()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 2, in selector
UnboundLocalError: local variable 'X' referenced before assignment

You get an undefined name error, but the reason is subtle. Python reads and compiles this code when it's typed interactively or imported from a module. While compiling, Python sees the assignment to X and decides that X will be a local name everywhere in the function. But later, when the function is actually run, the assignment hasn't yet happened when the print executes, so Python says you're using an undefined name. According to its name rules, it should; local X is used before being assigned. In fact, any assignment in a function body makes a name local. Imports, =, nested defs, nested classes, and so on, are all susceptible to this behavior.
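
The behavior can be reproduced in a script, together with one possible remedy. This sketch uses Python 3 syntax and assumes the intent was to work with the module-level X, in which case a global declaration says so explicitly. The function names selector_broken and selector_global are made up for illustration.

```python
# Python 3 sketch; function names are hypothetical.
X = 99

def selector_broken():
    value = X      # X is local everywhere in this function...
    X = 88         # ...because of this assignment
    return value

try:
    selector_broken()
    outcome = "no error"
except UnboundLocalError as exc:
    outcome = type(exc).__name__

def selector_global():
    global X       # state the intent: read and assign the module-level X
    value = X
    X = 88
    return value

before = selector_global()
print(outcome, before, X)   # UnboundLocalError 99 88
```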

The problem occurs because assigned names are treated as locals everywhere in a function, not just after statements where they are assigned. Really, the previous example is ambiguous at best: did you mean to print the global

Part IV Exercises

We're going to start coding more sophisticated programs in these exercises. Be sure to check solutions in Section B.4, and be sure to start writing your code in module files. You won't want to retype these exercises from scratch if you make a mistake.

1. The basics. At the Python interactive prompt, write a function that prints its single argument to the screen and call it interactively, passing a variety of object types: string, integer, list, dictionary. Then try calling it without passing any argument. What happens? What happens when you pass two arguments?
2. Arguments. Write a function called adder in a Python module file. The function adder should accept two arguments and return the sum (or concatenation) of its two arguments. Then add code at the bottom of the file to call the function with a variety of object types (two strings, two lists, two floating-point numbers), and run this file as a script from the system command line. Do you have to print the call statement results to see results on your screen?
3. varargs. Generalize the adder function you wrote in the last exercise to compute the sum of an arbitrary number of arguments, and change the calls to pass more or fewer than two. What type is the return value sum? (Hints: a slice such as S[:0] returns an empty sequence of the same type as S, and the type built-in function can test types; but see the min examples in Chapter 13 for a simpler approach.) What happens if you pass in arguments of different types? What about passing in dictionaries?
4. Keywords. Change the adder function from Exercise 2 to accept and add three arguments: def adder(good, bad, ugly). Now, provide default values for each argument and experiment with calling the function interactively. Try passing one, two, three, and four arguments. Then, try passing keyword arguments. Does the call adder(ugly=1, good=2) work? Why? Finally, generalize the new adder to accept and add an arbitrary number of keyword arguments, much like Exercise 3, but you'll need to iterate over a dictionary, not a tuple. (Hint: the

Chapter 15: Modules: The Big Picture

This chapter begins our look at the Python module, the highest-level program organization unit, which packages program code and data for reuse. In concrete terms, modules usually correspond to Python program files (or C extensions). Each file is a module, and modules import other modules to use the names they define. Modules are processed with two new statements and one important built-in function:

import: Lets a client (importer) fetch a module as a whole
from: Allows clients to fetch particular names from a module
reload: Provides a way to reload a module's code without stopping Python
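
A brief sketch of the three tools listed above, using the standard library's string module as the thing being imported. It assumes Python 3, where reload is reached through the importlib module rather than being a built-in as in the 2.x releases this book covers. The variable names are illustrative.

```python
import importlib            # Python 3 home of reload (a built-in in 2.x)
import string               # import: fetch the module as a whole
from string import digits   # from: copy one name out of the module

whole = string.digits       # attribute reached through the module name
direct = digits             # the copied name is used without qualification

reloaded = importlib.reload(string)   # rerun the module's code in place

print(whole == direct)      # True
print(reloaded is string)   # True
```

The reload call returns the same module object, updated in place, so existing references through the name string see the reloaded code.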

We introduced module fundamentals in Chapter 3, and have been using them ever since. Part V begins by expanding on core module concepts, and then moves on to explore more advanced module usage. This first chapter begins with a general look at the role of modules in overall program structure. In the next and following chapters, we'll dig into the coding details behind the theory.

Along the way, we'll flesh out module details we've omitted so far: reloads, the __name__ and __all__ attributes, package imports, and so on. Because modules and classes are really just glorified namespaces, we formalize namespace concepts here as well.

Why Use Modules?

Modules provide an easy way to organize components into a system, by serving as packages of names. From an abstract perspective, modules have at least three roles:

Code reuse: As we saw in Chapter 3, modules let us save code in files permanently. Unlike code you type at the Python interactive prompt, which goes away when you exit Python, code in module files is persistent—it can be reloaded and rerun as many times as needed. More to the point, modules are a place to define names, or attributes, that may be referenced by external clients.
System namespace partitioning: Modules are also the highest-level program organization unit in Python. Fundamentally, they are just packages of names. Modules seal up names into self-contained packages that avoid name clashes—you can never see a name in another file, unless you explicitly import it. In fact, everything "lives" in a module: code you execute and objects you create are always implicitly enclosed by a module. Because of that, modules are a natural tool for grouping system components.
Implementing shared services or data: From a functional perspective, modules also come in handy for implementing components that are shared across a system, and hence only require a single copy. For instance, if you need to provide a global object that's used by more than one function or file, you can code it in a module that's imported by many clients.

To truly understand the role of modules in a Python system, though, we need to digress for a moment and explore the general structure of a Python program.

Python Program Architecture

So far in this book, we've sugar-coated some of the complexity in our descriptions of Python programs. In practice, programs usually are more than just one file; for all but the simplest scripts, your programs will take the form of multifile systems. And even if you can get by with coding a single file yourself, you will almost certainly wind up using external files that someone else has already written.

This section introduces the general architecture of Python programs—the way you divide a program into a collection of source files (a.k.a. modules), and link the parts into a whole. Along the way, we also define the central concepts of Python modules, imports, and object attributes.

Generally, a Python program consists of multiple text files containing Python statements. The program is structured as one main, top-level file, along with zero or more supplemental files known as modules in Python.

In a Python program, the top-level file contains the main flow of control of your program—the file you run to launch your application. The module files are libraries of tools, used to collect components used by the top-level file, and possibly elsewhere. Top-level files use tools defined in module files, and modules use tools defined in other modules. In Python, a file imports a module to gain access to the tools it defines. And the tools defined by a module are known as its attributes—variable names attached to objects such as functions. Ultimately, we import modules, and access their attributes to use their tools.

Let's make this a bit more concrete. Figure 15-1 sketches the structure of a Python program composed of three files: a.py, b.py, and c.py. The file a.py is chosen to be the top-level file; it will be a simple text file of statements, which is executed from top to bottom when launched. Files b.py and c.py are modules; they are simple text files of statements as well, but are usually not launched directly. Rather, modules are normally imported by other files that wish to use the tools they define.

Figure 15-1: Program structure for the three files a.py, b.py, and c.py (figure not included in this preview).

How Imports Work

The prior section talked about importing modules, without really explaining what happens when you do so. Since imports are at the heart of program structure in Python, this section goes into more detail on the import operation to make this process less abstract.

Some C programmers like to compare the Python module import operation to a C #include, but they really shouldn't—in Python, imports are not just textual insertions of one file into another. They are really runtime operations that perform three distinct steps the first time a file is imported by a program:

Find the module's file.
Compile it to byte-code (if needed).
Run the module's code to build the objects it defines.

All three of these steps are only run the first time a module is imported during a program's execution; later imports of the same module bypass all of these and simply fetch the already-loaded module object in memory. To better understand module imports, let's explore each of these steps in turn.
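The run-once behavior can be checked directly. The following is a minimal sketch in Python 3 syntax (print is a function there, unlike in this book's examples); the module name cache_demo_mod and the temporary directory are assumptions made up for the demonstration.

```python
import contextlib
import importlib
import io
import os
import sys
import tempfile

# Hypothetical module file, written to a temporary directory for this demo.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "cache_demo_mod.py"), "w") as f:
    f.write("print('running cache_demo_mod')\nvalue = 42\n")

sys.path.insert(0, tmpdir)        # give the "find" step a directory to search
importlib.invalidate_caches()

out = io.StringIO()
with contextlib.redirect_stdout(out):
    first = importlib.import_module("cache_demo_mod")    # find, compile, run
    second = importlib.import_module("cache_demo_mod")   # fetched from memory

run_count = out.getvalue().count("running cache_demo_mod")
same_object = first is second and sys.modules["cache_demo_mod"] is first
print(run_count, same_object)

sys.path.remove(tmpdir)
```

The module's print runs once even though it is imported twice, and both imports return the object stored in sys.modules.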

First off, Python must locate the module file referenced by your import statement. Notice the import statement in the prior section's example names the file without a .py suffix and without its directory path. It says just import b, instead of something like import c:\dir1\b.py. Import statements omit path and suffix details like this on purpose; you can only list a simple name. Instead, Python uses a standard module search path to locate the module file corresponding to an import statement.

Section 15.3.1.1: The module search path

In many cases, you can rely on the automatic nature of the module import search path and need not configure this path at all. If you want to be able to import files across user-defined directory boundaries, though, you will need to know how the search path works, in order to customize it. Roughly, Python's module search path is automatically composed as the concatenation of these major components:

The home directory of the top-level file.

Additional content appearing in this section has been removed.

Chapter 16: Module Coding Basics

Now that we've looked at the larger ideas behind modules, let's turn to a simple example of modules in action. Python modules are easy to create; they're just files of Python program code, created with your text editor. You don't need to write special syntax to tell Python you're making a module; almost any text file will do. Because Python handles all the details of finding and loading modules, modules are also easy to use; clients simply import a module, or specific names a module defines, and use the objects they reference.

To define a module, use your text editor to type Python code into a text file. Names assigned at the top level of the module become its attributes (names associated with the module object), and are exported for clients to use. For instance, if we type the def below into a file called module1.py and import it, we create a module object with one attribute—the name printer, which happens to be a reference to a function object:

def printer(x):           # Module attribute
    print x

A word on module filenames: you can call modules just about anything you like, but module filenames should end in a .py suffix if you plan to import them. The .py is technically optional for top-level files that will be run, but not imported; but adding it in all cases makes the file's type more obvious.

Since module names become variables inside a Python program without the .py, they should also follow the normal variable name rules we learned in Chapter 8. For instance, you can create a module file named if.py, but cannot import it, because if is a reserved word—when you try to run import if, you'll get a syntax error. In fact, both the names of module files and directories used in package imports must conform to the rules for variable names presented in Chapter 8. This becomes a larger concern for package directories; their names cannot contain platform-specific syntax such as spaces.
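One way to check a candidate module name against these rules is sketched below. The helper name importable_name is hypothetical; it relies only on the standard str.isidentifier method and the keyword module, and is written for Python 3.

```python
import keyword

def importable_name(name):
    """Return True if name could appear in a plain 'import name' statement."""
    return name.isidentifier() and not keyword.iskeyword(name)

for candidate in ["module1", "if", "my module", "2fast"]:
    print(candidate, importable_name(candidate))
```

A file called if.py or "my module.py" can still exist on disk and be run as a top-level script; the check only concerns whether the name can be written in an import statement.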

Module Creation

def printer(x):           # Module attribute
    print x

When modules are imported, Python maps the internal module name to an external filename, by adding directory paths in the module search path to the front, and a .py or other extension at the end. For instance, a module name M ultimately maps to some external file <directory>\M.<extension> that contains our module's code.
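To see the name-to-file mapping for yourself, you can ask the import machinery where it would find a module without actually running it. This sketch assumes Python 3, where importlib.util.find_spec is available; the module name mapdemo_m and the temporary directory are invented for the example.

```python
import importlib.util
import os
import sys
import tempfile

# Hypothetical module file in a temporary directory.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "mapdemo_m.py"), "w") as f:
    f.write("x = 1\n")

sys.path.insert(0, tmpdir)
importlib.invalidate_caches()

# find_spec locates the file for a module name; it does not run the module.
spec = importlib.util.find_spec("mapdemo_m")
found_dir, found_file = os.path.split(spec.origin)
print(found_file)

sys.path.remove(tmpdir)
```

The directory part of the result comes from the search path, and the file part is the module name plus a suffix.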

Module Usage

Clients can use the module file we just wrote by running import or from statements. Both find, compile, and run a module file's code if it hasn't yet been loaded. The chief difference is that import fetches the module as a whole, so you must qualify to fetch its names; from, by contrast, fetches (or copies) specific names out of the module.

Let's see what this means in terms of code. All of the following examples wind up calling the printer function defined in the external module file module1.py, but in different ways.

In the first example, the name module1 serves two different purposes. It identifies an external file to be loaded and becomes a variable in the script, which references the module object after the file is loaded:

>>> import module1                    # Get module as a whole.
>>> module1.printer('Hello world!')   # Qualify to get names.
Hello world!

Because import gives a name that refers to the whole module object, we must go through the module name to fetch its attributes (e.g., module1.printer).

By contrast, because from also copies names from one file over to another scope, we instead use the copied names directly without going through the module (e.g., printer):

>>> from module1 import printer       # Copy out one variable.
>>> printer('Hello world!')           # No need to qualify name.
Hello world!

Finally, the next example uses a special form of from: when we use a *, we get copies of all the names assigned at the top level of the referenced module. Here again, we use the copied name, and don't go through the module name:

>>> from module1 import *             # Copy out all variables.
>>> printer('Hello world!')
Hello world!

Technically, both import and from statements invoke the same import operation; from simply adds an extra copy-out step.
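The extra copy-out step can be made visible with a small experiment. This is a sketch in Python 3 syntax; it builds a stand-in module in memory under the hypothetical name module1_demo rather than reading module1.py, and its printer returns a string instead of printing.

```python
import sys
import types

# Stand-in for module1.py, built in memory under a hypothetical name.
stand_in = types.ModuleType("module1_demo")
exec("def printer(x):\n    return 'printed: ' + str(x)\n", stand_in.__dict__)
sys.modules["module1_demo"] = stand_in

import module1_demo                    # binds a name to the whole module object
from module1_demo import printer      # roughly: printer = module1_demo.printer

same_function = printer is module1_demo.printer

# Rebinding the copied name changes only the importer's own variable.
printer = None
still_there = callable(module1_demo.printer)
print(same_function, still_there)
```

The copied name starts out referring to the same function object as the module attribute, but it is a separate variable in the importer.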

And that's it; modules really are simple to use. But to give you a better understanding of what really happens when you define and use modules, let's move on to look at some of their properties in more detail.

One of the most common questions beginners seem to ask when using modules is: why won't my imports keep working? The first import works fine, but later imports during an interactive session (or program run) seem to have no effect. They're not supposed to, and here's why.

Module Namespaces

Modules are probably best understood as simply packages of names: places to define names you want to make visible to the rest of a system. In Python, a module is a namespace, a place where names are created. Names that live in a module are called its attributes. Technically, modules usually correspond to files, and Python creates a module object to contain all the names assigned in the file; but in simple terms, modules are just namespaces.

So how do files morph into namespaces? The short story is that every name that is assigned a value at the top level of a module file (i.e., not nested in a function or class body) becomes an attribute of that module.

For instance, given an assignment statement such as X=1 at the top level of a module file M.py, the name X becomes an attribute of M, which we can refer to from outside the module as M.X. The name X also becomes a global variable to other code inside M.py, but we need to explain the notion of module loading and scopes a bit more formally to understand why:

Module statements run on the first import. The first time a module is imported anywhere in a system, Python creates an empty module object and executes the statements in the module file one after another, from the top of the file to the bottom.
Top-level assignments create module attributes. During an import, statements at the top-level of the file that assign names (e.g., =, def) create attributes of the module object; assigned names are stored in the module's namespace.
Module namespace: attribute __dict__, or dir(M). Module namespaces created by imports are dictionaries; they may be accessed through the built-in __dict__ attribute associated with module objects and may be inspected with the dir function. The dir function is roughly equivalent to the sorted keys list of an object's __dict__ attribute, but includes inherited names for classes, may not be complete, and is prone to change from release to release.
Modules are a single scope (local is global).
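The points about top-level assignments and __dict__ can be tried out with a module object built by hand. This is only a sketch in Python 3 syntax: the source string stands in for a file M.py, and executing it in the module's __dict__ approximates what an import does.

```python
import types

source = """
X = 1                      # top-level assignment: becomes an attribute
def f():
    local_name = 99        # assigned inside a function: not an attribute
    return X               # X is a global to code inside the module
"""

M = types.ModuleType("M")
exec(source, M.__dict__)   # run the "file" in the module's namespace

names = sorted(k for k in M.__dict__ if not k.startswith("__"))
print(names)               # ['X', 'f']
print(M.X, M.f())
```

Only X and f show up as attributes; local_name lives in the function's own scope and never reaches the module's namespace.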

Additional content appearing in this section has been removed.

Reloading Modules

A module's code is run only once per process by default. To force a module's code to be reloaded and rerun, you need to ask Python explicitly to do so, by calling the reload built-in function. In this section, we'll explore how to use reloads to make your systems more dynamic. In a nutshell:

Imports (both import and from statements) load and run a module's code only the first time the module is imported in a process.
Later imports use the already loaded module object without reloading or rerunning the file's code.
The reload function forces an already loaded module's code to be reloaded and rerun. Assignments in the file's new code change the existing module object in-place.

Why all the fuss about reloading modules? The reload function allows parts of programs to be changed without stopping the whole program. With reload, the effects of changes in components can be observed immediately. Reloading doesn't help in every situation, but where it does, it makes for a much shorter development cycle. For instance, imagine a database program that must connect to a server on startup; since program changes can be tested immediately after reloads, you need to connect only once while debugging.

Because Python is interpreted (more or less), it already gets rid of the compile/link steps you need to go through to get a C program to run: modules are loaded dynamically, when imported by a running program. Reloading adds to this, by allowing you to also change parts of running programs without stopping. We should note that reload currently only works on modules written in Python; C extension modules can be dynamically loaded at runtime too, but they can't be reloaded.

Unlike import and from:

reload is a built-in function in Python, not a statement.
reload is passed an existing module object, not a name.
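Here is a minimal sketch of these rules in action. It assumes Python 3, where reload is no longer a built-in and is called as importlib.reload; the module name reload_demo_mod and the temporary file are invented for the demonstration.

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True          # keep the demo free of cached bytecode
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "reload_demo_mod.py")

with open(path, "w") as f:
    f.write("message = 'first version'\n")

sys.path.insert(0, tmpdir)
importlib.invalidate_caches()
import reload_demo_mod
before = reload_demo_mod.message

with open(path, "w") as f:              # edit the file while the program runs
    f.write("message = 'second version, edited'\n")

again = importlib.import_module("reload_demo_mod")   # later import: no rerun
unchanged = again.message

reloaded = importlib.reload(reload_demo_mod)   # takes a module object, not a name
after = reload_demo_mod.message
print(before, "|", unchanged, "|", after, "|", reloaded is reload_demo_mod)

sys.path.remove(tmpdir)
```

The second import still sees the old value, while the reload reruns the file and updates the existing module object in place.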

Because reload expects an object, a module must have been previously imported successfully before you can reload it. In fact, if the import was unsuccessful due to a syntax or other error, you may need to repeat an import before you can reload. Furthermore, the syntax of import statements and

Additional content appearing in this section has been removed.

Chapter 17: Module Packages

So far, when we've imported a module, we've been loading files. This represents typical module usage, and is what you will probably use for most imports you'll code early on in your Python career. The module import story is a bit richer than we have thus far implied, however.

Imports can name a directory path, in addition to a module name. When they do, they are known as package imports—a directory of Python code is said to be a package. This is a somewhat advanced feature, but turns out to be handy for organizing the files in a large system, and tends to simplify module search path settings. As we'll see, package imports are also sometimes required in order to resolve ambiguities when multiple programs are installed on a single machine.

Here's how package imports work. In the place where we have been naming a simple file in import statements, we can instead list a path of names separated by periods:

import dir1.dir2.mod

The same goes for from statements:

from dir1.dir2.mod import x

The "dotted" path in these statements is assumed to correspond to a path through the directory hierarchy on your machine, leading to the file mod.py (or other file type). That is, there is directory dir1, which has a subdirectory dir2, which contains a module file mod.py (or other suffix).

Furthermore, these imports imply that dir1 resides within some container directory dir0, which is accessible on the Python module search path. In other words, the two import statements imply a directory structure that looks something like this (shown with DOS backslash separators):

               dir0\dir1\dir2\mod.py             # Or mod.pyc,mod.so,...

The container directory dir0 still needs to be added to your module search path (unless it's the home directory of the top-level file), exactly as if dir1 were a module file. From there down, the import statements in your script give the directory path leading to the module explicitly.

If you use this feature, keep in mind that the directory paths in your import statements can only be variables separated by periods. You cannot use any platform-specific path syntax in your import statements; things like

Additional content appearing in this section has been removed.

Package Import Basics

Platform-specific path syntax such as My Documents.dir2 and ../dir1 does not work syntactically in import statements. Instead, use platform-specific syntax in your module search path settings to name the container directory.

For instance, in the prior example, dir0—the directory name you add to your module search path—can be an arbitrarily long and platform-specific directory path leading up to dir1. Instead of using an invalid statement like this:

import C:\mycode\dir1\dir2\mod      # Error: illegal syntax

add C:\mycode to your PYTHONPATH variable or .pth files, unless it is the program's home directory, and say this:

import dir1.dir2.mod

In effect, entries on the module search path provide platform-specific directory path prefixes, which lead to the leftmost names in import statements. Import statements provide directory path tails in a platform-neutral fashion.
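The prefix-plus-tail idea can be sketched with the standard pathlib module (Python 3). The helper module_file and the POSIX directory /home/me/mycode are hypothetical, and the function only computes a candidate filename; it does not reproduce Python's actual search.

```python
from pathlib import PurePosixPath, PureWindowsPath

def module_file(prefix, dotted_name, suffix=".py"):
    """Join a search-path entry (a path object) with a dotted import name."""
    return prefix.joinpath(*dotted_name.split(".")).with_suffix(suffix)

# Same platform-neutral tail, two platform-specific prefixes.
windows = module_file(PureWindowsPath(r"C:\mycode"), "dir1.dir2.mod")
posix = module_file(PurePosixPath("/home/me/mycode"), "dir1.dir2.mod")
print(windows)
print(posix)
```

The dotted name in the import statement stays the same on every platform; only the search path entry changes.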

Package Import Example

Let's actually code the example we've been talking about to show how initialization files and paths come into play. The following three files are coded in a directory dir1 and its subdirectory dir2:

# File: dir1\__init__.py
print 'dir1 init'
x = 1

# File: dir1\dir2\__init__.py
print 'dir2 init'
y = 2

# File: dir1\dir2\mod.py
print 'in mod.py'
z = 3

Here, dir1 will either be a subdirectory of the one we're working in (i.e., the home directory), or a subdirectory of a directory that is listed on the module search path (technically, on sys.path). Either way, dir1's container does not need an __init__.py file.

As with simple module files, import statements run each directory's initialization file as Python descends the path, the first time a directory is traversed; we've added print statements to trace their execution. Also like module files, already-imported directories may be passed to reload to force re-execution of that single item. As shown here, reload accepts a dotted path name to reload nested directories and files:

% python 
>>> import dir1.dir2.mod      # First imports run init files.
dir1 init
dir2 init
in mod.py
>>>
>>> import dir1.dir2.mod      # Later imports do not.
>>>
>>> reload(dir1)
dir1 init
<module 'dir1' from 'dir1\__init__.pyc'>
>>>
>>> reload(dir1.dir2)
dir2 init
<module 'dir1.dir2' from 'dir1\dir2\__init__.pyc'>

Once imported, the path in your import statement becomes a nested object path in your script; mod is an object nested in object dir2, nested in object dir1:

>>> dir1
<module 'dir1' from 'dir1\__init__.pyc'>
>>> dir1.dir2
<module 'dir1.dir2' from 'dir1\dir2\__init__.pyc'>
>>> dir1.dir2.mod
<module 'dir1.dir2.mod' from 'dir1\dir2\mod.pyc'>

In fact, each directory name in the path becomes a variable, assigned to a module object whose namespace is initialized by all the assignments in that directory's __init__.py file. dir1.x refers to the variable x assigned in dir1\__init__.py, much as mod.z refers to z assigned in mod.py:

>>> dir1.x
1
>>> dir1.dir2.y
2
>>> dir1.dir2.mod.z
3
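If you are following along in Python 3, the same experiment can be scripted end to end. The sketch below recreates the three files in a temporary directory that plays the role of the container directory; the use of tempfile and output capturing are assumptions of this example, and print is called as a function.

```python
import contextlib
import importlib
import io
import os
import sys
import tempfile

root = tempfile.mkdtemp()               # plays the role of the container directory
files = {
    ("dir1", "__init__.py"): "print('dir1 init')\nx = 1\n",
    ("dir1", "dir2", "__init__.py"): "print('dir2 init')\ny = 2\n",
    ("dir1", "dir2", "mod.py"): "print('in mod.py')\nz = 3\n",
}
for parts, text in files.items():
    path = os.path.join(root, *parts)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(text)

sys.path.insert(0, root)
importlib.invalidate_caches()

trace = io.StringIO()
with contextlib.redirect_stdout(trace):
    import dir1.dir2.mod                # first import runs each init file
    import dir1.dir2.mod                # later import runs nothing
lines = trace.getvalue().splitlines()
print(lines)
print(dir1.x, dir1.dir2.y, dir1.dir2.mod.z)

sys.path.remove(root)
```

The trace contains each message exactly once, in the order the path is descended.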

import statements can be somewhat inconvenient to use with packages, because you must retype paths frequently in your program. In the prior section's example, you must retype and rerun the full path from

Additional content appearing in this section has been removed.

Why Use Package Imports?

If you're new to Python, make sure that you've mastered simple modules before stepping up to packages, as they are a somewhat advanced feature of Python. They do serve useful roles, especially in larger programs: they make imports more informative, serve as an organizational tool, simplify your module search path, and can resolve ambiguities.

First of all, because package imports give some directory information in program files, they both make it easier to locate your files, and serve as an organizational tool. Without package paths, you must resort to consulting the module search path to find files more often. Moreover, if you organize your files into subdirectories for functional areas, package imports make it more obvious what role a module plays, and so make your code more readable. For example, a normal import of a file in a directory somewhere on the module search path:

import utilities

bears much less information than an import that includes path information:

import database.client.utilities

Package imports can also greatly simplify your PYTHONPATH or .pth file search path settings. In fact, if you use package imports for all your cross-directory imports, and you make those package imports relative to a common root directory where all your Python code is stored, you really only need a single entry on your search path: the common root.

A Tale of Three Systems

The only time package imports are actually required, though, is in order to resolve ambiguities that may arise when multiple programs are installed on a single machine. This is something of an install issue, but can also become a concern in general practice. Let's turn to a hypothetical scenario to illustrate.

Suppose that a programmer develops a Python program that contains a file called utilities.py for common utility code, and a top-level file named main.py that users launch to start the program. All over this program, its files say import utilities to load and use the common code. When this program is shipped, it arrives as a single tar or zip file containing all the program's files; when it is installed, it unpacks all its files into a single directory named system1 on the target machine:

system1\
    utilities.py        # Common utility functions, classes
    main.py             # Launch this to start the program.
    other.py            # Import utilities to load my tools

Now, suppose that a second programmer does the same thing: he or she develops a different program with files utilities.py and main.py, and uses import utilities to load the common code file again. When this second system is fetched and installed, its files unpack into a new directory called system2 somewhere on the receiving machine, such that its files do not overwrite same-named files from the first system. Eventually, both systems become so popular that they wind up commonly installed in the same computer:

system2\
    utilities.py        # Common utilities
    main.py             # Launch this to run.
    other.py            # Imports utilities

So far, there's no problem: both systems can coexist or run on the same machine. In fact, we don't even need to configure the module search path to use these programs—because Python always searches the home directory first (that is, the directory containing the top-level file), imports in either system's files will automatically see all the files in that system's directory. For instance, if you click on system1\main.py, all imports will search system1

Additional content appearing in this section has been removed.

Chapter 18: Advanced Module Topics

Part V concludes with a collection of more advanced module-related topics, along with the standard set of gotchas and exercises. Just like functions, modules are more effective when their interfaces are defined well, so this chapter also takes a brief look at module design concepts. Some of the topics here, such as the __name__ trick, are very widely used, despite the word "advanced" in this chapter's title.

As we've seen, Python modules export all names assigned at the top level of their file. There is no notion of declaring which names should and shouldn't be visible outside the module. In fact, there's no way to prevent a client from changing names inside a module if they want to.

In Python, data hiding in modules is a convention, not a syntactical constraint. If you want to break a module by trashing its names, you can, but we have yet to meet a programmer who would want to. Some purists object to this liberal attitude towards data hiding and claim that it means Python can't implement encapsulation. However, encapsulation in Python is more about packaging than about restricting.

As a special case, prefixing names with a single underscore (e.g., _X) prevents them from being copied out when a client imports with a from* statement. This really is intended only to minimize namespace pollution; since from* copies out all names, you may get more than you bargained for (including names that overwrite names in the importer). But underscores aren't "private" declarations: you can still see and change such names with other import forms such as the import statement.
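A short experiment shows both halves of this rule. The sketch is in Python 3 syntax, and it uses an in-memory module with the hypothetical name hide_demo, plus a dictionary standing in for the importing file's namespace.

```python
import sys
import types

# Hypothetical module with one public and one underscore-prefixed name.
hide_demo = types.ModuleType("hide_demo")
exec("public = 1\n_private = 2\n", hide_demo.__dict__)
sys.modules["hide_demo"] = hide_demo

client = {}                              # stands in for an importing file
exec("from hide_demo import *", client)
copied = sorted(name for name in client if not name.startswith("__"))
print(copied)                            # ['public']

import hide_demo as h                    # other import forms still reach _private
print(h._private)
```

The underscore keeps _private out of the importer's namespace, but it does nothing to stop qualified access.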

Data Hiding in Modules

A module can achieve a hiding effect similar to the _X naming convention, by assigning a list of variable name strings to the variable __all__ at the top level of the module. For example:

__all__ = ["Error", "encode", "decode"]     # Export these only.

When this feature is used, the from* statement will only copy out those names listed in the __all__ list. In effect, this is the converse of the _X convention: __all__ contains names to be copied, but _X identifies names to not be copied. Python looks for an __all__ list in the module first; if one is not defined, from* copies all names without a single leading underscore.
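The effect of an __all__ list can be checked the same way. This is a sketch in Python 3 syntax with a hypothetical in-memory module named all_demo; note that a name listed in __all__ is copied even if it starts with an underscore.

```python
import sys
import types

# Hypothetical module that lists its exports explicitly.
all_demo = types.ModuleType("all_demo")
exec(
    "__all__ = ['encode', '_helper']\n"
    "def encode(s): return s.upper()\n"
    "def decode(s): return s.lower()\n"
    "def _helper(): return 'listed despite the underscore'\n",
    all_demo.__dict__,
)
sys.modules["all_demo"] = all_demo

client = {}                              # stands in for an importing file
exec("from all_demo import *", client)
copied = sorted(name for name in client if not name.startswith("__"))
print(copied)
```

Here decode is left behind because it is not listed, although it remains reachable as all_demo.decode.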

The __all__ list also only has meaning to the from* statement form, and is not a privacy declaration. Module writers can use either trick, to implement modules that are well-behaved when used with from*. See the discussion of __all__ lists in package __init__.py files in Chapter 17; there, they declare submodules to be loaded for a

Additional content appearing in this section has been removed.

Enabling Future Language Features

Changes to the language that may potentially break existing code in the future are introduced gradually. Initially, they appear as optional extensions, which are disabled by default. To turn on such extensions, use a special import statement of this form:

from __future__ import featurename

This statement should generally appear at the top of a module file (possibly after a docstring), because it enables special compilation of code on a per-module basis. It's also possible to submit this statement at the interactive prompt to experiment with upcoming language changes; the feature will then be available for the rest of the interactive session.

For example, we had to use this in Chapter 14 to demonstrate generator functions, which require a keyword that is not yet enabled by default (they use a featurename of generators). We also used this statement to activate true division for numbers in Chapter 4.

Mixed Usage Modes: __name__ and __main__

Here's a special module-related trick that lets you both import a file as a module, and run it as a standalone program. Each module has a built-in attribute called __name__, which Python sets automatically as follows:

If the file is being run as a top-level program file, __name__ is set to the string "__main__" when it starts.
If the file is being imported, __name__ is instead set to the module's name as known by its clients.

The upshot is that a module can test its own __name__ to determine whether it's being run or imported. For example, suppose we create the following module file, named runme.py, to export a single function called tester:

def tester():
    print "It's Christmas in Heaven..."

if __name__ == '__main__':         # Only when run
    tester()                       # Not when imported

This module defines a function for clients to import and use as usual:

% python
>>> import runme
>>> runme.tester()
It's Christmas in Heaven...

But the module also includes code at the bottom that is set up to call the function when this file is run as a program:

% python runme.py
It's Christmas in Heaven...

Perhaps the most common place you'll see the __name__ test applied is for self-test code: you can package code that tests a module's exports in the module itself, by wrapping it in a __name__ test at the bottom. This way, you can use the file in clients by importing it, and test its logic by running it from the system shell or other launching schemes. Chapter 26 will discuss other commonly used options for testing Python code.

Another common role for the __name__ trick is writing files whose functionality can be used as both a command-line utility and a tool library. For instance, suppose you write a file finder script in Python; you can get more mileage out of your code if you package it in functions and add a __name__ test in the file to automatically call those functions when the file is run standalone. That way, the script's code becomes reusable in other programs.
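The two modes can be simulated without leaving a single script. This sketch, in Python 3 syntax, executes the runme source twice with different preset values of __name__; the helper run_with_name is hypothetical, and exec here only approximates what running or importing a real file does.

```python
import contextlib
import io

source = '''
def tester():
    print("It's Christmas in Heaven...")

if __name__ == '__main__':    # only when run
    tester()                  # not when imported
'''

def run_with_name(module_name):
    """Execute the source with __name__ preset; return whatever it printed."""
    out = io.StringIO()
    with contextlib.redirect_stdout(out):
        exec(source, {"__name__": module_name})
    return out.getvalue()

as_program = run_with_name("__main__")   # like: python runme.py
as_import = run_with_name("runme")       # like: import runme
print(repr(as_program), repr(as_import))
```

Only the "__main__" run produces output; the import-style run defines tester and stays silent.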

Changing the Module Search Path

In Chapter 15, we mentioned that the module search path is a list of directories initialized from the PYTHONPATH environment variable, and possibly .pth path files. What we haven't shown you until now is how a Python program can actually change the search path, by changing a built-in list called sys.path (the path attribute in the built-in sys module). sys.path is initialized on startup, but thereafter, you can delete, append, and reset its components however you like:

>>> import sys
>>> sys.path
['', 'D:\\PP2ECD-Partial\\Examples', 'C:\\Python22', ...more deleted...]
>>> sys.path = [r'd:\temp']                  # Change module search path
>>> sys.path.append('c:\\lp2e\\examples')    # for this process only.
>>> sys.path
['d:\\temp', 'c:\\lp2e\\examples']
>>> import string
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ImportError: No module named string

You can use this to dynamically configure a search path inside a Python program. Be careful: if you delete a critical directory from the path, you may lose access to critical utilities. In the last command in the example, we no longer have access to the string module, since we deleted the Python source library's directory from the path. Also remember that such settings only endure for the Python session or program that made them; they are not retained after Python exits.
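A safer pattern than resetting the whole list is to extend it and restore it afterward. The following is a sketch in Python 3 syntax; the module name pathdemo_tool and the temporary directory are hypothetical, created only for this example.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical tool module in a directory that is not yet on the search path.
extra_dir = tempfile.mkdtemp()
with open(os.path.join(extra_dir, "pathdemo_tool.py"), "w") as f:
    f.write("answer = 42\n")

saved_path = list(sys.path)             # keep a copy so the change can be undone
try:
    importlib.import_module("pathdemo_tool")
    found_before = True
except ImportError:
    found_before = False

sys.path.append(extra_dir)              # affects this process only
importlib.invalidate_caches()
tool = importlib.import_module("pathdemo_tool")
found_after = tool.answer == 42

sys.path[:] = saved_path                # restore the original search path
print(found_before, found_after)
```

The module stays loaded after the path is restored, because already-imported modules are fetched from memory rather than searched for again.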

The import as Extension

Both the import and from statements have been extended to allow a module to be given a different name in your script:

import longmodulename as name

is equivalent to:

import longmodulename
name = longmodulename
del longmodulename          # Don't keep original name.

After the import, you can (and in fact must) use the name after the as to refer to the module. This works in a from statement too:

from module import longname as name

to assign the name from the file to a different name in your script. This extension is commonly used to provide short synonyms for longer names, and to avoid name clashes when you are already using a name in your script that would otherwise be overwritten by a normal import statement. This also comes in handy for providing a short, simple name for an entire directory path, when using the package import feature described in Chapter 17.
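Here is a small illustration of both forms using standard library modules; the short names col and path_join are arbitrary choices of ours:

```python
import collections as col                 # module is bound only to the name col
from os.path import join as path_join     # function is bound only to path_join

counts = col.Counter("spam")              # use the short synonym for the module
print(counts["s"])
print(path_join("a", "b"))

# The original names were never assigned in this namespace.
print("collections" in globals())
print("join" in globals())
```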

Additional content appearing in this section has been removed.

Module Design Concepts

Like functions, modules present design tradeoffs: deciding which functions go in which module, module communication mechanisms, and so on. Here are a few general ideas that will become clearer when you start writing bigger Python systems:

You're always in a module in Python. There's no way to write code that doesn't live in some module. In fact, code typed at the interactive prompt really goes in a built-in module called __main__; the only unique things about the interactive prompt are that code runs and is discarded immediately, and that expression results are printed.
Minimize module coupling: global variables. Like functions, modules work best if they're written to be closed boxes. As a rule of thumb, they should be as independent of global names in other modules as possible.
Maximize module cohesion: unified purpose. You can minimize a module's couplings by maximizing its cohesion; if all the components of a module share its general purpose, you're less likely to depend on external names.
Modules should rarely change other modules' variables. It's perfectly okay to use globals defined in another module (that's how clients import services), but changing globals in another module is often a symptom of a design problem. There are exceptions of course, but you should try to communicate results through devices such as function return values, not cross-module changes. Otherwise your globals' values become dependent on the order of arbitrarily remote assignments.

As a summary, Figure 18-1 sketches the environment in which modules operate. Modules contain variables, functions, classes, and other modules (if imported). Functions have local variables of their own. You'll meet classes—another object that lives within modules—in Chapter 19.

Figure 18-1: Module environment

Because modules expose most of their interesting properties as built-in attributes, it's easy to write programs that manage other programs. We usually call such manager programs

Additional content appearing in this section has been removed.

Module Gotchas

Here is the usual collection of boundary cases, which make life interesting for beginners. Some are so obscure it was hard to come up with examples, but most illustrate something important about Python.

The module name in an import or from statement is a hardcoded variable name. Sometimes, though, your program will get the name of a module to be imported as a string at runtime (e.g., if a user selects a module name from within a GUI). Unfortunately, you can't use import statements directly to load a module given its name as a string—Python expects a variable here, not a string. For instance:

>>> import "string"
  File "<stdin>", line 1
    import "string"
                  ^
SyntaxError: invalid syntax

It also won't work to put the string in a variable name:

x = "string"
import x

Here, Python will try to import a file x.py, not the string module.

To get around this, you need to use special tools to load modules dynamically from a string that exists at runtime. The most general approach is to construct an import statement as a string of Python code and pass it to the exec statement to run:

>>> modname = "string"
>>> exec "import " + modname       # Run a string of code.
>>> string                         # Imported in this namespace
<module 'string'>

The exec statement (and its cousin for expressions, the eval function) compiles a string of code, and passes it to the Python interpreter to be executed. In Python, the byte code compiler is available at runtime, so you can write programs that construct and run other programs like this. By default, exec runs the code in the current scope, but you can get more specific by passing in optional namespace dictionaries.
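To make the namespace dictionary option concrete, here is a minimal sketch in current Python 3, where exec is a built-in function rather than a statement. The dictionary name namespace is our own; as always, strings passed to exec should come from a trusted source:

```python
modname = "string"            # pretend this arrived at runtime, e.g. from user input
namespace = {}                # explicit dictionary that serves as the code's scope

exec("import " + modname, namespace)   # the import binds its name inside namespace

mod = namespace[modname]      # fetch the module object back out of the dictionary
print(mod.ascii_lowercase[:3])
```

Running the code in its own dictionary keeps the imported name out of the caller's scope until you choose to fetch it.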

The only real drawback to exec is that it must compile the import statement each time it runs; if it runs many times, your code may run quicker if it uses the built-in __import__ function to load from a name string instead. The effect is similar, but __import__ returns the module object, so assign it to a name here to keep it:

>>> modname = "string"
>>> string = __import__(modname)
>>>
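In current Python 3 releases, the standard library's importlib.import_module function is a further option for loading a module from a name string. The sketch below compares it with __import__ on a dotted name; the variable names are ours:

```python
import importlib

modname = "string"
string_mod = importlib.import_module(modname)   # returns the module object
print(string_mod.digits)

# For dotted names, import_module returns the submodule itself,
# whereas a bare __import__ call returns the top-level package.
sub_mod = importlib.import_module("xml.dom")
top_mod = __import__("xml.dom")
print(sub_mod.__name__, top_mod.__name__)
```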

Additional content appearing in this section has been removed.

Part V Exercises

See Section B.5 for the solutions.

1. Basics, import. Write a program that counts lines and characters in a file (similar in spirit to "wc" on Unix). With your text editor, code a Python module called mymod.py, which exports three top-level names:
- A countLines(name) function that reads an input file and counts the number of lines in it (hint: file.readlines() does most of the work for you, and len does the rest)
- A countChars(name) function that reads an input file and counts the number of characters in it (hint: file.read() returns a single string)
- A test(name) function that calls both counting functions with a given input filename. Such a filename generally might be passed in, hardcoded, input with raw_input, or pulled from a command line via the sys.argv list; for now, assume it's a passed-in function argument.
All three mymod functions should expect a filename string to be passed in. If you type more than two or three lines per function, you're working much too hard; use the hints listed above!
Next, test your module interactively, using import and name qualification to fetch your exports. Does your PYTHONPATH need to include the directory where you created mymod.py? Try running your module on itself: e.g., test("mymod.py"). Note that test opens the file twice; if you're feeling ambitious, you may be able to improve this by passing an open file object into the two count functions (hint: file.seek(0) is a file rewind).
2. from/from*. Test your mymod module from Exercise 1 interactively, by using from to load the exports directly, first by name, then using the from* variant to fetch everything.
3. __main__. Add a line in your mymod module that calls the test function automatically only when the module is run as a script, not when it is imported. The line you add will probably test the value of __name__ for the string "__main__", as shown in this chapter. Try running your module from the system command line; then, import the module and test its functions interactively. Does it still work in both modes?

Additional content appearing in this section has been removed.

Chapter 19: OOP: The Big Picture

So far in this book, we've been using the term "object" generically. Really, the code written up to this point has been object-based—we've passed objects around, used them in expressions, called their methods, and so on. To qualify as being truly object-oriented (OO), though, objects generally need to also participate in something called an inheritance hierarchy.

This chapter begins the exploration of the Python class—a device used to implement new kinds of objects in Python. Classes are Python's main object-oriented programming (OOP) tool, so we'll also look at OOP basics along the way in this part of the book. In Python, classes are created with a new statement: the class. As we'll see, the objects defined with classes can look a lot like the built-in types we saw earlier in the book. They will also support inheritance—a mechanism of code customization and reuse, above and beyond anything we've seen so far.

One note up front: Python OOP is entirely optional, and you don't need to use classes just to get started. In fact, you can get plenty of work done with simpler constructs such as functions, or even simple top-level script code. But classes turn out to be one of the most useful tools Python provides, and we will show you why here. They're also employed in popular Python tools like the Tkinter GUI API, so most Python programmers will usually find at least a working knowledge of class basics helpful.

Remember when we told you that programs do things with stuff? In simple terms, classes are just a way to define new sorts of stuff, which reflect real objects in your program's domain. For instance, suppose we've decided to implement that hypothetical pizza-making robot we used as an example in Chapter 12. If we implement it using classes, we can model more of its real-world structure and relationships:

Inheritance: Pizza-making robots are a kind of robot, and so possess the usual robot-y properties. In OOP terms, we say they inherit properties from the general category of all robots. These common properties need to be implemented only once for the general case and reused by all types of robots we may build in the future.

Additional content appearing in this section has been removed.

Why Use Classes?

Composition: Pizza-making robots are really collections of components that work together as a team. For instance, for our robot to be successful, it might need arms to roll dough, motors to maneuver to the oven, and so on. In OOP parlance, our robot is an example of composition; it contains other objects it activates to do its bidding. Each component might be coded as a class, which defines its own behavior and relationships.
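To give the robot ideas some shape, here is a hedged sketch in current Python 3 that combines both relationships. All class and method names (Arm, Motor, Robot, PizzaRobot, and their methods) are hypothetical stand-ins, not the book's own example code:

```python
class Arm:
    def roll_dough(self):
        return "dough rolled"


class Motor:
    def move_to(self, place):
        return "moved to " + place


class Robot:
    def __init__(self, name):
        self.name = name

    def status(self):
        return self.name + " is online"


class PizzaRobot(Robot):               # inheritance: a PizzaRobot is a kind of Robot
    def __init__(self, name):
        Robot.__init__(self, name)     # reuse the general robot setup
        self.arm = Arm()               # composition: the robot contains components
        self.motor = Motor()

    def make_pizza(self):
        # Activate the embedded components to do the work.
        return [self.arm.roll_dough(), self.motor.move_to("oven")]


bob = PizzaRobot("bob")
print(bob.status())                    # behavior inherited from Robot
print(bob.make_pizza())                # behavior delegated to components
```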

General OOP ideas like inheritance and composition apply to any application that can be decomposed into a set of objects. For example, in typical GUI systems, interfaces are written as collections of widgets—buttons, labels, and so on—which are all drawn when their container is drawn (composition). Moreover, we may be able to write our own custom widgets—buttons with unique fonts, labels with new color schemes, and the like—which are specialized versions of more general interface devices (inheritance).

From a more concrete programming perspective, classes are a Python program unit, just like functions and modules. They are another compartment for packaging logic and data. In fact, classes also define a new namespace much like modules. But compared to other program units we've already seen, classes have three critical distinctions that make them more useful when it comes to building new objects:

Multiple instances: Classes are roughly factories for generating one or more objects. Every time we call a class, we generate a new object, with a distinct namespace. Each object generated from a class has access to the class's attributes and gets a namespace of its own for data that varies per object.

Additional content appearing in this section has been removed.

OOP from 30,000 Feet

Before we show what this all means in terms of code, we'd like to say a few words about the general ideas behind OOP here. If you've never done anything object-oriented in your life before now, some of the words we'll be using in this chapter may seem a bit perplexing on the first pass. Moreover, the motivation for using such words may be elusive, until you've had a chance to study the ways that programmers apply them in larger systems. OOP is as much an experience as a technology.

The good news is that OOP is much simpler to understand and use in Python than in other languages such as C++ or Java. As a dynamically-typed scripting language, Python removes much of the syntactic clutter and complexity that clouds OOP in other tools. In fact, most of the OOP story in Python boils down to this expression:

                  object.attribute

We've been using this all along in the book so far, to access module attributes, call methods of objects, and so on. When we say this to an object that is derived from a class statement, the expression kicks off a search in Python—it searches a tree of linked objects, for the first appearance of the attribute that it can find. In fact, when classes are involved, the Python expression above translates to the following in natural language:

Find the first occurrence of attribute by looking in object, and all classes above it, from bottom to top and left to right.

In other words, attribute fetches are simply tree searches. We call this search procedure inheritance, because objects lower in a tree inherit attributes attached to objects higher in a tree, just because the attribute search proceeds from bottom to top in the tree. In a sense, the automatic search performed by inheritance means that objects linked into a tree are the union of all the attributes defined in all their tree parents, all the way up the tree.
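A tiny class tree makes the search order visible. This is only a sketch in current Python 3 with invented names (Left, Right, Bottom); it avoids diamond-shaped trees, where the exact search order depends on the class model in use:

```python
class Left:
    x = "Left.x"
    z = "Left.z"


class Right:
    w = "Right.w"
    z = "Right.z"


class Bottom(Left, Right):     # two superclasses, searched left to right
    pass


obj = Bottom()
obj.name = "obj.name"          # attached to the instance itself

print(obj.name)   # found in the instance, at the bottom of the tree
print(obj.x)      # found in Left
print(obj.w)      # found in Right, after Left is checked
print(obj.z)      # found in Left first, because Left is listed first
```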

In Python, this is all very literal: we really do build up trees of linked objects with code, and Python really does climb this tree at runtime searching for attributes, every time we say

Additional content appearing in this section has been removed.

Chapter 20: Class Coding Basics

Now that we've talked about OOP in the abstract, let's move on to the details of how this translates to actual code. In this chapter and in Chapter 21, we fill in the syntax details behind the class model in Python.

If you've never been exposed to OOP in the past, classes can be somewhat complicated if taken in a single dose. To make class coding easier to absorb, we'll begin our detailed look at OOP by taking a first look at classes in action in this chapter. We'll expand on the details introduced here in later chapters of this part of the book; but in their basic form, Python classes are easy to understand.

Classes have three primary distinctions. At a base level, they are mostly just namespaces, much like the modules studied in Part V. But unlike modules, classes also have support for generating multiple objects, namespace inheritance, and operator overloading. Let's begin our class statement tour by exploring each of these three distinctions in turn.

To understand how the multiple objects idea works, you have to first understand that there are two kinds of objects in Python's OOP model—class objects and instance objects. Class objects provide default behavior and serve as factories for instance objects. Instance objects are the real objects your programs process; each is a namespace in its own right, but inherits (i.e., has automatic access to) names in the class it was created from. Class objects come from statements, and instances from calls; each time you call a class, you get a new instance of that class.

This object generation concept is very different from any of the other program constructs we've seen so far in this book. In effect, classes are factories for making many instances. By contrast, there is only one copy of each module imported (in fact, this is one reason that we have to call reload, to update the single module object).
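The contrast can be seen in a few lines. The sketch below uses current Python 3 and a made-up Tally class; each call to the class yields a separate instance, while a module is represented by one shared object:

```python
import math
import sys


class Tally:
    def __init__(self):
        self.count = 0        # data stored separately in each instance

    def bump(self):
        self.count += 1


first = Tally()               # each call to the class makes a new instance
second = Tally()
first.bump()
first.bump()
second.bump()
print(first.count, second.count)

print(sys.modules["math"] is math)   # only one math module object per process
```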

Additional content appearing in this section has been removed.

Classes Generate Multiple Instance Objects

Next, we'll summarize the bare essentials of Python OOP. Classes are in some ways similar to both def and modules, but they may be quite different than what you're used to in other languages.

The class statement creates a class object and assigns it a name. Just like the function def statement, the Python class statement is an executable statement. When reached and run, it generates a new class object and assigns it to the name in the class header. Also like def, class statements typically run when the file they are coded in is first imported.
Assignments inside class statements make class attributes. Just like module files, assignments within a class statement generate attributes in a class object. After running a class statement, class attributes are accessed by name qualification: object.name.
Class attributes provide object state and behavior. Attributes of a class object record state information and behavior, to be shared by all instances created from the class; function def statements nested inside a

Additional content appearing in this section has been removed.

Classes Are Customized by Inheritance

Besides serving as object generators, classes also allow us to make changes by introducing new components (called subclasses), instead of changing existing components in place. Instance objects generated from a class inherit the class's attributes. Python also allows classes to inherit from other classes, and this opens the door to coding hierarchies of classes, that specialize behavior by overriding attributes lower in the hierarchy. Here, too, there is no parallel in modules: their attributes live in a single, flat namespace.

In Python, instances inherit from classes, and classes inherit from superclasses. Here are the key ideas behind the machinery of attribute inheritance:

Superclasses are listed in parentheses in a class header. To inherit attributes from another class, just list the class in parentheses in a class statement's header. The class that inherits is called a subclass, and the class that is inherited from is its superclass.
Classes inherit attributes from their superclasses. Just like instances, a class gets all the attribute names defined in its superclasses; they're found by Python automatically when accessed, if they don't exist in the subclass.
Instances inherit attributes from all accessible classes. Instances get names from the class they are generated from, as well as all of that class's superclasses. When looking for a name, Python checks the instance, then its class, then all superclasses above.
Each object.attribute reference invokes a new, independent search. Python performs an independent search of the class tree, for each attribute fetch expression. This includes both references to instances and classes made outside class statements (e.g., X.attr), as well as references to attributes of the self instance argument in class method functions. Each self.attr in a method invokes a new search for attr in self and above.
Logic changes are made by subclassing, not by changing superclasses.
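The ideas in this list can be sketched with two small classes. The names Greeter and LoudGreeter are hypothetical, and the code uses current Python 3 syntax:

```python
class Greeter:
    punctuation = "."

    def greet(self, name):
        # Each self.punctuation reference starts a new search at the instance.
        return "Hello, " + name + self.punctuation


class LoudGreeter(Greeter):       # superclass listed in parentheses
    punctuation = "!"             # replaces the inherited name; Greeter is untouched


print(Greeter().greet("Ann"))
print(LoudGreeter().greet("Ann"))
```

The greet method is coded once in the superclass, yet it picks up the subclass's punctuation because the search for self.punctuation begins at the instance and reaches LoudGreeter before Greeter.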

Additional content appearing in this section has been removed.

Classes Can Intercept Python Operators

Let's take a look at the third major distinction of classes: operator overloading. In simple terms, operator overloading lets objects coded with classes intercept and respond to operations that work on built-in types: addition, slicing, printing, qualification, and so on. It's mostly just an automatic dispatch mechanism: expressions route control to implementations in classes. Here, too, there is nothing similar in modules: modules can implement function calls, but not the behavior of expressions.

Although we could implement all class behavior as method functions, operator overloading lets objects be more tightly integrated with Python's object model. Moreover, because operator overloading makes our own objects act like built-ins, it tends to foster object interfaces that are more consistent and easier to learn. Here are the main ideas behind overloading operators:

Methods with names such as __X__ are special hooks. Python operator overloading is implemented by providing specially named methods to intercept operations.
Such methods are called automatically when Python evaluates operators. For instance, if an object inherits an __add__ method, it is called when the object appears in a + expression.
Classes may override most built-in type operations. There are dozens of special operator method names, for intercepting and implementing nearly every operation available for built-in types.
Operators allow classes to integrate with Python's object model. By overloading type operations, user-defined objects implemented with classes act just like built-ins, and so provide consistency.
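As a quick illustration of these hooks, here is a hypothetical Vec2 class in current Python 3. It is our own sketch, separate from the book's examples:

```python
class Vec2:
    def __init__(self, x, y):          # runs on Vec2(x, y)
        self.x, self.y = x, y

    def __add__(self, other):          # runs on self + other
        return Vec2(self.x + other.x, self.y + other.y)

    def __mul__(self, factor):         # runs on self * factor
        return Vec2(self.x * factor, self.y * factor)

    def __repr__(self):                # runs when the object is printed or echoed
        return "Vec2(%r, %r)" % (self.x, self.y)


v = Vec2(1, 2) + Vec2(3, 4)
print(v)
print(v * 2)
```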

On to another example. This time, we define a subclass of SecondClass, which implements three specially-named attributes that Python will call automatically: __init__ is called when a new instance object is being constructed (self is the new ThirdClass object), and __add__ and __mul__ are called when a ThirdClass instance appears in

Additional content appearing in this section has been removed.

Chapter 21: Class Coding Details

Did all of Chapter 20 make sense? If not, don't worry; now that we've had a quick tour, we're going to dig a bit deeper and study the concepts we've introduced in further detail. This chapter takes a second pass, to formalize and expand on some of the class coding ideas introduced in Chapter 20.

Although the Python class statement seems similar to other OOP languages on the surface, on closer inspection it is quite different than what some programmers are used to. For example, as in C++, the class statement is Python's main OOP tool. Unlike C++, Python's class is not a declaration. Like def, class is an object builder, and an implicit assignment—when run, it generates a class object, and stores a reference to it in the name used in the header. Also like def, class is true executable code—your class doesn't exist until Python reaches and runs the class statement (typically, while importing the module it is coded in, but not until).

class is a compound statement with a body of indented statements usually under it. In the header, superclasses are listed in parentheses after the class name, separated by commas. Listing more than one superclass leads to multiple inheritance (which we'll say more about in the next chapter). Here is the statement's general form:

class <name>(superclass,...):       # Assign to name.
    data = value                    # Shared class data
    def method(self,...):           # Methods
        self.member = value         # Per-instance data

Within the class statement, any assignment generates a class attribute, and specially-named methods overload operators; for instance, a function called __init__ is called at instance object construction time, if defined.
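Filling in the general form with concrete names may help. In this sketch (current Python 3, with a made-up Ticket class), surcharge is shared class data and price is per-instance data:

```python
class Ticket:
    surcharge = 2                      # assignment in the class body: shared class data

    def __init__(self, price):         # called when an instance is constructed
        self.price = price             # assignment to self: per-instance data

    def total(self):
        return self.price + self.surcharge


a = Ticket(10)
b = Ticket(20)
Ticket.surcharge = 5                   # changing the class attribute is seen by both
print(a.total(), b.total())
print("surcharge" in a.__dict__)       # the shared name is not stored on the instance
```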

Classes are mostly just namespaces—a tool for defining names (i.e., attributes) that export data and logic to clients. So how do you get from the class statement to a namespace?

Here's how. Just as with module files, the statements nested in a class statement body create its attributes. When Python executes a class statement (not a call to a class), it runs all the statements in its body, from top to bottom. Assignments that happen during this process create names in the class's local scope, which become attributes in the associated class object. Because of this, classes resemble both modules and functions:

Additional content appearing in this section has been removed.

The Class Statement

Like functions, class statements are a local scope where names created by nested assignments live.

Additional content appearing in this section has been removed.

Methods

Since you already know about functions, you also know about methods in classes. Methods are just function objects created by def statements nested in a class statement's body. From an abstract perspective, methods provide behavior for instance objects to inherit. From a programming perspective, methods work in exactly the same way as simple functions, with one crucial exception: their first argument always receives the instance object that is the implied subject of a method call.

In other words, Python automatically maps instance method calls to class method functions as follows. Method calls made through an instance:

instance.method(args...)

are automatically translated to class method function calls of this form:

class.method(instance, args...)

where the class is determined by locating the method name using Python's inheritance search procedure. In fact, both call forms are valid in Python.

Besides the normal inheritance of method attribute names, the special first argument is the only real magic behind method calls. In a class method, the first argument is usually called self by convention (technically, only its position is significant, not its name). This argument provides methods with a hook back to the instance: because classes generate many instance objects, they need to use this argument to manage data that varies per instance.

C++ programmers may recognize Python's self argument as similar to C++'s "this" pointer. In Python, though, self is always explicit in your code. Methods must always go through self to fetch or change attributes of the instance being processed by the current method call. This explicit nature of self is by design—the presence of this name makes it obvious that you are using attribute names in your script, not a name in the local or global scope.

Let's turn to an example; suppose we define the following class:

class NextClass:                            # Define class.
    def printer(self, text):                # Define method.
        self.message = text                 # Change instance.
        print self.message                  # Access instance.
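For readers on current Python 3, where print is a function, the same class can be written and exercised as follows. The instance name x and the message strings are our own; both call forms described earlier are shown:

```python
class NextClass:                            # Define class.
    def printer(self, text):                # Define method.
        self.message = text                 # Change instance.
        print(self.message)                 # Access instance.


x = NextClass()                             # Make an instance.
x.printer("instance call")                  # Python passes x in as self.
NextClass.printer(x, "class call")          # Same method, instance passed explicitly.
print(x.message)                            # The last call's text is stored on x.
```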

Additional content appearing in this section has been removed.

Inheritance

The whole point of a namespace tool like the class statement is to support name inheritance. This section expands on some of the mechanisms and roles of attribute inheritance.

In Python, inheritance happens when an object is qualified, and involves searching an attribute definition tree (one or more namespaces). Every time you use an expression of the form object.attr where object is an instance or class object, Python searches the namespace tree at and above object, for the first attr it can find. This includes references to self attributes in your methods. Because lower definitions in the tree override higher ones, inheritance forms the basis of specialization.

Figure 21-1 summarizes the way namespace trees are constructed and populated with names. Generally:

Instance attributes are generated by assignments to self attributes in methods.
Class attributes are created by statements (assignments) in class statements.
Superclass links are made by listing classes in parentheses in a class statement header.

Figure 21-1: Namespaces tree construction and inheritance

The net result is a tree of attribute namespaces, which grows from an instance, to the class it was generated from, to all the superclasses listed in the class headers. Python searches upward in this tree from instances to superclasses, each time you use qualification to fetch an attribute name from an instance object.
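The tree can be inspected directly through a few built-in attributes. The following sketch uses current Python 3 and hypothetical class names (Base, Derived); __dict__, __class__, and __bases__ are standard attributes of instances and classes:

```python
class Base:
    def load(self):
        self.data1 = "spam"            # instance attribute, made by assigning to self


class Derived(Base):                   # superclass link made in the class header
    def tag(self):
        self.data2 = "eggs"


x = Derived()
x.load()                               # inherited method, still assigns on x
x.tag()

print(sorted(x.__dict__))                        # names stored on the instance
print(x.__class__.__name__)                      # link from instance to its class
print([c.__name__ for c in Derived.__bases__])   # links from class to superclasses
print("load" in Derived.__dict__, "load" in Base.__dict__)
```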

The tree-searching model of inheritance just described turns out to be a great way to specialize systems. Because inheritance finds names in subclasses before it checks superclasses, subclasses can replace default behavior by redefining the superclass's attributes. In fact, you can build entire systems as hierarchies of classes, which are extended by adding new external subclasses rather than changing existing logic in place.

The idea of redefining inherited names leads to a variety of specialization techniques. For instance, subclasses may replace inherited attributes completely, provide attributes that a superclass expects to find, and

Additional content appearing in this section has been removed.

Operator Overloading

We introduced operator overloading in the prior chapter; let's fill in more details here and look at a few commonly used overloading methods. Here's a review of the key ideas behind overloading:

Operator overloading lets classes intercept normal Python operations.
Classes can overload all Python expression operators.
Classes can also overload operations: printing, calls, qualification, etc.
Overloading makes class instances act more like built-in types.
Overloading is implemented by providing specially named class methods.

Here's a simple example of overloading at work. When we provide specially named methods in a class, Python automatically calls them when instances of the class appear in the associated operation. For instance, the Number class in file number.py below provides a method to intercept instance construction (__init__), as well as one for catching subtraction expressions (__sub__). Special methods are the hook that lets you tie into built-in operations:

class Number:
    def __init__(self, start):               # On Number(start)
        self.data = start
    def __sub__(self, other):                # On instance - other
        return Number(self.data - other)     # Result is a new instance.

>>> from number import Number               # Fetch class from module.
>>> X = Number(5)                           # Number.__init__(X, 5)
>>> Y = X - 2                               # Number.__sub__(X, 2)
>>> Y.data                                  # Y is new Number instance.
3

Just about everything you can do to built-in objects such as integers and lists has a corresponding specially named method for overloading in classes. Table 21-1 lists a few of the most common; there are many more. In fact, many overload methods come in multiple versions (e.g., __add__, __radd__, and __iadd__ for addition). See other Python books or the Python Language Reference Manual for an exhaustive list of special method names available.
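To show how the three addition variants differ, here is a brief sketch in current Python 3. The Total class is invented for illustration:

```python
class Total:
    def __init__(self, value):
        self.value = value

    def __add__(self, other):            # instance + other
        return Total(self.value + other)

    def __radd__(self, other):           # other + instance, when other can't handle it
        return Total(other + self.value)

    def __iadd__(self, other):           # instance += other, changes the object in place
        self.value += other
        return self


t = Total(5)
a = t + 1          # calls Total.__add__
b = 1 + t          # int cannot add a Total, so Total.__radd__ runs
t += 10            # calls Total.__iadd__
print(a.value, b.value, t.value)
```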

Table 21-1: Common operator overloading methods

Namespaces: The Whole Story

Now that we've seen class and instance objects, the Python namespace story is complete; for reference, let's quickly summarize all the rules used to resolve names. The first things you need to remember are that qualified and unqualified names are treated differently, and that some scopes serve to initialize object namespaces:

Unqualified names (e.g., X) deal with scopes.
Qualified attribute names (e.g., object.X) use object namespaces.
Some scopes initialize object namespaces (modules and classes).

Unqualified names follow the LEGB lexical scoping rules outlined for functions in Chapter 13:

Assignment: X = value: Makes names local: creates or changes name X in the current local scope, unless declared global
Reference: X: Looks for name X in the current local scope, then any and all enclosing functions, then the current global scope, then the built-in scope
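The two rules for unqualified names can be tried out in a few lines. This is only a sketch; the function names are made up for illustration:

```python
X = 'global'                      # assigned at the top level: global scope

def outer():
    X = 'enclosing'               # assignment makes X local to outer
    def inner():
        return X                  # reference: local, then enclosing, then global
    return inner()

def reads_global():
    return X                      # no local X here, so the global one is found

def rebinds_global():
    global X                      # declared global: assignment changes the global X
    X = 'changed'

from_inner = outer()
before = reads_global()
rebinds_global()
after = reads_global()
builtin = len                     # never assigned here: found in the built-in scope
```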

Qualified names refer to attributes of specific objects and obey the rules for modules and classes. For class and class instance objects, the reference rules are augmented to include the inheritance search procedure:

Assignment: object.X = value: Creates or alters the attribute name X in the namespace of the object being qualified, and no other. Inheritance tree climbing only happens on attribute reference, not on attribute assignment.
Reference: object.X: For class-based objects, searches for the attribute name X in the object, then in all accessible classes above it, using the inheritance search procedure. For non-class objects such as modules, fetches X from object directly.
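A small sketch may help make the asymmetry between attribute assignment and attribute reference concrete. The class names Super and Sub are hypothetical:

```python
class Super:
    X = 'class attribute'          # lives in the namespace of Super

class Sub(Super):
    pass

obj = Sub()
inherited = obj.X                  # reference: searches obj, then Sub, then Super
obj.X = 'instance attribute'       # assignment: changes obj's namespace only

in_instance = 'X' in obj.__dict__  # the new name is stored on the instance
in_sub = 'X' in Sub.__dict__       # nothing was created on the class
```

After the assignment, Super.X is unchanged, and a fresh Sub instance still inherits the class-level value.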

With distinct search procedures for qualified and unqualified names, and multiple lookup layers for both, it can sometimes be confusing to know where a name will wind up going. In Python, the place where you assign a name is crucial—it fully determines which scope or which object a name will reside in. File manynames.py illustrates and summarizes how this translates to code:

Chapter 22: Designing with Classes

So far, we've concentrated on the OOP tool in Python—the class. But OOP is also about design issues—how to use classes to model useful objects. This section will touch on a few OOP core ideas, and look at some additional examples that are more realistic than the examples shown so far. Many of the design terms mentioned here require more explanation than we can provide; if this section sparks your curiosity, we suggest exploring a text on OOP design or design patterns as a next step.

Python's implementation of OOP can be summarized by three ideas:

Inheritance: Is based on attribute lookup in Python (in X.name expressions).
Polymorphism: In X.method, the meaning of method depends on the type (class) of X.
Encapsulation: Methods and operators implement behavior; data hiding is a convention by default.

By now, you should have a good feel for what inheritance is all about in Python. We've talked about Python's polymorphism a few times already; it flows from Python's lack of type declarations. Because attributes are always resolved at runtime, objects that implement the same interfaces are interchangeable. Clients don't need to know what sort of object is implementing a method they call.

Encapsulation means packaging in Python—hiding implementation details behind an object's interface; it does not mean enforced privacy, as you'll see in Chapter 23. Encapsulation allows the implementation of an object's interface to be changed, without impacting the users of that object.

Some OOP languages also define polymorphism to mean overloading functions based on the type signatures of their arguments. Since there is no type declaration in Python, the concept doesn't really apply; polymorphism in Python is based on object interfaces, not types. For example, you can try to overload methods by their argument lists:

class C:
    def meth(self, x):
        ...
    def meth(self, x, y, z):
        ...

This code will run, but because the def simply assigns an object to a name in the class's scope, the last definition of a method function is the only one retained (it's just as if you say

Python and OOP

Python's implementation of OOP can be summarized by three ideas:

Inheritance: Is based on attribute lookup in Python (in X.name expressions).
Polymorphism: In X.method, the meaning of method depends on the type (class) of X.
Encapsulation: Methods and operators implement behavior; data hiding is a convention by default.

class C:
    def meth(self, x):
        ...
    def meth(self, x, y, z):
        ...

This code will run, but because the def simply assigns an object to a name in the class's scope, the last definition of a method function is the only one retained (it's just as if you say X=1, and then X=2; X will be 2).
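To see the effect, here is the same class with the elided bodies filled in by placeholder return values (our own, purely for illustration):

```python
class C:
    def meth(self, x):
        return 'one argument'
    def meth(self, x, y, z):            # rebinds the name meth in C's scope
        return 'three arguments'

c = C()
result = c.meth(1, 2, 3)                # only the last definition survives

try:
    c.meth(1)                           # the one-argument version is gone
except TypeError:
    one_arg_call_failed = True
else:
    one_arg_call_failed = False
```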

Type-based selections can always be coded using the type testing ideas we met in Chapter 7, or the argument list tools in Chapter 13:

class C:
    def meth(self, *args):
        if len(args) == 1:
            ...
        elif type(args[0]) == int:
            ...

You normally shouldn't do this, though—as described in Chapter 12, write your code to expect an object interface, not a specific datatype. That way, it becomes useful for a broader category of types and applications, now and in the future:
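As a rough sketch of that advice (the function and class below are hypothetical, not the book's own listing), code that relies only on an interface keeps working for types it has never heard of:

```python
def total_length(items):
    """Sum len() over any iterable whose members support len()."""
    total = 0
    for item in items:              # needs only the iteration interface
        total += len(item)          # needs only the len() interface
    return total

class Deck:
    """Hypothetical user-defined type that supports len()."""
    def __init__(self, cards):
        self.cards = cards
    def __len__(self):
        return len(self.cards)

# A string, a tuple, a dictionary, and a class instance all qualify
mixed = total_length(['spam', (1, 2), {'a': 1}, Deck([1, 2, 3])])
```

Nothing in total_length tests a type, so any object that answers len() can take part.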

Classes as Records

Chapter 6 showed you how to use dictionaries to record properties of entities in your program. Let's explore this in more detail. Here is the example for dictionary-based records used earlier:

>>> rec = {  }
>>> rec['name'] = 'mel'
>>> rec['age']  = 40
>>> rec['job']  = 'trainer/writer'
>>>
>>> print rec['name']
mel

This code emulates things like "records" and "structs" in other languages. It turns out that there are multiple ways to do the same with classes. Perhaps the simplest is this:

>>> class rec: pass
...
>>> rec.name = 'mel'
>>> rec.age  = 40
>>> rec.job  = 'trainer/writer'
>>>
>>> print rec.age
40

This code has substantially less syntax than the dictionary equivalent. It uses an empty class statement to generate an empty namespace object (notice the pass statement—we need a statement syntactically even though there is no logic to code in this case). Once we make the empty class, we fill it out by assigning to class attributes over time.

This works, but we'll need a new class statement for each distinct record we will need. Perhaps more typically, we can instead generate instances of an empty class to represent each distinct entity:

>>> class rec: pass
...
>>> pers1 = rec(  )
>>> pers1.name = 'mel'
>>> pers1.job  = 'trainer'
>>> pers1.age   = 40
>>>
>>> pers2 = rec(  )
>>> pers2.name = 'dave'
>>> pers2.job  = 'developer'
>>>
>>> pers1.name, pers2.name
('mel', 'dave')

Here, we make two records from the same class; instances start out life empty, just like classes. We fill in the record by assigning to attributes. This time, though, there are two separate objects, and hence two separate name attributes. In fact, instances of the same class don't even have to have the same set of attribute names; in this example, one has a unique age name. Instances really are distinct namespaces—each has a distinct attribute dictionary. Although they are normally filled out consistently by class methods, they are more flexible than you might expect.
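One way to check the claim about distinct namespaces is to look at each instance's attribute dictionary directly. This sketch repeats the record setup in condensed form:

```python
class rec: pass

pers1 = rec()
pers1.name = 'mel'
pers1.age = 40

pers2 = rec()
pers2.name = 'dave'

names1 = sorted(pers1.__dict__)     # attribute names stored on pers1
names2 = sorted(pers2.__dict__)     # attribute names stored on pers2
has_age = hasattr(pers2, 'age')     # pers2 never got an age attribute
```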

Finally, you might instead code a more full-blown class to implement your record:

OOP and Inheritance: "is-a" Relationships

We've talked about the mechanics of inheritance in depth already, but we'd like to show you an example of how it can be used to model real-world relationships. From a programmer's point of view, inheritance is kicked off by attribute qualifications, and triggers a search for a name in an instance, its class, and then its superclasses. From a designer's point of view, inheritance is a way to specify set membership: a class defines a set of properties that may be inherited and customized by more specific sets (i.e., subclasses).

To illustrate, let's put that pizza-making robot we talked about at the start of this part of the book to work. Suppose we've decided to explore alternative career paths and open a pizza restaurant. One of the first things we'll need to do is hire employees to serve customers, make the pizza, and so on. Being engineers at heart, we've also decided to build a robot to make the pizzas; but being politically and cybernetically correct, we've also decided to make our robot a full-fledged employee, with a salary.

Our pizza shop team can be defined by the following classes in the example file employees.py. It defines four classes and some self-test code. The most general class, Employee, provides common behavior such as bumping up salaries (giveRaise) and printing (__repr__). There are two kinds of employees, and so two subclasses of Employee: Chef and Server. Both override the inherited work method to print more specific messages. Finally, our pizza robot is modeled by an even more specific class: PizzaRobot is a kind of Chef, which is a kind of Employee. In OOP terms, we call these relationships "is-a" links: a robot is a chef, which is a(n) employee.

class Employee:
    def __init__(self, name, salary=0):
        self.name   = name
        self.salary = salary
    def giveRaise(self, percent):
        self.salary = self.salary + (self.salary * percent)
    def work(self):
        print self.name, "does stuff"
    def __repr__(self):
        return "<Employee: name=%s, salary=%s>" % (self.name, self.salary)
class Chef(Employee):
    def __init__(self, name):
        Employee.__init__(self, name, 50000)
    def work(self):
        print self.name, "makes food"
class Server(Employee):
    def __init__(self, name):
        Employee.__init__(self, name, 40000)
    def work(self):
        print self.name, "interfaces with customer"
class PizzaRobot(Chef):
    def __init__(self, name):
        Chef.__init__(self, name)
    def work(self):
        print self.name, "makes pizza"
if __name__ == "__main__":
    bob = PizzaRobot('bob')       # Make a robot named bob.
    print bob                     # Runs inherited __repr__
    bob.work(  )                       # Run type-specific action.
    bob.giveRaise(0.20)           # Give bob a 20% raise.
    print bob; print
    for klass in Employee, Chef, Server, PizzaRobot:
        obj = klass(klass.__name__)
        obj.work(  )
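The "is-a" links can also be checked directly with the built-in isinstance and issubclass functions. The sketch below uses a condensed stand-in for employees.py, with two assumptions of our own: Server is left out, and work returns its message instead of printing it so that the result is easy to inspect:

```python
class Employee:
    def __init__(self, name, salary=0):
        self.name = name
        self.salary = salary
    def work(self):
        return self.name + " does stuff"

class Chef(Employee):
    def __init__(self, name):
        Employee.__init__(self, name, 50000)
    def work(self):
        return self.name + " makes food"

class PizzaRobot(Chef):                       # inherits Chef's constructor
    def work(self):
        return self.name + " makes pizza"

bob = PizzaRobot('bob')
is_chef = isinstance(bob, Chef)               # a robot is a chef
is_employee = isinstance(bob, Employee)       # ... which is an employee
robot_is_employee = issubclass(PizzaRobot, Employee)
chef_is_robot = issubclass(Chef, PizzaRobot)  # the link does not run upward
```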

OOP and Composition: "has-a" Relationships

We introduced the notion of composition in Chapter 19. From a programmer's point of view, composition involves embedding other objects in a container object and activating them to implement container methods. To a designer, composition is another way to represent relationships in a problem domain. But rather than set membership, composition has to do with components—parts of a whole. Composition also reflects the relationships between parts; it's usually called a "has-a" relationship. Some OO design texts refer to composition as aggregation (or distinguish between the two terms by using aggregation for a weaker dependency between container and contained); in this text, "composition" simply refers to a collection of embedded objects.

Now that we've implemented our employees, let's put them in the pizza shop and let them get busy. Our pizza shop is a composite object; it has an oven, and employees like servers and chefs. When a customer enters and places an order, the components of the shop spring into action—the server takes an order, the chef makes the pizza, and so on. The following example, file pizzashop.py, simulates all the objects and relationships in this scenario:

from employees import PizzaRobot, Server
class Customer:
    def __init__(self, name):
        self.name = name
    def order(self, server):
        print self.name, "orders from", server
    def pay(self, server):
        print self.name, "pays for item to", server
class Oven:
    def bake(self):
        print "oven bakes"
class PizzaShop:
    def __init__(self):
        self.server = Server('Pat')         # Embed other objects.
        self.chef   = PizzaRobot('Bob')     # A robot named bob
        self.oven   = Oven(  )
    def order(self, name):
        customer = Customer(name)           # Activate other objects.
        customer.order(self.server)         # Customer orders from server.
        self.chef.work(  )
        self.oven.bake(  )
        customer.pay(self.server)
if __name__ == "__main__":
    scene = PizzaShop(  )                        # Make the composite.
    scene.order('Homer')                    # Simulate Homer's order.
    print '...'
    scene.order('Shaggy')                   # Simulate Shaggy's order.

OOP and Delegation

Object-oriented programmers often talk about something called delegation, which usually implies controller objects that embed other objects, to which they pass off operation requests. The controllers can take care of administrative activities such as keeping track of accesses and so on. In Python, delegation is often implemented with the __getattr__ method hook; because it intercepts accesses to nonexistent attributes, a wrapper class can use __getattr__ to route arbitrary accesses to a wrapped object. Consider file trace.py, for instance:

class wrapper:
    def __init__(self, object):
        self.wrapped = object                        # Save object.
    def __getattr__(self, attrname):
        print 'Trace:', attrname                     # Trace fetch.
        return getattr(self.wrapped, attrname)       # Delegate fetch.

Recall that __getattr__ gets the attribute name as a string. This code makes use of the getattr built-in function to fetch an attribute from the wrapped object by name string—getattr(X,N) is like X.N, except that N is an expression that evaluates to a string at runtime, not a variable. In fact, getattr(X,N) is similar to X.__dict__[N], but the former also performs inheritance search like X.N (see Section 21.5.4).
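The difference between getattr and direct __dict__ indexing shows up as soon as an attribute is inherited rather than stored on the instance. A minimal sketch, with hypothetical class names:

```python
class Super:
    inherited = 'from Super'

class Sub(Super):
    def __init__(self):
        self.own = 'from instance'

X = Sub()
N = 'own'
by_getattr = getattr(X, N)               # same result as X.own
by_dict = X.__dict__[N]                  # the instance's own namespace

N = 'inherited'
found = getattr(X, N)                    # inheritance search, like X.inherited
in_dict = N in X.__dict__                # not stored on the instance itself
```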

You can use the approach of this module's wrapper class to manage access to any object with attributes—lists, dictionaries, and even classes and instances. Here, the class simply prints a trace message on each attribute access, and delegates the attribute request to the embedded wrapped object:

>>> from trace import wrapper
>>> x = wrapper([1,2,3])              # Wrap a list.
>>> x.append(4)                       # Delegate to list method.
Trace: append
>>> x.wrapped                         # Print my member.
[1, 2, 3, 4]
>>> x = wrapper({"a": 1, "b": 2})     # Wrap a dictionary.
>>> x.keys(  )                             # Delegate to dictionary method.
Trace: keys
['a', 'b']

We'll revive the notions of wrapped object and delegated operations as one way to extend built-in types in Chapter 23.

Multiple Inheritance

In the class statement, more than one superclass can be listed in parentheses in the header line. When you do this, you use something called multiple inheritance—the class and its instances inherit names from all listed superclasses.

When searching for an attribute, Python searches superclasses in the class header from left to right until a match is found. Technically, the search proceeds depth-first all the way to the top, and then left to right, since any of the superclasses may have superclasses of its own.
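Here is a small sketch of that search order, using hypothetical class names. The hierarchy is deliberately chosen so that no superclass is shared between the two branches; in that case the depth-first, left-to-right rule described here and the order used by Python 3 give the same answer:

```python
class Top:
    name = 'Top'

class Left(Top):            # Left inherits name from Top
    pass

class Right:
    name = 'Right'

class Bottom(Left, Right):  # Left is listed first in the header
    pass

found = Bottom().name       # Left's branch is searched to the top before Right

# Python 3 only: the search order Python computed for Bottom
order = [klass.__name__ for klass in Bottom.__mro__]
```

Listing Right first in the header of Bottom would make the lookup find Right's name instead.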

In general, multiple inheritance is good for modeling objects that belong to more than one set. For instance, a person may be an engineer, a writer, a musician, and so on, and inherit properties from all such sets.

Perhaps the most common way multiple inheritance is used is to "mix in" general-purpose methods from superclasses. Such superclasses are usually called mixin classes—they provide methods you add to application classes by inheritance. For instance, Python's default way to print a class instance object isn't incredibly useful:

>>> class Spam:
...     def __init__(self):               # No __repr__
...         self.data1 = "food"
...
>>> X = Spam(  )
>>> print X                                   # Default: class, address
<__main__.Spam instance at 0x00864818>

As seen in the previous section on operator overloading, you can provide a __repr__ method to implement a custom string representation of your own. But rather than code a __repr__ in each and every class you wish to print, why not code it once in a general-purpose tool class, and inherit it in all your classes?

That's what mixins are for. The following code, file mytools.py, defines a mixin class called Lister that overloads the __repr__ method for each class that includes Lister in its header line. It simply scans the instance's attribute dictionary (remember, it's exported in __dict__) to build up a string showing the names and values of all instance attributes. Since classes are objects, Lister's formatting logic can be used for instances of any subclass; it's a generic tool.
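The general shape of such a mixin can be sketched as follows. This is not the Lister class from mytools.py; AttrRepr and Point are hypothetical names, and the output format is our own choice:

```python
class AttrRepr:
    """Hypothetical mixin: builds __repr__ from the instance's __dict__."""
    def __repr__(self):
        pairs = ['%s=%r' % (name, value)
                 for name, value in sorted(self.__dict__.items())]
        return '<%s: %s>' % (self.__class__.__name__, ', '.join(pairs))

class Spam(AttrRepr):                 # mix in the display logic
    def __init__(self):
        self.data1 = 'food'

class Point(AttrRepr):                # same mixin, unrelated class
    def __init__(self, x, y):
        self.x = x
        self.y = y

spam_text = repr(Spam())
point_text = repr(Point(1, 2))
```

Because the mixin reads whatever it finds in self.__dict__, it needs no knowledge of the classes that inherit from it.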

Lister uses two special tricks to extract the instance's classname and address. Instances have a built-in

Classes Are Objects: Generic Object Factories

Because classes are objects, it's easy to pass them around a program, store them in data structures, and so on. You can also pass classes to functions that generate arbitrary kinds of objects; such functions are sometimes called factories in OOP design circles. They are a major undertaking in a strongly typed language such as C++, but almost trivial in Python: the apply function and syntax we met in Chapter 14 can call any class with any number of constructor arguments in one step, to generate any sort of instance:

def factory(aClass, *args):                 # varargs tuple
    return apply(aClass, args)              # Call aClass.
class Spam:
    def doit(self, message):
        print message
class Person:
    def __init__(self, name, job):
        self.name = name
        self.job  = job
object1 = factory(Spam)                      # Make a Spam.
object2 = factory(Person, "Guido", "guru")   # Make a Person.

In this code, we define an object generator function, called factory. It expects to be passed a class object (any class will do), along with zero or more arguments for the class's constructor. The function uses apply to call the class and return an instance.

The rest of the example simply defines two classes and generates instances of both by passing them to the factory function. And that's the only factory function you ever need to write in Python; it works for any class and any constructor arguments. One possible improvement worth noting: to support keyword arguments in constructor calls, the factory can collect them with a **kwargs argument and pass them as a third argument to apply:

def factory(aClass, *args, **kwargs):        # +kwargs dict
    return apply(aClass, args, kwargs)       # Call aClass.
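The apply built-in was removed in Python 3; the equivalent call syntax, aClass(*args, **kwargs), works in both Python 2 and Python 3. The sketch below restates the factory in that form. As an assumption made only to keep the result easy to inspect, doit returns its message rather than printing it:

```python
def factory(aClass, *args, **kwargs):
    return aClass(*args, **kwargs)       # same effect as apply(aClass, args, kwargs)

class Spam:
    def doit(self, message):
        return message

class Person:
    def __init__(self, name, job):
        self.name = name
        self.job = job

object1 = factory(Spam)                           # no constructor arguments
object2 = factory(Person, "Guido", job="guru")    # positional plus keyword
```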

By now, you should know that everything is an "object" in Python, even things like classes, which are just compiler input in languages like C++. However, as mentioned at the start of Part VI, only objects derived from classes are OOP objects in Python.

So what good is the factory function (besides giving us an excuse to illustrate class objects in this book)? Unfortunately, it's difficult to show you applications of this design pattern without listing much more code than we have space for here. In general, though, such a factory might allow code to be insulated from the details of dynamically configured object construction.

Methods Are Objects: Bound or Unbound

Methods are a kind of object, much like functions. Class methods can be accessed from either an instance or a class; because of this, they actually come in two flavors in Python:

Unbound class method objects: no self: Accessing a class's function attribute by qualifying a class returns an unbound method object. To call it, you must provide an instance object explicitly as its first argument.
Bound instance method objects: self + function pairs: Accessing a class's function attribute by qualifying an instance returns a bound method object. Python automatically packages the instance with the function in the bound method object, so we don't need to pass an instance to call the method.

Both kinds of methods are full-fledged objects; they can be passed around, stored in lists, and so on. Both also require an instance in their first argument when run (i.e., a value for self). This is why we've had to pass in an instance explicitly when calling superclass methods from subclass methods in the previous chapter; technically, such calls produce unbound method objects.

When calling a bound method object, Python provides an instance for you automatically—the instance used to create the bound method object. This means that bound method objects are usually interchangeable with simple function objects, and makes them especially useful for interfaces written originally for functions (see the sidebar on callbacks for a realistic example).

To illustrate, suppose we define the following class:

class Spam:
    def doit(self, message):
        print message

Now, in normal operation, we make an instance, and call its method in a single step to print the passed argument:

object1 = Spam(  )
object1.doit('hello world')

Really, though, a bound method object is generated along the way, just before the method call's parentheses. In fact, we can fetch a bound method without actually calling it. An object.name qualification is an object expression. In the following, it returns a bound method object that packages the instance (object1) with the method function (Spam.doit). We can assign the bound method to another name, and call it as though it were a simple function:
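A sketch along these lines, with doit returning its message instead of printing it (our assumption, so the results can be inspected), might look like this. The names x and t are arbitrary. Note that in Python 3, fetching Spam.doit from the class yields a plain function rather than an unbound method object, but it is called the same way, with the instance passed explicitly:

```python
class Spam:
    def doit(self, message):
        return message

object1 = Spam()
x = object1.doit                         # bound method: object1 + Spam.doit
bound_result = x('hello world')          # no instance needed in the call

t = Spam.doit                            # fetched from the class: no instance
class_result = t(object1, 'howdy')       # instance passed explicitly
```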

Documentation Strings Revisited

Chapter 11 covered docstrings in detail in our look at documentation sources and tools. Docstrings are string literals that show up at the top of various structures, and are saved by Python automatically in object __doc__ attributes. This works for module files, function defs, and classes and methods. Now that we know more about classes and methods, file docstr.py provides a quick but comprehensive example that summarizes the places where docstrings can show up in your code; all can be triple-quoted blocks:

"I am: docstr.__doc__"
class spam:
    "I am: spam.__doc__ or docstr.spam.__doc__"
    def method(self, arg):
        "I am: spam.method.__doc__ or self.method.__doc__"
        pass
def func(args):
    "I am: docstr.func.__doc__"
    pass

The main advantage of documentation strings is that they stick around at runtime; if it's been coded as a documentation string, you can qualify an object to fetch its documentation.

>>> import docstr
>>> docstr.__doc__
'I am: docstr.__doc__'
>>> docstr.spam.__doc__
'I am: spam.__doc__ or docstr.spam.__doc__'
>>> docstr.spam.method.__doc__
'I am: spam.method.__doc__ or self.method.__doc__'
>>> docstr.func.__doc__
'I am: docstr.func.__doc__'

The discussion of the PyDoc tool that knows how to format all these strings in reports appears in Chapter 11. Documentation strings are available at runtime, but they are also less flexible syntactically than # comments (which can appear anywhere in a program). Both forms are useful tools, and any program documentation is good (as long as it's accurate).

Because bound methods automatically pair an instance with a class method function, you can use them in any place that a simple function is expected. One of the most common places you'll see this idea put to work is in code that registers methods as event callback handlers in the Tkinter GUI interface. Here's the simple case:

def handler(  ):
    ...use globals for state...
...
widget = Button(text='spam', command=handler)

To register a handler for button click events, we usually pass a callable object that takes no arguments to the

Classes Versus Modules

Finally, let's wrap up this chapter by comparing the topics of this book's last two parts—modules and classes. Since they're both about namespaces, the distinction can sometimes be confusing. In short:

Modules

Are data/logic packages
Are created by writing Python files or C extensions
Are used by being imported

Classes

Implement new objects
Are created by class statements
Are used by being called
Always live within a module

Classes also support extra features modules don't, such as operator overloading, multiple instance generation, and inheritance. Although both are namespaces, we hope you can tell by now that they are very different things.

Chapter 23: Advanced Class Topics

Part VI concludes our look at OOP in Python by presenting a few more advanced class-related topics, along with the gotchas and exercises for this part of the book. We encourage you to do the exercises, to help cement the ideas we've studied. We also suggest working on or studying larger OOP Python projects as a supplement to this book. Like much in computing, the benefits of OOP tend to become more apparent with practice.

Besides implementing new kinds of objects, classes are sometimes used to extend the functionality of Python's built-in types in order to support more exotic data structures. For instance, to add queue insert and delete methods to lists, you can code classes that wrap (embed) a list object, and export insert and delete methods that process the list specially, like the delegation technique studied in Chapter 22. As of Python 2.2, you can also use inheritance to specialize built-in types. The next two sections show both techniques in action.

Remember those set functions we wrote in Part IV? Here's what they look like brought back to life as a Python class. The following example, file setwrapper.py, implements a new set object type, by moving some of the set functions to methods, and adding some basic operator overloading. For the most part, this class just wraps a Python list with extra set operations. Because it's a class, it also supports multiple instances and customization by inheritance in subclasses.

class Set:
   def __init__(self, value = [  ]):     # Constructor
       self.data = [  ]                 # Manages a list
       self.concat(value)
   def intersect(self, other):        # other is any sequence.
       res = [  ]                       # self is the subject.
       for x in self.data:
           if x in other:             # Pick common items.
               res.append(x)
       return Set(res)                # Return a new Set.
   def union(self, other):            # other is any sequence.
       res = self.data[:]             # Copy of my list
       for x in other:                # Add items in other.
           if not x in res:
               res.append(x)
       return Set(res)
   def concat(self, value):           # value: list, Set...
       for x in value:                # Removes duplicates
           if not x in self.data:
               self.data.append(x)
   def __len__(self):          return len(self.data)        # len(self)
   def __getitem__(self, key): return self.data[key]        # self[i]
   def __and__(self, other):   return self.intersect(other) # self & other
   def __or__(self, other):    return self.union(other)     # self | other
   def __repr__(self):         return 'Set:' + `self.data`  # Print

Extending Built-in Types

Pseudo-Private Class Attributes

In Part IV, we learned that every name assigned at the top level of a file is exported by a module. By default, the same holds for classes—data hiding is a convention, and clients may fetch or change any class or instance attribute they like. In fact, attributes are all "public" and "virtual" in C++ terms; they're all accessible everywhere and all looked up dynamically at runtime.

That's still true today. However, Python also includes the notion of name "mangling" (i.e., expansion), to localize some names in classes. This is sometimes misleadingly called private attributes—really, it's just a way to localize a name to the class that created it, and does not prevent access by code outside the class. That is, this feature is mostly intended to avoid namespace collisions in instances, not to restrict access to names in general.

Pseudo-private names are an advanced feature, entirely optional, and probably won't be very useful until you start writing large class hierarchies in multi-programmer projects. But because you may see this feature in other people's code, you need to be somewhat aware of it even if you don't use it in your own.

Here's how name mangling works. Names inside a class statement that start with two underscores (and don't end with two underscores) are automatically expanded to include the name of the enclosing class. For instance, a name like __X within a class named Spam is changed to _Spam__X automatically: a single underscore, the enclosing class's name, and the rest of the original name. Because the modified name is prefixed with the name of the enclosing class, it's somewhat unique; it won't clash with similar names created by other classes in a hierarchy.

Name mangling happens only in class statements and only for names you write with two leading underscores. Within a class, though, it happens to every name preceded with double underscores wherever they appear. This includes both method names and instance attributes. For example, an instance attribute reference self.__X is transformed to self._Spam__X
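The expansion rule described above can be tried out with a short sketch. It is written for a current Python 3 interpreter, and the class names Spam and Other are only illustrative; the point is that two classes in one hierarchy can each use __X without overwriting each other.

```python
# Illustrative sketch: Spam and Other are made-up class names.
class Spam:
    def __init__(self):
        self.__X = 'spam'          # Stored on the instance as _Spam__X

    def get(self):
        return self.__X            # Compiled as self._Spam__X


class Other(Spam):
    def __init__(self):
        Spam.__init__(self)
        self.__X = 'other'         # Stored as _Other__X, so no clash

    def get_other(self):
        return self.__X            # Compiled as self._Other__X


obj = Other()
names = sorted(obj.__dict__)       # Both expanded names live in one instance
print(names)
print(obj.get(), obj.get_other())
print(obj._Spam__X)                # Still reachable from outside the class
```

The last line shows why the feature is only pseudo-private: code that knows the expanded name can still fetch the attribute.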

Additional content appearing in this section has been removed.

"New Style" Classes in Python 2.2

In Release 2.2, Python introduced a new flavor of classes, known as "new style" classes; the classes covered so far in this part of the book are known as "classic classes" when comparing them to the new kind.

New style classes are only slightly different from classic classes, and the ways in which they differ are completely irrelevant to the vast majority of Python users. Moreover, the classic class model, which has been with Python for over a decade, still works exactly as we have described previously.

New style classes are almost completely backward-compatible with classic classes, in both syntax and behavior; they mostly just add a few advanced new features. However, because they modify one special case of inheritance, they had to be introduced as a distinct tool, so as to avoid impacting any existing code that depends on the prior behavior.

New style classes are coded with all the normal class syntax we have studied. The chief coding difference is that you subclass from a built-in type (e.g., list) to produce a new style class. A new built-in name, object, is provided to serve as a superclass for new style classes if no other built-in type is appropriate to use:

class newstyle(object):
    ...normal code...

More generally, any class derived from object or another built-in type is automatically treated as a new style class. By derived, we mean that this includes subclasses of object, subclasses of subclasses of object, and so on, as long as a built-in is somewhere in the superclass tree. Classes not derived from built-ins are considered classic.

Perhaps the most visible change in new style classes is their slightly different treatment of inheritance for the so-called diamond pattern of multiple inheritance trees—where more than one superclass leads to the same higher superclass further above. The diamond pattern is an advanced design concept, which we have not even discussed for normal classes.

In short, with classic classes inheritance search is strictly depth first, and then left to right—Python climbs all the way to the top before it begins to back up and look to the right in the tree. In new style classes, the search is more breadth-first in such cases—Python chooses a closer superclass to the right before ascending all the way to the common superclass at the top. Because of this change, lower superclasses can overload attributes of higher superclasses, regardless of the sort of multiple inheritance trees they are mixed into.
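As a rough illustration of the diamond case, the sketch below uses four made-up classes, A through D. It assumes a Python 3 interpreter, where every class derives from object and so follows the new style search; a classic class tree with the same shape would have found A's version of attr first.

```python
# Illustrative sketch: A, B, C, and D are made-up names.
class A:
    attr = 'A'

class B(A):
    pass                 # B does not redefine attr

class C(A):
    attr = 'C'           # C overrides the attr it inherits from A

class D(B, C):
    pass                 # Diamond: D -> B and C -> A

# __mro__ records the order in which classes are searched.
order = [cls.__name__ for cls in D.__mro__]
print(order)
print(D().attr)          # Found in C before the search reaches A
```

Because C is consulted before the common superclass A, the override in C wins even though B is listed first.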

Additional content appearing in this section has been removed.

Class Gotchas

Most class issues can usually be boiled down to namespace issues (which makes sense, given that classes are just namespaces with a few extra tricks). Some of the topics in this section are more like case studies of advanced class usage than problems, and one or two of these have been eased by recent Python releases.

Theoretically speaking, classes (and class instances) are all mutable objects. Just as with built-in lists and dictionaries, they can be changed in place, by assigning to their attributes. And like lists and dictionaries, this also means that changing a class or instance object may impact multiple references to it.

That's usually what we want (and is how objects change their state in general), but this becomes especially critical to know when changing class attributes. Because all instances generated from a class share the class's namespace, any changes at the class level are reflected in all instances, unless they have their own versions of changed class attributes.

Since classes, modules, and instances are all just objects with attribute namespaces, you can normally change their attributes at runtime by assignments. Consider the following class; inside the class body, the assignment to name a generates an attribute X.a, which lives in the class object at runtime and will be inherited by all of X's instances:

>>> class X:
...     a = 1        # Class attribute
...
>>> I = X()
>>> I.a              # Inherited by instance
1
>>> X.a
1

So far so good—this is the normal case. But notice what happens when we change the class attribute dynamically outside the class statement: it also changes the attribute in every object that inherits from the class. Moreover, new instances created from the class during this session or program get the dynamically set value, regardless of what the class's source code says:

>>> X.a = 2          # May change more than X
>>> I.a              # I changes too.
2
>>> J = X()          # J inherits from X's runtime values
>>> J.a              # (but assigning to J.a changes a in J, not X or I).
2
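The parenthetical comment in the session above can be checked with a short script. This is only a sketch, run under Python 3, that repeats the same class and then assigns to J.a to show that the assignment creates an attribute in J alone.

```python
class X:
    a = 1                # Class attribute, shared by all instances

I = X()
J = X()
X.a = 2                  # Change at the class level: seen through I and J
J.a = 3                  # Assignment creates a new attribute in J only

print(X.a, I.a, J.a)
print('a' in I.__dict__, 'a' in J.__dict__)

X.a = 4                  # I still tracks the class; J keeps its own value
print(I.a, J.a)
```

I never gets an attribute of its own, so it keeps reflecting whatever the class currently holds.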

Is this a useful feature or a dangerous trap? You be the judge, but you can actually get work done by changing class attributes, without ever making a single instance. This technique can simulate "records" or "structs" in other languages. As a refresher on this technique, consider the following unusual but legal Python program:

Additional content appearing in this section has been removed.

Part VI Exercises

These exercises ask you to write a few classes and experiment with some existing code. Of course, the problem with existing code is that it must be existing. To work with the set class in exercise 5, either pull down the class source code off the Internet (see Preface) or type it up by hand (it's fairly small). These programs are starting to get more sophisticated, so be sure to check the solutions at the end of the book for pointers.

See Section B.6 for the solutions.

1. Inheritance. Write a class called Adder that exports a method add(self, x, y) that prints a "Not Implemented" message. Then define two subclasses of Adder that implement the add method:

ListAdder: With an add method that returns the concatenation of its two list arguments
DictAdder: With an add method that returns a new dictionary with the items in both of its dictionary arguments (any definition of addition will do)

Experiment by making instances of all three of your classes interactively and calling their add methods.

Now, extend your Adder superclass to save an object in the instance with a constructor (e.g., assign self.data a list or a dictionary) and overload the + operator with an __add__ to automatically dispatch to your add methods (e.g., X+Y triggers X.add(X.data,Y)). Where is the best place to put the constructors and operator overload methods (i.e., in which classes)? What sorts of objects can you add to your class instances?

In practice, you might find it easier to code your add methods to accept just one real argument (e.g., add(self,y)), and add that one argument to the instance's current data (e.g., self.data+y). Does this make more sense than passing two arguments to add? Would you say this makes your classes more "object-oriented"?

2. Operator overloading. Write a class called Mylist that shadows ("wraps") a Python list: it should overload most list operators and operations including +, indexing, iteration, slicing, and list methods such as

Additional content appearing in this section has been removed.

Chapter 24: Exception Basics

Part VII deals with exceptions, which are events that can modify the flow of control through a program. In Python, exceptions are triggered automatically on errors, and can be both triggered and intercepted by your code. They are processed by three statements we'll study in this part, the first of which has two variations:

try/except: Catch and recover from exceptions raised by Python, or by you.
try/finally: Perform cleanup actions whether exceptions occur or not.
raise: Trigger an exception manually in your code.
assert: Conditionally trigger an exception in your code.

With a few exceptions (pun intended), we'll find that exception handling is simple in Python, because it's integrated into the language itself as another high-level tool.

In a nutshell, exceptions let us jump out of arbitrarily large chunks of a program. Consider the pizza-making robot we talked about earlier in the book. Suppose we took the idea seriously and actually built such a machine. To make a pizza, our culinary automaton would need to execute a plan, which we implement as a Python program. It would take an order, prepare the dough, add toppings, bake the pie, and so on.

Now, suppose that something goes very wrong during the "bake the pie" step. Perhaps the oven is broken. Or perhaps our robot miscalculates its reach and spontaneously bursts into flames. Clearly, we want to be able to jump to code that handles such states quickly. Since we have no hope of finishing the pizza task in such unusual cases, we might as well abandon the entire plan.

That's exactly what exceptions let you do; you can jump to an exception handler in a single step, abandoning all suspended function calls. They're a sort of structured "super-goto." An exception handler (try statement) leaves a marker and executes some code. Somewhere further ahead in the program, an exception is raised that makes Python jump back to the marker immediately, without resuming any active functions that were called since the marker was left. Code in the exception handler can respond to the raised exception as appropriate (calling the fire department, for instance). Moreover, because Python jumps to the handler statement immediately, there is usually no need to check status codes after every call to a function that could possibly fail.
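To make the "super-goto" idea concrete, here is a small sketch of the pizza plan. The function names and the OvenFailure class are hypothetical, invented for this illustration, and the code targets Python 3. The log list records which steps actually ran.

```python
class OvenFailure(Exception):
    """Hypothetical exception used only in this sketch."""

log = []

def bake():
    log.append('bake started')
    raise OvenFailure('oven is broken')
    log.append('bake finished')          # Never reached

def make_pizza():
    log.append('dough prepared')
    bake()
    log.append('pizza boxed')            # Skipped: the exception unwinds past here

def take_order():
    make_pizza()
    log.append('order delivered')        # Skipped as well

try:                                     # The handler leaves its marker here
    take_order()
except OvenFailure as exc:
    log.append('handled: %s' % exc)      # Control lands here in a single step

print(log)
```

None of the suspended functions resumes after the raise, and none of them had to check a status code to find out that baking failed.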

Additional content appearing in this section has been removed.

Why Use Exceptions?

In Python programs, exceptions are typically used for a variety of purposes. Here are some of their most common roles:

Error handling: Python raises exceptions whenever it detects errors in programs at runtime. You can either catch and respond to the errors in your code, or ignore the exception. If the error is ignored, Python's default exception-handling behavior kicks in—it stops the program and prints an error message. If you don't want this default behavior, code a try statement to catch and recover from the exception—Python jumps to your

Additional content appearing in this section has been removed.

Exception Handling: The Short Story

Compared to some other core language topics we've met, exceptions are a fairly light-weight tool in Python. Because they are so simple, let's jump right into an initial example. Suppose you've coded the following function:

>>> def fetcher(obj, index):
...     return obj[index]

There's not much to this function—it simply indexes an object on a passed-in index. In normal operation, it returns the result of legal indexes:

>>> x = 'spam'
>>> fetcher(x, 3)           # Like x[3]
'm'

However, if you ask this function to index off the end of your string, you will trigger an exception when your function tries to run obj[index]. Python detects out-of-bounds sequence indexing, and reports it by raising (triggering) the built-in IndexError exception:

>>> fetcher(x, 4)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 2, in fetcher
IndexError: string index out of range

Technically, because this exception is not caught by your code, it reaches the top level of the program and invokes the default exception handler, which simply prints the standard error message. By this point in the book, you've probably seen your share of standard error messages. They include the exception that was raised, along with a stack trace: a list of the lines and functions active when the exception occurred. When coding interactively, the file is just "stdin" (standard input stream) or "pyshell" (in IDLE), so file line numbers are not very meaningful here.

In a more realistic program launched outside the interactive prompt, the default handler at the top also terminates the program immediately. That course of action makes sense for simple scripts; errors often should be fatal, and the best you can do is inspect the standard error message. Sometimes this isn't what you want, though. Server programs, for instance, typically need to remain active even after internal errors. If you don't want the default exception behavior, wrap the call in a try statement to catch the exception yourself:

>>> try:
...     fetcher(x, 4)
... except IndexError:

Additional content appearing in this section has been removed.

The try/except/else Statement

The try is another compound statement; its most complete form is sketched below. It starts with a try header line followed by a block of (usually) indented statements, then one or more except clauses that identify exceptions to be caught, and an optional else clause at the end. The words try, except, and else are associated by indenting them the same—they line up vertically. For reference, here's the general format:

try:
    <statements>         # Run this action first.
except <name1>:
    <statements>         # Run if name1 is raised during try block.
except <name2>, <data>:
    <statements>         # Run if name2 is raised, and get extra data.
except (<name3>, <name4>):
    <statements>         # Run if any of these exceptions occur.
except:
    <statements>         # Run for all (other) exceptions raised.
else:
    <statements>         # Run if no exception was raised by try block.

In this statement, the block under the try header represents the main action of the statement: the code you're trying to run. The except clauses define handlers for exceptions raised during the try block, and the else clause (if coded) provides a handler to be run if no exception occurs. The <data> entry here has to do with a feature of raise statements we will discuss later in this chapter.

Here's how try statements work. When a try statement is started, Python marks the current program context, so it can come back if an exception occurs. The statements nested under the try header are run first. What happens next depends on whether exceptions are raised while the try block's statements are running:

If an exception occurs while the try block's statements are running, Python jumps back to the try and runs the statements under the first except clause that matches the raised exception. Control continues past the entire try statement after the except block runs (unless the except block raises another exception).
If an exception happens in the try block and no except clause matches, the exception is propagated up to a try that was entered earlier in the program, or to the top level of the process (which makes Python kill the program and print a default error message).
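The rules above can be exercised with a small sketch. The function names classify and unmatched are hypothetical, and the code is written for Python 3, which spells the extra-data clause as except name as data rather than except name, data.

```python
def classify(obj, index):
    try:
        value = obj[index]                 # Main action
    except IndexError:
        return 'index out of range'        # First matching clause runs
    except (KeyError, TypeError) as exc:
        return 'bad key or type: %s' % type(exc).__name__
    else:
        return 'got %r' % (value,)         # Runs only if nothing was raised

def unmatched():
    try:
        1 / 0                              # Raises ZeroDivisionError
    except IndexError:
        return 'not reached'               # No match, so this is skipped

try:
    unmatched()
except ZeroDivisionError:
    outcome = 'propagated to the outer try'

print(classify('spam', 3))
print(classify('spam', 4))
print(classify({'a': 1}, 'b'))
print(outcome)
```

The last case shows an exception passing through a try whose except clauses do not match and being caught by a try that was entered earlier.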

Additional content appearing in this section has been removed.

The try/finally Statement

The other flavor of the try statement is a specialization and has to do with finalization actions. If a finally clause is used in a try, its block of statements is always run by Python "on the way out," whether an exception occurred while the try block was running or not. Its general form:

try:
    <statements>        # Run this action first.
finally:
    <statements>        # Always run this code on the way out.

Here's how this variant works. Python begins by running the statement block associated with the try header line first. The remaining behavior of this statement depends on whether an exception occurs during the try block or not:

If no exception occurs while the try block is running, Python jumps back to run the finally block, and then continues execution past the entire try statement.
If an exception does occur during the try block's run, Python comes back and runs the finally block, but then propagates the exception to a higher try or the top-level default handler; the program does not resume execution past the try statement.

The try/finally form is useful when you want to be completely sure that an action happens after some code runs, regardless of the exception behavior of the program. Note that the finally clause cannot be used in the same try statement as except and else, so it is best thought of as a distinct statement form.
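Both cases can be traced with a short sketch. The helper names run and fail are made up for this illustration, and the code assumes Python 3. The events list records the order in which things happen.

```python
events = []

def run(action):
    try:
        action()
        events.append('try block finished')
    finally:
        events.append('finally ran')       # Runs in both cases
    events.append('continued past try')    # Reached only without an exception

def fail():
    raise ValueError('boom')

run(lambda: None)                          # Case 1: no exception

try:
    run(fail)                              # Case 2: exception in the try block
except ValueError:
    events.append('caught higher up')      # finally did not stop it

print(events)
```

In the second call the finally block still runs, but the exception keeps propagating, so the line after the try statement inside run is never reached.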

We saw simple try/finally examples earlier. Here's a more realistic example that illustrates a typical role for this statement:

MyError = "my error"
def stuff(file):
    raise MyError
file = open('data', 'r')     # Open an existing file.
try:
    stuff(file)              # Raises exception
finally:
    file.close()             # Always close file.
...                          # Continue here if no exception.

In this code, we've wrapped a call to a file-processing function in a try with a finally clause, to make sure that the file is always closed, whether the function triggers an exception or not.

This particular example's function isn't all that useful (it just raises an exception), but wrapping calls in

Additional content appearing in this section has been removed.

The raise Statement

To trigger exceptions explicitly, you code raise statements. Their general form is simple—the word raise, optionally followed by both the name of the exception to be raised and an extra data item to pass with the exception:

raise <name>            # Manually trigger an exception.
raise <name>, <data>    # Pass extra data to catcher too.
raise                   # Reraise the most recent exception.

The second form allows you to pass an extra data item along with the exception, to provide details for the handler. In the raise, the data is listed after the exception name; back in the try statement, the data is obtained by including a variable to receive it. For instance, in except name,X:, X will be assigned the extra data item listed at the raise. The third raise form simply reraises the current exception; it's handy if you want to propagate an exception you've caught to another handler.

So what's an exception name? It might be the name of a built-in exception from the built-in scope (e.g., IndexError), or the name of an arbitrary string object you've assigned in your program. It can also reference a user-defined class or class instance—a possibility that further generalizes raise statement formats. We'll postpone the details of this generalization until after we have a chance to study class exceptions in the next chapter.

Regardless of how you name exceptions, they are always identified by normal objects, and at most one is active at any given time. Once caught by an except clause anywhere in the program, an exception dies (won't propagate to another try), unless reraised by another raise statement or error.
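Here is a rough sketch of passing extra data and of reraising. It is written for Python 3, where the extra data travels inside an exception instance instead of after a comma, so it is an adaptation of the forms above rather than a copy of them. The class MyBad and the function names are hypothetical.

```python
class MyBad(Exception):
    """Hypothetical exception used only in this sketch."""

seen = []

def stuff():
    raise MyBad('oops', 42)        # Extra data is carried by the instance

def middle():
    try:
        stuff()
    except MyBad as exc:
        seen.append(exc.args)      # Handler receives the extra data
        raise                      # Reraise the exception just caught

try:
    middle()
except MyBad:
    seen.append('outer')           # The reraised exception arrives here

print(seen)
```

Without the bare raise in middle, the exception would have died in the first handler and the outer try would never have seen it.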

Python programs can trigger both built-in and user-defined exceptions, using the raise statement. In their simplest form, user-defined exceptions are string objects, like the one that variable MyBad is assigned to in the following:

MyBad = "oops"
def stuff():
    raise MyBad              # Trigger exception manually.
try:
    stuff()                  # Raises exception
except MyBad:
    print 'got it'           # Handle exception here.
...                          # Resume execution here.

Additional content appearing in this section has been removed.

The assert Statement

As a somewhat special case, Python includes the assert statement. It is mostly syntactic shorthand for a common raise usage pattern, and can be thought of as a conditional raise statement. A statement of the form:

assert <test>, <data>          # The <data> part is optional.

works like the following code:

if __debug__:
    if not <test>:
        raise AssertionError, <data>

In other words, if the test evaluates to false, Python raises an exception, with the data item as the exception's extra data (if provided). Like all exceptions, the assertion error exception raised will kill your program if not caught with a try.

As an added feature, assert statements may also be removed from the compiled program's byte code if the -O Python command-line flag is used, thereby optimizing the program. AssertionError is a built-in exception, and the __debug__ flag is a built-in name that is automatically set to 1 (true) unless the -O flag is used.
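The equivalence can be checked side by side. In this sketch, f uses assert and f_expanded spells out the same logic by hand; the helper outcome is a made-up name that reports what each call did. The code assumes Python 3 and a run without the -O flag.

```python
def f(x):
    assert x < 0, 'x must be negative'
    return x ** 2

def f_expanded(x):
    if __debug__:                                   # False when run with -O
        if not x < 0:
            raise AssertionError('x must be negative')
    return x ** 2

def outcome(func, arg):
    try:
        return func(arg)
    except AssertionError as exc:
        return 'AssertionError: %s' % exc

print(outcome(f, -3), outcome(f_expanded, -3))      # Test passes in both
print(outcome(f, 1))
print(outcome(f_expanded, 1))
```

Under -O both checks disappear, and both functions would simply return the square of their argument.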

Assertions are typically used to verify program conditions during development. When displayed, their error message text automatically includes source code line information, and the value you listed in the assert statement. Consider asserter.py:

def f(x):
    assert x < 0, 'x must be negative'
    return x ** 2

% python
>>> import asserter
>>> asserter.f(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "asserter.py", line 2, in f
    assert x < 0, 'x must be negative'
AssertionError: x must be negative

It's important to keep in mind that assert is mostly intended for trapping user-defined constraints, not for catching genuine programming errors. Because Python traps programming errors itself, there is usually no need to code asserts to catch things like out-of-bounds indexes, type mismatches, and zero divides:

def reciprocal(x):
    assert x != 0     # a useless assert!
    return 1 / x      # python checks for zero automatically

Such asserts are generally superfluous. Because Python raises exceptions on errors automatically, you might as well let Python do the job for you. For another example of

Additional content appearing in this section has been removed.

Chapter 25: Exception Objects

So far, we've been deliberately vague about what an exception actually is. Python generalizes the notion of exceptions—they may be identified by either string or class objects. Both have merits, but classes tend to provide a better solution when it comes to maintaining exception hierarchies.

In all the examples we've seen up to this point, user-defined exceptions have been strings. This is the simpler way to code an exception—any string value can be used to identify an exception:

>>> myexc = "My exception string"
>>> try:
...     raise myexc
... except myexc:
...     print 'caught'
...
caught

Technically, the exception is identified by the string object, not the string value—you must use the same variable (i.e., reference) to raise and catch the exception (we'll expand on this idea in a gotcha at the conclusion of Part VII). Here, the exception name myexc is just a normal variable—it can be imported from a module, and so on. The text of the string is almost irrelevant, except that it shows up in standard error messages:

>>> raise myexc
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
My exception string

The text of the string exception here is printed as the exception message. If your string exceptions may print like this, you'll want to use more meaningful text than most of the examples shown in this book.

Strings are a simple way to define your exceptions. Exceptions may also be identified with classes. Like some other topics we've met in this book, class exceptions are an advanced topic you can choose to use or not in Python 2.2. However, classes have some added value that merits a quick look; in particular, they allow us to identify exception categories that are more flexible to use and maintain than simple strings. Moreover, classes are likely to become the prescribed way to identify your exceptions in the future.

The chief difference between string and class exceptions has to do with the way that exceptions raised are matched against

Additional content appearing in this section has been removed.

String-Based Exceptions

In all the examples we've seen up to this point, user-defined exceptions have been strings. This is the simpler way to code an exception—any string value can be used to identify an exception:

>>> myexc = "My exception string"
>>> try:
...     raise myexc
... except myexc:
...     print 'caught'
...
caught

>>> raise myexc
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
My exception string

Additional content appearing in this section has been removed.

Class-Based Exceptions

The chief difference between string and class exceptions has to do with the way that exceptions raised are matched against except clauses in try statements:

String exceptions are matched by simple object identity: the raised exception is matched to except clauses by Python's is test (not ==).
Class exceptions are matched by superclass relationships: the raised exception matches an except clause, if that except clause names the exception's class or any superclass of it.

That is, when a try statement's except clause lists a superclass, it catches instances of that superclass, as well as instances of all its subclasses lower in the class tree. The net effect is that class exceptions support the construction of exception hierarchies: superclasses become category names, and subclasses become specific kinds of exceptions within a category. By naming a general exception superclass, an except clause can catch an entire category of exceptions—any more specific subclass will match.

Let's look at an example to see how class exceptions work in code. In the following file, classexc.py, we define a superclass General and two subclasses of it called Specific1 and Specific2. We're illustrating the notion of exception categories here: General is a category name, and its two subclasses are specific types of exceptions within the category. Handlers that catch General will also catch any subclasses of it, including Specific1 and Specific2.

class General:            pass
class Specific1(General): pass
class Specific2(General): pass
def raiser0():
    X = General()          # Raise superclass instance.
    raise X
def raiser1():
    X = Specific1()        # Raise subclass instance.
    raise X
def raiser2():
    X = Specific2()        # Raise different subclass instance.
    raise X
for func in (raiser0, raiser1, raiser2):
    try:
        func()
    except General:        # Match General or any subclass of it.
        import sys
        print 'caught:', sys.exc_type
C:\python>

Additional content appearing in this section has been removed.

General raise Statement Forms

With the addition of class-based exceptions, the raise statement can take the following five forms: the first two raise string exceptions, the next two raise class exceptions, and the last reraises the current exception (useful if you need to propagate an arbitrary exception).

raise string            # Matches except with same string object
raise string, data      # Pass optional extra data (default=None).
raise instance          # Same as: raise instance.__class__, instance
raise class, instance   # Matches except with this class or its superclass
raise                   # Reraise the current exception.

For class-based exceptions, Python always requires an instance of the class. Raising an instance really raises the instance's class; the instance is passed along with the class as the extra data item (it's a good place to store information for the handler). For backward compatibility with Python versions in which built-in exceptions were strings, you can also use these forms of the raise statement:

raise class                     # Same as: raise class()
raise class, arg                # Same as: raise class(arg)
raise class, (arg, arg,...)     # Same as: raise class(arg, arg,...)

These are all the same as saying raise class(arg...), and therefore the same as the raise instance form above. Specifically, if you list a class instead of an instance, and the extra data item is not an instance of the class listed, Python automatically calls the class with the extra data items as constructor arguments to create and raise an instance for you.

For example, you may raise an instance of the built-in KeyError exception by saying simply raise KeyError, even though KeyError is now a class; Python calls KeyError to make an instance along the way. In fact, you can raise KeyError, and any other class-based exception, in a variety of ways:

raise KeyError()                # Normal: raise an instance
raise KeyError, KeyError()      # Class, instance: uses instance
raise KeyError                  # Class: instance will be generated
raise KeyError, "bad spam"      # Class, arg: instance is generated
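Two of these forms can still be tried on a current interpreter. The sketch below assumes Python 3, which accepts raise class and raise instance but rejects the comma forms as syntax errors; the helper names catch, raise_class, and raise_instance are made up for this illustration.

```python
def catch(raiser):
    try:
        raiser()
    except KeyError as exc:
        return exc                       # Hand back the instance that was raised

def raise_class():
    raise KeyError                       # Class: Python creates the instance

def raise_instance():
    raise KeyError('bad spam')           # Instance built explicitly with an argument

a = catch(raise_class)
b = catch(raise_instance)
print(type(a).__name__, a.args)
print(type(b).__name__, b.args)
```

In both cases the handler receives a KeyError instance; the only difference is whether constructor arguments were supplied.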

Additional content appearing in this section has been removed.

Chapter 26: Designing with Exceptions

This chapter rounds out Part VII, with a collection of exception design topics and examples, followed by this part's gotchas and exercises. Because this chapter also closes out the core language material of this book, it also includes a brief overview of development tools, by way of migration to the rest of this book.

Our examples so far have used only a single try to catch exceptions, but what happens if one try is physically nested inside another? For that matter, what does it mean if a try calls a function that runs another try? Technically, try statements can nest in terms of both syntax and the runtime control flow through your code.

Both these cases can be understood if you realize that Python stacks try statements at runtime. When an exception is raised, Python returns to the most recently entered try statement with a matching except clause. Since each try statement leaves a marker, Python can jump back to earlier trys by inspecting the markers stacked. This nesting of active handlers is what we mean by "higher" handlers—try statements entered earlier in the program's execution flow.

For example, Figure 26-1 illustrates what occurs when try/except statements nest at runtime. Because the amount of code that can go into a try clause block can be substantial (e.g., function calls), it will typically invoke other code that may be watching for the same exception. When the exception is eventually raised, Python jumps back to the most recently entered try statement that names that exception, runs that statement's except clauses, and then resumes after that try.

Figure 26-1: nested try/except

Once the exception is caught, its life is over: control does not jump back to all matching trys that name the exception, just the most recently entered one. In Figure 26-1, for instance, the raise in function func2 sends control back to the handler in func1, and then the program continues within func1.
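The runtime nesting described here can be sketched as follows, in Python 3 syntax. The function names func1 and func2 follow the description of Figure 26-1; the events list and the outer try are assumptions added only to record the order of control flow.

```python
events = []  # Records the order in which control passes through the code


def func2():
    events.append("func2: raising")
    raise IndexError("out of range")


def func1():
    try:
        func2()
    except IndexError:
        events.append("func1: caught")    # Most recently entered matching try
    events.append("func1: continuing")    # Execution resumes after that try


try:
    func1()
except IndexError:
    events.append("top: caught")          # Never runs: func1 already caught it
else:
    events.append("top: no exception")

print(events)
```

The outer handler never sees the exception, because the handler in func1 was entered more recently and caught it first.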



Nesting Exception Handlers



By contrast, when try/finally statements are used, control runs the finally block on exceptions, but then continues propagating the exception to other trys, or to the top-level default handler (standard error message printer). As Figure 26-2 illustrates, the finally clauses do not kill the exception—they just specify code to be run on the way out, during the exception propagation process. If there are many try/finally clauses active when an exception occurs, they will
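A minimal sketch of this propagation, in Python 3 syntax, is shown here. The names inner, outer, and events are hypothetical; the point is only that each finally block runs on the way out while the exception keeps travelling until an except clause catches it.

```python
events = []


def inner():
    try:
        raise ValueError("oops")
    finally:
        events.append("inner finally")    # Runs, but does not stop the exception


def outer():
    try:
        inner()
    finally:
        events.append("outer finally")    # Also runs on the way out


try:
    outer()
except ValueError as exc:
    events.append("handled: %s" % exc)    # The exception finally ends here

print(events)
```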


Exception Idioms


We've seen the mechanics behind exceptions. Now, let's take a look at some of the other ways they are typically used.

In Python, all errors are exceptions, but not all exceptions are errors. For instance, we saw in Chapter 7 that file object read methods return empty strings at the end of a file. The built-in raw_input function that we first met in Chapter 3, and deployed in an interactive loop in Chapter 10, reads a line of text from the standard input stream (sys.stdin). Unlike file methods, raw_input raises the built-in EOFError at end of file, instead of returning an empty string (an empty string from raw_input means an empty line).

Despite its name, the EOFError exception is just a signal in this context, not an error. Because of this behavior, unless end-of-file should terminate a script, raw_input often appears wrapped in a try handler and nested in a loop, as in the following code.

while 1:
    try:
        line = raw_input(  )     # Read line from stdin.
    except EOFError:
        break                    # Exit loop at end of file
    else:
        ...process next line here...
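For readers on Python 3, where raw_input has been renamed input, the same loop can be sketched as below. The function read_all and the stand-in reader make_reader are hypothetical names, introduced so that the loop can be exercised without a real input stream.

```python
def read_all(read_line=input):
    """Collect lines until read_line raises EOFError."""
    lines = []
    while True:
        try:
            line = read_line()
        except EOFError:
            break                  # End of input: a signal, not an error
        else:
            lines.append(line)     # An empty string is just an empty line
    return lines


def make_reader(items):
    """Return a stand-in for input() that raises EOFError when items run out."""
    iterator = iter(items)

    def reader():
        try:
            return next(iterator)
        except StopIteration:
            raise EOFError from None

    return reader


print(read_all(make_reader(["spam", "", "eggs"])))
```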

Other built-in exceptions are similarly signals, not errors. Python also has a set of built-in exceptions that represent warnings, rather than errors. Some of these are used to signal use of deprecated (phased out) language features. See the standard library manual's description of built-in exceptions and the warnings module for more on warnings.

User-defined exceptions can also signal nonerror conditions. For instance, a search routine can be coded to raise an exception when a match is found, instead of returning a status flag that must be interpreted by the caller. In the following, the try/except/else exception handler does the work of an if/else return value tester:

Found = "Item found"
def searcher(  ):
    if ...success...:
        raise Found
    else:
        return
try:
    searcher(  )
except Found:              # Exception if item was found
    ...success...
else:                      # else returned: not found
    ...failure...
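The listing above uses a string exception, which Python 3 no longer supports. A comparable sketch with a class-based exception, in Python 3 syntax, might look like this; the parameters items and target and the wrapper search_status are assumptions added to make the example runnable.

```python
class Found(Exception):
    """Signals success; not an error."""


def searcher(items, target):
    for item in items:
        if item == target:
            raise Found(item)      # Success is reported by raising
    return None                    # Falling off the end means "not found"


def search_status(items, target):
    try:
        searcher(items, target)
    except Found:                  # Exception if item was found
        return "success"
    else:                          # else: searcher returned, so not found
        return "failure"


print(search_status(["spam", "eggs"], "eggs"))
print(search_status(["spam", "eggs"], "ham"))
```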


Exception Design Tips


By and large, exceptions are easy to use in Python. The real art behind them is deciding how specific or general your except clauses should be, and how much code to wrap up in try statements. Let's address the second of these first.

In principle, you could wrap every statement in your script in its own try, but that would just be silly (the try statements would then need to be wrapped in try statements!). This is really a design issue that goes beyond the language itself, and becomes more apparent with use. But here are a few rules of thumb:

Operations that commonly fail are generally wrapped in try statements. For example, things that interface with system state, such as file opens, socket calls, and the like, are prime candidates for try.
However, there are exceptions to the prior rule—in simple scripts, you may want failures of such operations to kill your program, instead of being caught and ignored. This is especially true if the failure is a show-stopper. Failure in Python means a useful error message (not a hard crash), and this is often the best outcome you could hope for.
Implement termination actions in try/finally statements, in order to guarantee their execution. This statement form allows you to run code whether exceptions happen or not.
It is sometimes more convenient to wrap the call to a large function in a single try statement, rather than littering the function itself with many try statements. That way, all exceptions in the function percolate up to the try around the call, and you reduce the amount of code within the function.

On to the issue of handler generality. Because Python lets you pick and choose which exceptions to catch, you sometimes have to be careful to not be too inclusive. For example, you've seen that an empty except clause catches every exception that might be raised while the code in the try block runs.

That's easy to code and sometimes desirable, but you may also wind up intercepting an error that's expected by a try handler higher up in the exception nesting structure. For example, an exception handler such as the following catches and stops every exception that reaches it—whether or not another handler is waiting for it:
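The book's own listing is not part of this preview. As a stand-in, here is a minimal sketch in Python 3 syntax of how an empty except can intercept something an outer handler was waiting for; the names func and log are hypothetical.

```python
import sys

log = []


def func():
    try:
        sys.exit(40)               # Raises SystemExit to end the program
    except:                        # Empty except: catches everything, even exits
        log.append("swallowed")


try:
    func()
except SystemExit:
    log.append("exiting")          # Never reached: func stopped the exception
log.append("still running")

print(log)
```

In Python 3, writing except Exception: instead would let SystemExit pass through, because SystemExit derives from BaseException rather than Exception.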


Exception Gotchas


There isn't much to trip over with exceptions, but here are two general pointers on use, one of which summarizes concepts we've already met.

When an exception is raised (by you or by Python itself), Python searches for the most recently entered try statement with a matching except clause, where matching means the same string object, the same class, or a superclass of the raised class. It's important to notice that matching is performed by identity, not equality. For instance, suppose we define two string objects we want to raise as exceptions:

>>> ex1 = 'Error: Spam Exception'
>>> ex2 = 'Error: Spam Exception'
>>>
>>> ex1 == ex2, ex1 is ex2
(1, 0)

Applying the == test returns true (1) because they have equal values, but is returns false (0) since they are two distinct string objects in memory. Now, an except clause that names the same string object will always match:

>>> try:
...    raise ex1
... except ex1:
...    print 'got it'
...
got it

But one that lists an equal value, but not an identical object, will fail (assuming the string values are long enough to defeat Python's string object caching mechanism, which is described in Chapter 4 and Chapter 7):

>>> try:
...    raise ex1
... except ex2:
...    print 'Got it'
...
Traceback (innermost last):
  File "<stdin>", line 2, in ?
    raise ex1
Error: Spam Exception

Here, the exception isn't caught, so Python climbs to the top level of the process and prints a stack trace and the exception's text automatically. For strings, be sure to use the same object in the raise and the try. For class exceptions, the behavior is similar, but Python generalizes the notion of exception matching to include superclass relationships.
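String exceptions cannot be raised in Python 3, so the identity pitfall above applies only to older code. The superclass matching mentioned for class exceptions can be sketched as follows, in Python 3 syntax; the function name classify is hypothetical.

```python
def classify(exc):
    """Return a label for the except clause that catches exc."""
    try:
        raise exc
    except LookupError:            # Superclass of both IndexError and KeyError
        return "lookup"
    except Exception:              # Broader category for everything else
        return "other"


print(classify(KeyError("spam")))
print(classify(IndexError("eggs")))
print(classify(ValueError("ham")))
```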

Perhaps the most common gotchas related to exceptions involve the design guidelines of the prior section. Remember, try to avoid empty except clauses (or you may catch things like system exits), and overly-specific except clauses (use superclass categories instead, to avoid maintenance issues in the future).


Core Language Summary


Congratulations! This concludes your look at the core Python programming language. If you've gotten this far, you may consider yourself an Official Python Programmer (and should feel free to add Python to your resume the next time you dig it out). You've already seen just about everything there is to see in the language itself, all in much more depth than many practicing Python programmers. In Part II through Part VII of the book, you studied built-in types, statements, and exceptions, as well as the tools used to build up larger program units (functions, modules, and classes), and explored design issues, OOP, program architecture, and more.

From this point forward, your future Python career will largely consist of becoming proficient with the toolset available for application-level Python programming. You'll find this to be an ongoing task. The standard library, for example, contains some 200 modules, and the public domain offers more tools still. Because new tools appear constantly, it's possible to spend a decade or more becoming proficient in all these tools. We speak from personal experience here.

In general, Python provides a hierarchy of tool sets:

Built-ins: Built-in types like strings, lists, and dictionaries make it easy to write simple programs fast.
Python extensions: For more demanding tasks, you can extend Python, by writing your own functions, modules, and classes.
C extensions: Although not covered in this book, Python can also be extended with modules written in C or C++.

Because Python layers its tool sets, you can decide how deeply your programs need to delve into this hierarchy for any given task: use built-ins for simple scripts, add Python-coded extensions for larger systems, and code C extensions for advanced work. You've already covered the first two of these categories in this book, and that's plenty to do substantial programming in Python.

The next part of this book takes you on a tour of standard modules and common tasks in Python. Table 26-1 summarizes some of the sources of built-in or existing functionality available to Python programmers, and topics you'll explore in the remainder of this book. Up until now, most of our examples have been very small and self-contained. We wrote them that way on purpose to help you master the basics. But now that you know all about the core language, it's time to start learning how to use Python's built-in interfaces to do real work. You'll find that with a simple language like Python, common tasks are often much easier than you might expect.


Part VII Exercises


Since we're at the end of the core language coverage, we'll work on a few short exception exercises to give you a chance to practice the basics. Exceptions really are a simple tool, so if you get these, you've got exceptions mastered.

See Section B.7 for the solutions.

try/except. Write a function called oops that explicitly raises an IndexError exception when called. Then write another function that calls oops inside a try/except statement to catch the error. What happens if you change oops to raise KeyError instead of IndexError? Where do the names KeyError and IndexError come from? (Hint: recall that all unqualified names come from one of four scopes, by the LEGB rule.)
Exception objects and lists. Change the oops function you just wrote to raise an exception you define yourself, called MyError, and pass an extra data item along with the exception. You may identify your exception with either a string or a class. Then, extend the try statement in the catcher function to catch this exception and its data in addition to IndexError, and print the extra data item. Finally, if you used a string for your exception, go back and change it to a class instance; what now comes back as the extra data to the handler?
Error handling. Write a function called safe(func,*args) that runs any function using apply, catches any exception raised while the function runs, and prints the exception using the exc_type and exc_value attributes in the sys module. Then, use your safe function to run the oops function you wrote in exercises 1 and/or 2. Put safe in a module file called tools.py, and pass it the oops function interactively. What sort of error messages do you get? Finally, expand safe to also print a Python stack trace when an error occurs by calling the built-in print_exc( ) function in the standard traceback module (see the Python library reference manual for details).


Chapter 27: Common Tasks in Python


At this point in the book, you have been exposed to a fairly complete survey of the more formal aspects of the language (the syntax, the data types, etc.). In this chapter, we'll "step out of the classroom" by looking at a set of basic computing tasks and examining how Python programmers typically solve them, hopefully helping you ground the theoretical knowledge with concrete results.

Python programmers don't like to reinvent wheels when they already have access to nice, round wheels in their garage. Thus, the most important content in this chapter is the description of selected tools that make up the Python standard library—built-in functions, library modules, and their most useful functions and classes. While you most likely won't use all of these in any one program, no useful program avoids all of these. Just as Python provides a list object type because sequence manipulations occur in all programming contexts, the library provides a set of modules that will come in handy over and over again. Before designing and writing any piece of generally useful code, check to see if a similar module already exists. If it's part of the standard Python library, you can be assured that it's been heavily tested; even better, others are committed to fixing any remaining bugs—for free.

The goal of this chapter is to expose you to a lot of different tools, so that you know that they exist, rather than to teach you everything you need to know in order to use them. There are very good sources of complementary knowledge once you've finished this book. If you want to explore more of the standard library, the definitive reference is the Python Library Reference, currently over 600 pages long. It is the ideal companion to this book; it provides the completeness we don't have the room for, and, being available online, is the most up-to-date description of the standard Python toolset. Three other O'Reilly books provide excellent additional information: the Python Pocket Reference, written by Mark Lutz, which covers the most important modules in the standard library, along with the syntax and built-in functions in compact form; Fredrik Lundh's


Exploring on Your Own


Before digging into specific tasks, we should say a brief word about self-exploration. We have not been exhaustive in coverage of object attributes or module contents in order to focus on the most important aspects of the objects under discussion. If you're curious about what we've left out, you can look it up in the Library Reference, or you can poke around in the Python interactive interpreter, as shown in this section.

The dir built-in function returns a list of all of the attributes of an object, and, along with the type built-in, provides a great way to learn about the objects you're manipulating. For example:

>>> dir([  ])                             # What are the attributes of lists?
['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', 
'__delslice__', '__doc__', '__eq__', '__ge__', '__getattribute__', '__getitem__',
'__getslice__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', 
'__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', 
'__repr__', '__rmul__', '__setattr__', '__setitem__', '__setslice__', '__str__', 
'append', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']

What this tells you is that the empty list object has a few methods: append, count, extend, index, insert, pop, remove, reverse, sort, and a lot of "special methods" that start with an underscore (_) or two (__). These are used under the hood by Python when performing operations like +. Since these special methods are not needed very often, we'll write a simple utility function that will not display them:

>>> def mydir(obj):
...     orig_dir = dir(obj)
...     return [item for item in orig_dir if not item.startswith('_')]
...     
>>>

Using this new function on the same empty list yields:

>>> mydir([  ])                             # What are the attributes of lists?
['append', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']

You can then explore any Python object:

>>> mydir((  ))                                # What are the attributes of tuples?
[  ]                                        # Note: no "normal" attributes
>>>
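The same utility runs unchanged on Python 3, as sketched below, although the exact attribute lists differ between versions. On recent versions, for instance, tuples report count and index rather than an empty list.

```python
def mydir(obj):
    """Return the attribute names of obj that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


print(mydir([]))     # List methods such as append and sort
print(mydir(()))     # Tuple methods; contents depend on the Python version
```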


Conversions, Numbers, and Comparisons


While we've covered data types, one of the common issues when dealing with any type system is how one converts from one type to another. These conversions happen in a myriad of contexts—reading numbers from a text file, computing integer averages, interfacing with functions that expect different types than the rest of an application, etc.

We've seen in previous chapters that we can create a string from a nonstring object by simply passing the nonstring object to the str string constructor. Similarly, unicode converts any object to its Unicode string form and returns it.

In addition to the string creation functions, we've seen list and tuple, which take sequences and return list and tuple versions of them, respectively. int, complex, float, and long take any number and convert it to their respective types. int, long, and float have additional features that can be confusing. First, int and long truncate their numeric arguments, if necessary, to perform the operation, thereby losing information and performing a conversion that may not be what you want (the round built-in rounds numbers the standard way and returns a float). Second, int, long, and float can also convert strings to their respective types, provided the strings are valid integer (or long, or float) literals. Literals are the text strings that are converted to numbers early in the Python compilation process. So, the string 1244 in your Python program file (which is necessarily a string) is a valid integer literal, but def foo( ): isn't.

>>> int(1.0), int(1.4), int(1.9), round(1.9), int(round(1.9))
(1, 1, 1, 2.0, 2)
>>> int("1")
1
>>> int("1.2")                             # This doesn't work.
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: invalid literal for int(  ): 1.2

What's a little odd is that the rule about conversion (if it's a valid integer literal) is more important than the feature about truncating numeric arguments, thus:

>>> int("1.0")                               # Neither does this
Traceback (most recent call last):           # since 1.0 is also not a valid 
  File "<stdin>", line 1, in ?               # integer literal.
ValueError: invalid literal for int(  ): 1.0
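One way around this, sketched here in Python 3 syntax, is to fall back to parsing the string as a float before truncating. The helper name to_int is hypothetical, and strings that are not numeric at all still raise ValueError.

```python
def to_int(text):
    """Convert text such as "1", "1.0", or "1.9" to an int, truncating any fraction."""
    try:
        return int(text)           # Works only for valid integer literals
    except ValueError:
        return int(float(text))    # Parse as a float first, then truncate


print(to_int("1"), to_int("1.0"), to_int("1.9"))
```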


Manipulating Strings


The vast majority of programs perform string operations. We've covered most of the properties and variants of string objects in Chapter 5, but there are two areas that we haven't touched on thus far: the string module and regular expressions. As we'll see, the first is simple and mostly a historical note, while the second is complex and powerful.

The string module is somewhat of a historical anomaly. If Python were being designed today, the string module would not exist; it is mostly a remnant of a less civilized age before everything was a first-class object. Nowadays, string objects have methods like split and join, which replace the functions that are still defined in the string module. The string module does define a convenient function, maketrans, used to build the tables that drive string "mapping" operations with the translate method of string objects. maketrans/translate is useful when you want to translate several characters in a string at once. For example, suppose you want to replace all occurrences of the space character with an underscore, change underscores to minus signs, and change minus signs to plus signs. Doing so with repeated .replace( ) operations is in fact quite tricky, because each later replacement also rewrites the output of the earlier ones, but doing it with maketrans is trivial:

>>> import string
>>> conversion = string.maketrans(" _-", "_-+")
>>> input_string = "This is a two_part - one_part"
>>> input_string.translate(conversion)
'This_is_a_two-part_+_one-part'
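In Python 3, maketrans is a static method of str rather than a function in the string module. The sketch below, written under that assumption, repeats the translation and also shows what goes wrong with a naive chain of replace calls; the variable names are hypothetical.

```python
text = "This is a two_part - one_part"

# One pass: each character is looked up once in the table.
table = str.maketrans(" _-", "_-+")
translated = text.translate(table)

# Chained replaces: later steps also rewrite the output of earlier ones.
chained = text.replace(" ", "_").replace("_", "-").replace("-", "+")

print(translated)
print(chained)
```

The chained version turns every space, underscore, and minus sign into a plus sign, which is why the single-pass table is the safer tool here.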

In addition, the string module defines a few useful constants, which haven't been implemented as string attributes yet. These are shown in Table 27-2.

Table 27-2: String module constants
Constant name	Value
`digits`	`'0123456789'`


Data Structure Manipulations


One of Python's greatest features is that it provides the list, tuple, and dictionary built-in types. They are so flexible and easy to use that once you've grown used to them, you'll find yourself reaching for them automatically. While we covered all of the operations on each data structure as we introduced them, now's a good time to go over tasks that can apply to all data structures, such as how to make copies, sort objects, randomize sequences, etc. Many functions and algorithms (theoretical procedures describing how to implement a complex task in terms of simpler basic tasks) are designed to work regardless of the type of data being manipulated. It is therefore useful to know how to do generic things for all data types.

Making copies of objects is a reasonable task in many programming contexts. Often, the only kind of copy that's needed is just another reference to an object, as in:

x = 'tomato'
y = x                   # y is now 'tomato'.
x = x + ' and cucumber' # x is now 'tomato and cucumber', but y is unchanged.

Due to Python's reference management scheme, the statement a = b doesn't make a copy of the object referenced by b; instead, it makes a new reference to that same object. When the object being copied is an immutable object (e.g., a string), there is no real difference. When dealing with mutable objects like lists and dictionaries, however, sometimes a real, new copy of the object, not just a shared reference, is needed. How to do this depends on the type of the object in question. The simplest way of making a copy is to use the list( ) or tuple( ) constructors:

newList = list(myList)
newTuple = tuple(myTuple)
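The difference between a shared reference and a real copy only shows up once the object is changed in place. A small sketch, using hypothetical variable names, makes this visible:

```python
my_list = [1, 2, 3]

alias = my_list                # Another reference to the same list object
new_list = list(my_list)       # A new list holding the same items

my_list.append(4)              # In-place change to the original

print(alias)                   # Sees the change: it is the same object
print(new_list)                # Unaffected: it is a separate list
```

Note that list( ) makes a shallow copy: the new list is a distinct object, but any mutable items inside it are still shared with the original.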

As opposed to the simplest, the most common way to make copies of sequences like lists and tuples is somewhat odd. If myList is a list, then to make a copy of it, you can use:

newList = myList[:]

which you can read as "slice from beginning to end," since the default index for the start of a slice is the beginning of the sequence (0), and the default index for the end of a slice is the end of sequence (see Chapter 3). Since tuples support the same slicing operation as lists, this same technique can also be applied to tuples, except that if


Manipulating Files and Directories


So far, so good: we know how to create objects, we can convert between different data types, and we can perform various kinds of operations on them. In practice, however, as soon as one leaves the computer science classroom one is faced with tasks that involve manipulating data that lives outside of the program and performing processes that are external to Python. That's when it becomes very handy to know how to talk to the operating system, explore the filesystem, and read and modify files.

The os module provides a generic interface to the operating system's most basic set of tools. Different operating systems have different behaviors. This is true at the programming interface as well. This makes it hard to write so-called "portable" programs, which run well regardless of the operating system. Having generic interfaces independent of the operating system helps, as does using an interpreted language like Python. The specific set of calls the os module defines depends on which platform you use. (For example, the permission-related calls are available only on platforms that support them, such as Unix and Windows.) Nevertheless, it's recommended that you always use the os module, instead of the platform-specific versions of the module (called by such names as posix, nt, and mac). Table 27-4 lists some of the most often used functions in the os module. When referring to files in the context of the os module, one is referring to filenames, not file objects.
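As a brief illustration in Python 3 syntax, the sketch below asks the os module for the current working directory and its contents. Both calls return plain strings (filenames), not file objects; the variable names are hypothetical and the output depends on where the script is run.

```python
import os

cwd = os.getcwd()              # Current working directory, as a string
entries = os.listdir(cwd)      # Names of the entries in it, also strings

print(cwd)
print(len(entries), "entries")
```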

Table 27-4: Most frequently used functions from the os module
Function name	Behavior
`getcwd( )`	Returns a string referring to the current working directory (`cwd`): >>> print os.getcwd( ) h:\David\book


Internet-Related Modules


Python is used in a wide variety of Internet-related tasks, from making web servers to crawling the Web to "screen-scraping" web sites for data. This section briefly describes the modules most often used for such tasks that ship with Python's core. For more detailed examples of their use, we recommend Lundh's Standard Python Library and Martelli and Ascher's Python Cookbook (O'Reilly). There are many third-party add-ons worth knowing about before embarking on a significant web- or Internet-related project.

Python programs often process forms from web pages. To make this task easy, the standard Python distribution includes a module called cgi. Chapter 28 includes an example of a Python script that uses the CGI, so we won't cover it any further here.

Uniform resource locators (URLs) are strings such as https://www.python.org that are now ubiquitous. Three modules (urllib, urllib2, and urlparse) provide tools for processing URLs.
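In Python 3, the functionality of urlparse lives in the urllib.parse module. The sketch below, written under that assumption, splits a URL into its parts and resolves a relative reference without touching the network; the path, query, and fragment in the sample URL are made up for illustration.

```python
from urllib.parse import urljoin, urlparse

# Split a URL into its named components.
parts = urlparse("https://www.python.org/doc/index.html?lang=en#top")
print(parts.scheme, parts.netloc, parts.path, parts.query, parts.fragment)

# Resolve a relative reference against a base URL.
joined = urljoin("https://www.python.org/doc/index.html", "faq.html")
print(joined)
```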

The urllib module defines a few functions for writing programs that must be active users of the Web (robots, agents, etc.). These are listed in Table 27-9.

Table 27-9: Functions of the urllib module
Function name	Behavior
`urlopen(url [, data])`	Opens (for reading) a network object denoted by a URL; it can also open local files: >>> page = urlopen('https://www.python.org') >>> page.readline( ) '<HTML>\012' >>> page.readline( ) '<!-- THIS PAGE IS AUTOMATICALLY GENERATED.DO NOT EDIT. -->\012'
`urlretrieve(url [, filename][, hook])`	Copies a network object denoted by a URL to a local file (uses a cache):


Executing Programs


The last set of built-in functions in this section has to do with creating, manipulating, and calling Python code. See Table 27-13.

Table 27-13: Ways to execute Python code
Name	Behavior
`import`	Executes the code in a module as part of the importing and binds it to a name in the scope in which it is executed. You can choose the name it is bound to by using the `import` modulename `as` name form.
`exec code [ in globaldict [, localdict]]`	Executes the specified code (string, file, or compiled code object) in the optionally specified global and local namespaces. This is sometimes useful when reading programs from user-entered code as in an interactive shell or "macro" window.
`compile(string, filename, kind)`	Compiles the string into a code object. This function is only useful as an optimization.
`execfile(filename[, globaldict[, localdict]])`	Executes the program in the specified filename, using the optionally specified global and local namespaces. This function is sometimes useful in systems which use Python as an extension language for the users of the system.
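In Python 3, exec is a built-in function rather than a statement, and execfile no longer exists. With that caveat, the sketch below shows compile and exec working together with an explicit namespace dictionary; the source string and the names values, scale, and total are hypothetical.

```python
source = "total = sum(values) * scale"

# Compile once into a code object; "<example>" is just a label for tracebacks.
code = compile(source, "<example>", "exec")

# Run the code object in an explicit namespace instead of the caller's scope.
namespace = {"values": [1, 2, 3], "scale": 10}
exec(code, namespace)

print(namespace["total"])
```

Passing a dictionary keeps the executed code's names out of the surrounding program, which is usually what you want when the code comes from a user.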


Debugging, Testing, Timing, Profiling


To wrap up our overview of common Python tasks, we'll cover some tasks that are common for Python programmers even though they're not programming tasks per se—debugging, testing, timing, and optimizing Python programs.

The first task is, not surprisingly, debugging. Python's standard distribution includes a debugger called pdb. Using pdb is fairly straightforward: you import the pdb module and call its run function with the Python code the debugger should execute. For example, if you're debugging the program in spam.py, do this:

>>> import spam                        # Import the module we want to debug.
>>> import pdb                         # Import pdb.
>>> pdb.run('instance = spam.Spam(  )') # Start pdb with a statement to run.
> <string>(0)?(  )
(Pdb) break spam.Spam.__init__                # We can set break points.
(Pdb) next
> <string>(1)?(  )
(Pdb) n                                        # 'n' is short for 'next'.
> spam.py(3)__init__(  )
-> def __init__(self):
(Pdb) n
> spam.py(4)__init__(  )
-> Spam.numInstances = Spam.numInstances + 1
(Pdb) list                                     # Show the source code listing.
  1    class Spam:
  2        numInstances = 0
  3 B      def __init__(self):                  # Note the B for Breakpoint.
  4  ->        Spam.numInstances = Spam.numInstances + 1  # Where we are
  5        def printNumInstances(self):
  6            print "Number of instances created: ", Spam.numInstances
  7
[EOF]
(Pdb) where                                    # Show the calling stack.
<string>(1)?(  )
> spam.py(4)__init__(  )
-> Spam.numInstances = Spam.numInstances + 1
(Pdb) Spam.numInstances = 10          # Note that we can modify variables
(Pdb) print Spam.numInstances         # while the program is being debugged.
10
(Pdb) continue                        # This continues until the next break-
--Return--                            # point, but there is none, so we're
-> <string>(1)?(  )->None                 # done.
(Pdb) c
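
To try a similar session yourself, you need a spam.py to debug. The version below is a sketch reconstructed from the list output in the session above and adapted to Python 3 syntax (print as a function); the demo under the __main__ guard is an addition for illustration.

```python
# spam.py, reconstructed from the `list` output above and adapted to Python 3.
class Spam:
    numInstances = 0                  # class-level counter shared by all instances

    def __init__(self):
        Spam.numInstances = Spam.numInstances + 1

    def printNumInstances(self):
        print("Number of instances created: ", Spam.numInstances)


if __name__ == "__main__":
    # To step through this interactively, either run `python -m pdb spam.py`
    # or import the module and call pdb.run('instance = spam.Spam()').
    instance = Spam()
    instance.printNumInstances()
```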

Exercises

This chapter is full of programs we encourage you to type in and play with. However, here are a few more challenging exercises:

See Section B.8.1 for the solutions.

1. Avoiding regular expressions. Write a program that obeys the same requirements as pepper.py but doesn't use regular expressions to do the job. This is somewhat difficult, but a useful exercise in building program logic.
2. Wrapping a text file with a class. Write a class that takes a filename and reads the data in the corresponding file as text. Make it so that this class has three methods: paragraph, line, and word, each of which takes an integer argument, so that if mywrapper is an instance of this class, printing mywrapper.paragraph(0) prints the first paragraph of the file, mywrapper.line(-2) prints the next-to-last line in the file, and mywrapper.word(3) prints the fourth word in the file.
3. Describing a directory. Write a function that takes a directory name and describes the contents of the directory, recursively (in other words, for each file, print the name and size, and proceed down into any subdirectories).
4. Modifying the prompt. Modify your interpreter so that the prompt is, instead of the >>> string, a string describing the current directory and the count of the number of lines entered in the current Python session. Two hints: the prompt variables (e.g., sys.ps1) don't have to be strings but can be any object; and printing an instance can have side effects, and is done by calling the instance's __str__ method (or its __repr__ method if __str__ is not defined).
5. Writing a shell. Using the Cmd class in the cmd module and the functions described in this chapter for manipulating files and directories, write a little shell that accepts the standard Unix commands (or DOS commands): ls (dir) for listing the current directory, cd for changing directory, mv (or ren) for moving/renaming a file, and cp (copy) for copying a file.
6. Redirecting stdout. Modify the mygrep.py script to output to the last file specified on the command line instead of to the console.

Chapter 28: Frameworks

All the examples in this book so far have been quite small, and they may seem like toys compared to real-world applications. This chapter shows some of the frameworks that are available to Python programmers who wish to build such applications in some specific domains. A framework can be thought of as a domain-specific set of classes and expected patterns of interactions between these classes. We mention just three here: the COM framework for interacting with Microsoft's Component Object Model, the Tkinter graphical user interface (GUI), and the Swing Java GUI toolkit.

We will illustrate the power of frameworks using a hypothetical scenario, that of a small company's web site, and the need to collect, maintain, and respond to customer input about the product through a web form. We will describe three programs in this scenario. The first program is a web-based data entry form that asks the user to enter some information in their web browser, and then saves that information on disk. The second program uses the same data and automatically uses Microsoft Word to print out a customized form letter based on that information. The final example is a simple browser for the saved data built with the Tkinter module, which uses the Tk GUI, a powerful, portable toolkit for managing windows, buttons, menus, etc. Hopefully, these examples will make you realize how these kinds of toolkits, when combined with the rapid development power of Python, can truly let you build real applications fast. Each program builds on the previous one, so we strongly recommend that you read through each program, even if you don't wish to get them up and running on your computer.

The last section of this chapter covers Jython, the Java port of Python. The chapter closes with a medium-sized Jython program that allows users to manipulate mathematical functions graphically using the Swing toolkit.

The scenario for this example is that of a startup company, Joe's Toothpaste, Inc., which sells the latest in 100% organic, cruelty-free, tofu-based toothpaste. Since there is only one employee, and that employee is quite busy shopping for the best tofu he can find, the tube doesn't say "For customer complaints or comments, call 1-800-TOFTOOT," but instead, says, "If you have a complaint or wish to make a comment, visit our web site at www.toftoot.com." The web site has all the usual glossy pictures and an area where the customer can enter a complaint or comment. This page looks like Figure 28-1.

An Automated Complaint System

Figure 28-1: What the customer finds at https://www.toftoot.com/comment.html

The key parts of the HTML code that generated this page are:

<form method="post" action="https://toftoot.com/cgi-bin/feedback.py">
<ul><i>Please fill out the entire form:</i></ul>
<center><table width="100%" >
<tr>
    <td align="right" width="20%">Name:</td>
    <td>
        <input type="text" name="name" size="50" value="">
    </td>
</tr>
<tr>
    <td align="right">Email Address:</td>
    <td>
        <input type="text" name="email" size="50" value="">
    </td>
</tr>
<tr>
    <td align="right">Mailing Address:</td>
    <td>
       <input type="text" name="address" size="50" value="">
    </td>
</tr>
<tr>
    <td align="right">Type of Message:</td>
    <td>
        <input type="radio" name="type" checked 
               value="comment">comment&nbsp;</input>
        <input type="radio" name="type" 
               value="complaint">complaint</input>
    </td>
</tr>
<tr>
    <td align="right" valign="top">
         Enter the text in here:</td>
    <td><textarea name="text" rows="5" cols="50" value="">
        </textarea></td></tr>
<tr>
    <td></td>
    <td>
    <input type="submit" name="send" value="Send the feedback!">
    </td>
</tr>
</table></center>
</form>
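
To make concrete what a browser sends when this form is submitted, here is a minimal Python 3 sketch. It is not the book's feedback.py script: the sample values are invented, and it only shows how the field names in the HTML above become a URL-encoded request body that a server-side script can decode.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical values a customer might type into the form fields above.
form_fields = {
    "name": "Pat Example",
    "email": "pat@example.com",
    "address": "1 Main St",
    "type": "complaint",          # value of the selected radio button
    "text": "The tube leaks.",
    "send": "Send the feedback!",
}

# With method="post" and the default encoding, the browser sends the fields
# as a single URL-encoded string in the request body.
body = urlencode(form_fields)

# On the server side, the body can be decoded back into field values.
# parse_qs returns a list per field, because a name may appear more than once.
decoded = {key: values[0] for key, values in parse_qs(body).items()}

print(body)
print(decoded["type"], "from", decoded["name"])
```

The keys of the decoded dictionary are exactly the name attributes of the input elements, which is why those attributes matter more to the script than the visible labels do.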

We assume that you know enough about CGI and HTML to follow this discussion. The HTML code generates the web page shown in Figure 28-1:

The form line specifies what CGI program should be invoked when the form is submitted; specifically, the URL points to a script called

Interfacing with COM: Cheap Public Relations

At this point, we have a program that is run whenever users fill in the feedback form and that writes out instances of the feedback data to files on the server. We'll use this data to do two things. First, a program that's run periodically (say, at 2 a.m., every night) will look through the saved data, find out which saved pickled files correspond to complaints, and print out a customized letter to the complainer. The second use we'll make of that data is a GUI browser to look through the stored feedback entries. All this sounds sophisticated, but you'll be surprised at how simple it is using the right tools. Joe's web site is on a Windows machine, but other platforms work in similar ways.

Before we talk about how to write this program, a word about the technology it uses, namely Microsoft's Component Object Model (COM). COM is, among other things, a standard for interaction between programs, which allows COM-compliant programs to talk to, access the data in, and execute commands in other COM-compliant programs. Roughly speaking, the program doing the calling is called a COM client, and the program doing the executing is called a COM server. All major Microsoft products are COM-aware and can act as servers. Microsoft Word is one of these, and the one we'll use here, since Microsoft Word is just fine for writing letters, which is what we're doing. Luckily for us, Python can be made COM-aware as well, on Windows. Mark Hammond and Greg Stein have made available a set of extensions to Python for Windows called win32com, which allow Python programs to do almost everything you can do with COM from any other language. You can write COM clients, servers, ActiveX scripting hosts, debuggers, and more, all in Python. We only need to do the first of these tasks, which is also the simplest. The basic tasks that our form letter generator program needs to do are:

1. Open all of the pickled files in the appropriate directory and unpickle them to turn them back into Python objects.
2. For each unpickled instance, test if the feedback is a complaint. If it is, find out the name and address of the person who filled out the form and go on to Step 3. If not, skip it.
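
The first two steps can be sketched in a few lines of Python 3. This is only an illustration under stated assumptions: each saved entry is taken to be a pickled dictionary with type, name, and address keys (mirroring the form's field names) stored in a file ending in .pickle, whereas the book's program pickles class instances. The function names are hypothetical.

```python
import glob
import os
import pickle
import tempfile


def load_feedback(directory):
    """Step 1: unpickle every saved file in `directory`."""
    records = []
    for path in sorted(glob.glob(os.path.join(directory, "*.pickle"))):
        with open(path, "rb") as f:
            records.append(pickle.load(f))
    return records


def complaints_only(records):
    """Step 2: keep the name and address of entries marked as complaints."""
    return [(r["name"], r["address"]) for r in records if r["type"] == "complaint"]


# Demo with two made-up entries written to a temporary directory.
with tempfile.TemporaryDirectory() as directory:
    samples = [
        {"type": "comment", "name": "A. Fan", "address": "1 Main St"},
        {"type": "complaint", "name": "B. Cross", "address": "2 Side St"},
    ]
    for i, record in enumerate(samples):
        with open(os.path.join(directory, "%d.pickle" % i), "wb") as f:
            pickle.dump(record, f)
    to_write = complaints_only(load_feedback(directory))

print(to_write)
```

Only unpickle files you created yourself: pickle.load can execute arbitrary code from a maliciously crafted file.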

A Tkinter-Based GUI Editor for Managing Form Data

Let's recap: we wrote a CGI program (feedback.py) that takes the input from a web form and stores the information on disk on our server. We then wrote a program (formletter.py) that takes some of those files and generates apologies to those deserving them. The next task is to construct a program to allow a human to look at the comments and complaints, using the Tkinter toolkit to build a GUI browser for these files.

The Tkinter toolkit is a Python-specific interface to a non-Python GUI library called Tk. Tk is the GUI toolkit most commonly chosen by Python programmers because it provides professional-looking GUIs within a fairly easy-to-use system and because the Python/Tk interface comes with most Python distributions. The interfaces it generates don't look exactly like Windows, the Mac, or any Unix toolkit, but they look very close to each of them, and the same Python program works on all those platforms, an impossible task with any platform-specific toolkit. Two other portable toolkits worth considering are wxPython (https://www.wxPython.org) and PyQt.

Tk, therefore, is what we'll use in this example. It's a toolkit developed by John Ousterhout, originally as a companion to Tcl, another scripting language. Since then, Tk has been adopted by many other scripting languages including Python and Perl.

The goals of this program are simple: to display in a window a description of each feedback instance, allowing the user to select one to examine in greater detail (e.g., seeing the contents of the text widget). Furthermore, Joe wants to be able to discard items once they have been dealt with. A screenshot of the finished program in action is shown in Figure 28-4.

Figure 28-4: A sample screen dump of the feedbackeditor.py program

We'll work through one possible way of coding the program to manage form data. Our entire program, called feedbackeditor.py, is:

import feedback                              # Needed for feedback.DIRECTORY below.
from FormEditor import FormEditor
from feedback import FeedbackData, FormData
from Tkinter import mainloop
FormEditor("Feedback Editor", FeedbackData, feedback.DIRECTORY)
mainloop()                                   # Hand control to the Tk event loop.

Jython: The Felicitous Union of Python and Java

Jython is a version of Python written entirely in Java. Jython is very exciting for both the Python community and the Java community. Python users are happy that their current Python knowledge can be applied in Java-based projects; Java programmers are happy that they can use the Python scripting language as a way to control their Java systems, test libraries, and learn about Java libraries by working in a powerful interactive environment.

Jython is available from https://www.jython.org, with license and distribution terms similar to those of CPython (which is what the reference implementation of Python is called when contrasted with Jython).

The Jython installation includes several parts:

jython, which is the equivalent of the python program used throughout the book.
jythonc, which takes a Jython program and compiles it to Java class files. The resulting Java class files can be used as any Java class file can, for example, as applets, as servlets, or as beans.
A set of modules that provide the Jython user with the vast majority of the modules in the standard Python library.
A few programs demonstrating various aspects of Jython programming.

Using Jython is very similar to using Python:

~/book> jython
Jython 2.1 on java1.3.1_03 (JIT: null)
Type "copyright", "credits" or "license" for more information.
>>> 2 + 3
5

In fact, Jython works almost identically to CPython. For an up-to-date listing of the differences between the two, see https://www.jython.org/docs/differences.html. The most important differences are:

Jython is currently slower than CPython. How much slower depends on the test code used and on the Java Virtual Machine Jython is using.
Some of the built-ins or library modules aren't available for Jython. For example, the os.system( ) call is not implemented yet, as doing so is difficult given Java's interaction with the underlying operating system. Also, some of the largest extension modules, such as the Tkinter GUI framework, aren't available, because the underlying tools (the Tk/Tcl toolkit, in the case of Tkinter) aren't available in Java.

Exercises

Most of the topics in this chapter don't lend themselves to exercises without a fuller treatment of the frameworks involved. A couple of things can be done with the knowledge you already have, however:

See Section B.8.2 for the solutions.

1. Faking the Web. You may not have a web server running, which makes using formletter.py and FormEditor.py difficult, since they use data generated by the CGI script. As an exercise, write a program that creates files with the same properties as those created by the CGI script.
2. Cleaning up. There's a serious problem with the formletter.py program: namely, if it's run nightly, any complaint is going to cause a letter to be printed. That will happen every night, since there is no mechanism for indicating that a letter has been generated and that no more letters need be generated regarding that specific complaint. Fix this problem.
3. Adding parametric plotting to grapher.py. Modify grapher.py to allow the user to specify expressions that return both x and y values, instead of the current just y solution. For example, the user should be able to write in the Expression widget: sin(x/3.1),cos(x/6.15) (note the comma: this is a tuple!) and get a picture like that shown in Figure 28-7.
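
As a hint about the comma in the last exercise, the Python 3 sketch below shows that evaluating such an expression yields a tuple, which a plotting routine can tell apart from a single number. It is not grapher.py: the evaluate function and the ALLOWED table are hypothetical names, and restricting the names visible to eval is a convenience, not a security sandbox.

```python
import math

# Names the expression may use. Emptying __builtins__ limits what eval can
# reach, but this is not a real sandbox.
ALLOWED = {"sin": math.sin, "cos": math.cos, "pi": math.pi, "__builtins__": {}}


def evaluate(expression, x):
    """Return an (x, y) point for a plain or a parametric expression."""
    value = eval(expression, dict(ALLOWED, x=x))
    if isinstance(value, tuple):      # "a, b" evaluates to a two-item tuple
        return value
    return (x, value)                 # plain expression: it gives y only


print(evaluate("sin(x/3.1),cos(x/6.15)", 0.0))
print(evaluate("sin(x)", 0.0))
```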

Figure 28-7: Output of Exercise 3

Chapter 29: Python Resources

Programming languages, like natural languages, grow tight-knit communities: speakers of the same language have a natural affinity for each other. Programming languages, being the result of individual choice rather than accident of birth, lead to stronger feelings of kinship than might seem reasonable for what some see as a purely technological topic. Open source languages, which are never chosen because of a marketing campaign, but instead after a process of deliberation and comparison, seem to elicit even more enthusiasm (some would even say fanaticism) from their users. This chapter is about the community that defines itself as being "The Python Community," from the inner sanctum of people who dream about Python daily to the occasional Python user.

Because writing shareable Python code is so easy, much of this community shares their enthusiasm and their work, quite often in the form of yet more free software. The resulting snowball effect (or, to use a more trendy term, the "network effect") makes writing even large programs a snap compared to many other language choices. This chapter will point out some of the most valuable third-party offerings, from small modules to interfaces to operating system libraries to module repositories.

In this section, we will discuss the various layers of the community, from the core of very serious programmers who implement the official Python interpreter, out through the Python Software Foundation, special interest groups and user groups, and out to the broad spectrum of participants, which is known as "python-list." You'll probably find that you belong in one or more of these neighborhoods, and may want to visit some of the neighborhoods you don't yet know. Regardless of where you choose to settle, welcome to our community!

Unlike many language communities, the Python world has a very clear center. This center has grown over the years, with Guido van Rossum as the permanent core, surrounded physically by a trusty cohort called "Pythonlabs," and surrounded virtually by the "python-dev group." Pythonlabs consists of a few key Python developers (Tim Peters, Barry Warsaw, Jeremy Hylton, and Fred Drake) who were recruited by Guido to work with him on Python and Python-related projects, first at BeOpen.com, and then at Zope Corporation. Along with Guido, they have generally been the ones making the most radical changes to the Python internals, although there have been some notable exceptions from other contributors.

Layers of Community

Python is too big for so few people to manage and grow as fast as its users would like (especially as Python is only a part-time job, even for Guido). A supporting cast, generally referred to as the python-dev crowd after the mailing list that anchors the discussions, is available to help in design discussions, implementing, testing, and, most of all, arguing (all in good faith, though).

To most Python users, however, the work of Pythonlabs and Python-dev is gratefully acknowledged but somewhat mysterious. Many more people live in one of the outer layers of Pythondom, either virtual or physical (or, hopefully, both). We'll get back to the technically "deep" layers later—it's far more reasonable to learn about a community from the tourism bureau than from the city planning committee, however.

While most Python-related communications occur on the Internet, it's nice to ground the names with faces and accents, and to get a feel for the real-world personalities behind the online personas. There are two great ways to do that, each with their own benefits: user groups and conferences.

The Process

The topic of how Python is developed is a fascinating one, as Python is one of the better run open source projects, but it is mostly off-topic for this book. If you're interested in learning more (whether you think you can contribute to Python yourself or not), read some of the information at https://www.python.org/dev—you'll find everything from descriptions of the Python developer culture to specific technical details on how to contribute.

As a user of Python, however, you can play a role in the (unlikely) event that you find a bug in Python. If you do, you should reduce the code that's causing you headaches or not behaving according to specification to the bare minimum, and post it as a bug on Python's bug tracker. As of this writing, Python is using the bug tracker run by SourceForge, although there is talk of moving to something else. The bug manager is located at https://sourceforge.net/bugs/?group_id=5470, and instructions on how to submit a bug report are at https://www.python.org/doc/current/ext/reporting-bugs.html.

Services and Products

There are hundreds of thousands or perhaps millions of people using Python, and thousands of companies relying on it. Several commercial vendors, both corporations and individual consultants, provide support of various kinds to help people and companies work with Python, from training to development tools to on-call support. Not altogether surprisingly, the first author's main job is to teach Python to individuals and companies worldwide, and the second author's company provides developer tools and enterprise-level support for Python. Many other vendors exist; consult your usual channels to find the provider most suited to your needs.

The Legal Framework: The Python Software Foundation

Python is, at its core, a programming language. As a technology, Python needs an owner: a person or entity who can define it to the world, protect it from attack, and nurture its growth. While Guido van Rossum is the recognized father of Python, Guido doesn't want to be the sole person responsible for Python. The "what if Guido gets hit by a bus?" discussions have been addressed over the years by defining a legal entity called the Python Software Foundation (PSF) to act as Python's legal owner. Thus, all of the Python intellectual property is being assigned to the PSF. The PSF has received provisional non-profit status from the US IRS, thus making donations to the organization tax-deductible. The PSF is composed of individual members (invited by the existing membership because of their contribution to Python) and is funded by corporate sponsorships. Information about the PSF is available at https://www.python.org/psf/. Both of the authors are members of the PSF and the second author is a director of the PSF. Essentially, what this means is that the PSF is an organization we believe is important to the long-term health of Python.

Software

This book's task was to present the Python language, including brief overviews of some of the most important modules and libraries that come with Python. There are countless other such supporting software packages available, most of them for free, on the Internet. In this section, we give you pointers as to what this landscape of software looks like, what maps are available to help you find what you're looking for, and finally some notable software packages that help make Python such a high-value choice.

One of Python's weaknesses has been the lack of a single, authoritative repository for such third-party software. While there are volunteers working hard to solve that problem, the best we can do at time of writing is to point you to the several alternative methods that can be used to find out what's available and where.

It used to be hard to find things on the Internet. Some of us remember days before the Web, when word of mouth and secret handshakes seemed to be required to find particular pieces of software. These days, search engines like Google do 95% of the hard work. Regardless of the topic, searches on Google are very likely to get you what you want.

Software that's available on the Web has typically been announced in public, or at the very least discussed in public. You can search the various mailing lists mentioned above with specialized search engines, such as Google's Groups interface (although that doesn't cover all of the Python mailing lists, only those mirrored as newsgroups), https://python.org/search, or the mailing list archives at https://aspn.activestate.com/ASPN.

"The Vaults of Parnassus" is a fairly old (in Internet years) and well-established directory of Python software. It uses a library-style directory of Python software and Python-related tools. The vaults are at: https://py.vaults.ca/. Note that the vaults archive only metadata: finding something on the vaults is no guarantee that the pages it refers you to are still around, or that the information on the vaults is necessarily up to date.

A new project which, unlike some of its predecessors, seems likely to succeed is called PyPI (Python Package Index). Hosted at

Popular Third-Party Software

In this section, we list some of the most popular third-party add-ons to Python. Some are small yet deeply useful modules; others are full-fledged applications with massive internal complexity. Each is what we consider a good tool.

URLs change, so don't be disappointed if the URLs we mention are no longer valid by the time you type them in. Instead, go to Google and type python followed by the name of the package; you're more than likely to find it.

While each operating system provides a wide variety of interfaces, Unix and related operating systems like Linux tend to provide that interface through command-line tools and special-purpose files, both of which tend to vary too much across versions to allow for useful programmatic interfaces. Windows and Macintosh use a more API-oriented approach, and as a result make the operating system more naturally accessible from a programming language like Python. There are Python interfaces to pretty much every corner of Windows and the Macintosh APIs.

Section 29.6.1.1: Windows

Core Python comes with some interfaces to basic Windows facilities, such as the os module for basic operating system functions and the _winreg low-level API to the Windows Registry. Serious Windows programming, however, requires access to many more Windows libraries. Most of these are exposed by the win32all package by Mark Hammond. win32all is available either as an add-on to the Python distribution from https://www.python.org, or bundled as part of ActivePython from ActiveState (https://www.ActiveState.com/Python). win32all also includes a Windows-only IDE for Python, Pythonwin.

Not all Windows APIs are exposed by win32all. Should you wish to use one of these, you can use Thomas Heller's ctypes module, which provides a foreign function interface from Python to dynamically loaded shared libraries. ctypes is described below.
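
To give a feel for what a foreign function interface looks like, here is a small sketch using ctypes as it ships with later Python releases (it joined the standard library in Python 2.5). Rather than a Windows API, it calls a C function exported by the Python runtime itself, chosen only because that library is already loaded on every platform; the variable names are illustrative.

```python
import ctypes
import sys

# ctypes.pythonapi is a handle on the already-loaded Python runtime library.
get_version = ctypes.pythonapi.Py_GetVersion
get_version.restype = ctypes.c_char_p     # the C function returns a char*
get_version.argtypes = []                 # and takes no arguments

version = get_version().decode("ascii")
print(version)

# The C-level answer should agree with what Python itself reports.
assert version.startswith("%d.%d" % sys.version_info[:2])
```

Declaring restype and argtypes matters: without them ctypes assumes an int return value, which would turn the returned pointer into a meaningless number.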

Section 29.6.1.2: Macintosh

The Macintosh port of Python, maintained by Jack Jansen, and available at https://www.cwi.nl/~jack/macpython.html, comes in two kinds as of this writing. There is a new version that runs from the Mac OS X command line, as well as a version which runs on Mac OS 9 or OS X, although there are plans to merge the two. The documentation for the Macintosh library, the

Web Application Frameworks

Anyone who has used the Internet recently knows that it's very possible to develop full-featured applications that happen to use web browsers as their GUI. In the last few years, a plethora of web application development frameworks for Python have been developed. They span a huge gamut, from simple hacks to elaborate systems. Many allow embedding of Python code in HTML, others generate HTML from Python code. Many provide support for persistence, cookie management, URL rewriting, and more. A comparative review of all of these is far beyond the scope of this book. We'll simply mention some of the more commonly used frameworks.

Zope is the grand-daddy of Python web application frameworks. While Zope is open source, it is very much the product of a company called Zope Corp., in collaboration with a vast user community. Zope is a very powerful tool for building content management systems, including such advanced features as replication, transactional support, sophisticated security models and workflow. Zope often stumps people who expect it to be a simple system. While efforts are under way to redesign part of Zope to make learning Zope easier, those who find Zope most useful are typically those with very large or complex websites to build. Information about the Zope software is available at https://www.zope.org, and information about Zope Corp. is at https://www.zope.com.

If Zope is the web application server for sites with sophisticated workflow, Twisted is more of a Swiss Army knife for networked application development. Not strictly focused on the web and the HTTP/HTML standards that anchor the web, Twisted is a framework for building networked applications. Using Twisted, it is relatively easy to build high-performance clients and servers for any protocol, from instant messaging to IRC to HTTP to NNTP. The Twisted framework, like all frameworks, has a learning curve. Those who do learn it tend to be passionate about it, and it seems to perform admirably. Twisted's home is https://www.twistedmatrix.com.

Quixote is a dynamic web application framework built by Python programmers for Python programmers. Unlike many of the alternatives such as Zope, Quixote deliberately does not try to cater to web designers. To use Quixote means to program in Python, even for HTML generation. To those of us who are more comfortable with Python modules and classes than with HTTP redirects, however, that's a great benefit. Quixote has wonderfully clear documentation (at least if you are comfortable with Python) at

Tools for Python Developers

Given the large number of Python developers and the large number of Python programs that need editing and debugging, there is a wide variety of editing and development tools available for Python programmers. They range from customization files for free general-purpose editors like Emacs and Vim to specialized integrated development environments. They can be free, such as Idle (which comes with Python) and Pythonwin (part of win32all and ActivePython), or commercial products like Archaeopteryx's Wing IDE and ActiveState's Komodo and Visual Python .NET. Be sure to research what's available at the time you need it, as this is an area where new tools and new revisions of existing tools show up fairly often.

Appendix A: Installation and Configuration

This appendix provides additional installation and configuration details, as a resource for people new to such topics.

Section A.1: Installing the Python Interpreter

Additional content appearing in this section has been removed.

Installing the Python Interpreter

Because you need the Python interpreter to run Python scripts, your first step to using Python is usually installing Python. Unless a Python is already available on your machine, you'll need to fetch, install, and possibly configure a Python on your computer. You only need to do this once per machine, and perhaps not at all, if you will be running a frozen binary.

First off, before doing anything, make sure you don't already have a recent Python on your machine. For instance, if you are working on Linux, or some Unix systems, Python is probably already installed. Type python at a shell prompt and see what happens; alternatively, try searching the usual places (/usr/bin, /usr/local/bin, etc.). On Windows, check if there is a Python entry in the programs menu you find in your Start button, at the bottom left of the screen. Make sure the Python you find is version 2.2 or later; you'll need that to run some of the examples in this edition.
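The version check described above can also be done from inside the interpreter. The following is a minimal sketch, written so that it runs on current Python 3 interpreters (an assumption, since this edition targets Python 2.2); the names `MINIMUM`, `found`, and `is_recent_enough` are illustrative only.

```python
import sys

# Hypothetical threshold taken from the text: version 2.2 or later.
MINIMUM = (2, 2)

# version_info behaves like a tuple, so (major, minor) can be compared directly.
found = tuple(sys.version_info[:2])
is_recent_enough = found >= MINIMUM

print("Python %d.%d found; recent enough: %s"
      % (found[0], found[1], is_recent_enough))
```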

If there is no Python to be found, you will need to install one yourself. You can always fetch the latest and greatest standard Python release from https://www.python.org, Python's official web site; look for the Downloads link on that page, and grab a release for the platform you will be working on. There, you'll find prebuilt Python executables (unpack and run); self-installer executables for Windows (click to install); RPMs for Linux (unpack with rpm); the full source-code distribution (compile on your machine to generate an interpreter); and more. For some platforms such as PalmOS and PocketPC, Python's web site links to an offsite page where these versions are maintained.

You can also find Python on CD-ROMs supplied with Linux distributions, included with some products and computer systems, sold by commercial outlets such as Dr. Dobb's Journal, and enclosed with other Python books. These tend to lag behind the current release somewhat, but usually not seriously so.

In addition, a company called ActiveState also distributes Python, as part of its ActivePython package. This package combines standard CPython with extensions for Windows development, an IDE called PythonWin (described in Chapter 3), and other commonly used extensions. See ActiveState's web site,

Additional content appearing in this section has been removed.

Appendix B: Solutions to Exercises

Section B.1: Part I, Getting Started

Section B.2: Part II, Types and Operations

Section B.3: Part III, Statements and Syntax

Section B.4: Part IV, Functions

Section B.5: Part V, Modules

Section B.6: Part VI, Classes and OOP

Section B.7: Part VII, Exceptions and Tools

Section B.8: Part VIII, The Outer Layers

Additional content appearing in this section has been removed.

See Section 3.13 for the exercises.

Interaction. Assuming Python is configured properly, interaction should look something like the following. You can run this any way you like: in IDLE, from a shell prompt, and so on:
```
% python
...copyright information lines...
>>> "Hello World!"
'Hello World!'
>>>                     # Ctrl-D or Ctrl-Z to exit, or window close
```
Programs. Your code (i.e., module) file module1.py and shell interactions should look like:
```
print 'Hello module world!'
% python module1.py
Hello module world!
```
Again, feel free to run this other ways—by clicking its icon, by IDLE's Edit/RunScript menu option, and so on.
Modules. The following interaction listing illustrates running a module file by importing it.
```
% python
>>> import module1
Hello module world!
>>>
```
Remember that you need to reload the module to run it again without stopping and restarting the interpreter. The question about moving the file to a different directory and importing it again is a trick question: if Python generates a module1.pyc file in the original directory, it uses that when you import the module, even if the source code file (.py) has been moved to a directory not on Python's search path. The .pyc file, which contains the compiled byte-code version of a module, is written automatically if Python has access to the source file's directory. See Part V for more on modules.
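The import-once, reload-to-rerun behavior can be checked without an interactive session. This is a rough sketch for Python 3, where reload lives in importlib rather than being a built-in (an assumption relative to the Python 2.2 text); the module name `demo_module1` and the temporary directory are hypothetical stand-ins for module1.py.

```python
import contextlib
import importlib
import io
import os
import sys
import tempfile

# Create a throwaway module file in a temporary directory.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "demo_module1.py"), "w") as f:
    f.write("print('Hello module world!')\n")
sys.path.insert(0, tmpdir)

buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    import demo_module1               # First import runs the file's code.
    import demo_module1               # Second import is a no-op: already loaded.
    importlib.reload(demo_module1)    # Reload runs the file's code again.

lines = buffer.getvalue().splitlines()
print(lines)                          # The message appears twice, not three times.
```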
Scripts. Assuming your platform supports the #! trick, your solution will look like the following (although your #! line may need to list another path on your machine):
```
#!/usr/local/bin/python          (or #!/usr/bin/env python)
print 'Hello module world!'
% chmod +x module1.py
% module1.py
Hello module world!
```
Errors. The interaction below demonstrates the sort of error messages you get when you complete this exercise. Really, you're triggering Python exceptions; the default exception handling behavior terminates the running Python program and prints an error message and stack trace on the screen. The stack trace shows where you were in a program when the exception occurred. In Part VII, you will learn that you can catch exceptions using

Additional content appearing in this section has been removed.

See Section 7.10 for the exercises.

The basics. Here are the sort of results you should get, along with a few comments about their meaning. Note that ; is used in a few of these to squeeze more than one statement on a single line; the ; is a statement separator.

Numbers
>>> 2 ** 16              # 2 raised to the power 16
65536
>>> 2 / 5, 2 / 5.0       # Integer / truncates, float / doesn't
(0, 0.40000000000000002)
Strings
>>> "spam" + "eggs"      # Concatenation
'spameggs'
>>> S = "ham"
>>> "eggs " + S
'eggs ham'
>>> S * 5                # Repetition
'hamhamhamhamham'
>>> S[:0]                # An empty slice at the front--[0:0]
''
>>> "green %s and %s" % ("eggs", S)  # Formatting
'green eggs and ham'
Tuples
>>> ('x',)[0]                        # Indexing a single-item tuple
'x'
>>> ('x', 'y')[1]                    # Indexing a 2-item tuple
'y'
Lists
>>> L = [1,2,3] + [4,5,6]            # List operations
>>> L, L[:], L[:0], L[-2], L[-2:]
([1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6], [  ], 5, [5, 6])
>>> ([1,2,3]+[4,5,6])[2:4]
[3, 4]
>>> [L[2], L[3]]                       # Fetch from offsets; store in a list
[3, 4]
>>> L.reverse(  ); L                   # Method: reverse list in-place
[6, 5, 4, 3, 2, 1]
>>> L.sort(  ); L                      # Method: sort list in-place
[1, 2, 3, 4, 5, 6]
>>> L.index(4)                         # Method: offset of first 4 (search)
3
Dictionaries
>>> {'a':1, 'b':2}['b']              # Index a dictionary by key.
2
>>> D = {'x':1, 'y':2, 'z':3}
>>> D['w'] = 0                       # Create a new entry.
>>> D['x'] + D['w']
1
>>> D[(1,2,3)] = 4                   # A tuple used as a key (immutable)
>>> D
{'w': 0, 'z': 3, 'y': 2, (1, 2, 3): 4, 'x': 1}
>>> D.keys(  ), D.values(  ), D.has_key((1,2,3))          # Methods
(['w', 'z', 'y', (1, 2, 3), 'x'], [0, 3, 2, 4, 1], 1)
Empties
>>> [[  ]], ["",[  ],(  ),{  },None]         # Lots of nothings: empty objects
([[  ]], ['', [  ], (  ), {  }, None])
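One result in the Numbers listing depends on the Python version. The sketch below assumes a Python 3 interpreter, where `/` always performs true division and `//` is the truncating (floor) operator; the variable names are illustrative.

```python
# Python 3 semantics (an assumption; the listing above is Python 2.2).
true_quotient = 2 / 5        # True division, even for two integers.
floor_quotient = 2 // 5      # Floor division drops the fractional part.
float_quotient = 2 / 5.0     # A float operand gives true division in both versions.

print(true_quotient, floor_quotient, float_quotient)
```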

Indexing and slicing. Indexing out-of-bounds (e.g., L[4]) raises an error; Python always checks to make sure that all offsets are within the bounds of a sequence.
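The bounds check can be seen in a short script. This is a sketch for Python 3 using a hypothetical four-item list, so that `L[4]` is one past the end; it also shows that slices, unlike single-item indexing, are clipped to the sequence's bounds rather than raising an error.

```python
L = [1, 2, 3, 4]              # Hypothetical list; valid offsets are 0..3.

try:
    L[4]                      # One past the end: indexing is bounds-checked.
except IndexError as exc:
    index_error = str(exc)
else:
    index_error = None

clipped = L[-1000:100]        # Out-of-bounds slice limits are scaled to fit.

print(index_error)
print(clipped)
```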

Additional content appearing in this section has been removed.

See Section 11.3 for the exercises.

Coding basic loops. As you work through this exercise, you'll wind up with code that looks like the following:

>>> S = 'spam'
>>> for c in S:
...     print ord(c)
...
115
112
97
109
>>> x = 0
>>> for c in S: x = x + ord(c)        # Or: x += ord(c)
...
>>> x
433
>>> x = [  ]
>>> for c in S: x.append(ord(c))
...
>>> x
[115, 112, 97, 109]
>>> map(ord, S)
[115, 112, 97, 109]

Backslash characters. The example prints the bell character (\a) 50 times; assuming your machine can handle it, and when run outside of IDLE, you may get a series of beeps (or one long tone, if your machine is fast enough). Hey—we warned you.
Sorting dictionaries. Here's one way to work through this exercise (see Chapter 6 if this doesn't make sense). Remember, you really do have to split the keys and sort calls up like this, because sort returns None. In Python 2.2, you can iterate through dictionary keys directly without calling keys (e.g., for key in D:), but the keys will not be sorted as they are by this code:
```
>>> D = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6, 'g':7}
>>> D
{'f': 6, 'c': 3, 'a': 1, 'g': 7, 'e': 5, 'd': 4, 'b': 2}
>>>
>>> keys = D.keys(  )
>>> keys.sort(  )
>>> for key in keys:
...     print key, '=>', D[key]
...
a => 1
b => 2
c => 3
d => 4
e => 5
f => 6
g => 7
```
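Later Python releases (2.4 and up) added a `sorted` built-in that returns a new sorted list, which removes the need for the separate keys and sort steps. The following sketch assumes Python 3 and is an alternative to the solution above, not a replacement for it; the `lines` list exists only to collect the output.

```python
D = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 6, 'g': 7}

lines = []
for key in sorted(D):                     # sorted() iterates the keys and returns a list.
    lines.append('%s => %s' % (key, D[key]))

print('\n'.join(lines))
```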

Program logic alternatives. Here's sample code for the solutions. Your results may vary a bit; this exercise is mostly designed to get you playing with code alternatives, so anything reasonable gets full credit:

L = [1, 2, 4, 8, 16, 32, 64]
X = 5
i = 0
while i < len(L):
    if 2 ** X == L[i]:
        print 'at index', i
        break
    i = i+1
else:
    print X, 'not found'
        
L = [1, 2, 4, 8, 16, 32, 64]
X = 5
for p in L:
    if (2 ** X) == p:
        print (2 ** X), 'was found at', L.index(p)
        break
else:
    print X, 'not found'

L = [1, 2, 4, 8, 16, 32, 64]
X = 5
if (2 ** X) in L:
    print (2 ** X), 'was found at', L.index(2 ** X)
else:
    print X, 'not found'
        
X = 5
L = [  ]
for i in range(7): L.append(2 ** i)
print L
if (2 ** X) in L:
    print (2 ** X), 'was found at', L.index(2 ** X)
else:
    print X, 'not found'
        
X = 5
L = map(lambda x: 2**x, range(7))
print L
if (2 ** X) in L:
    print (2 ** X), 'was found at', L.index(2 ** X)
else:
    print X, 'not found'
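One more alternative in the same spirit, sketched here for Python 3: a list comprehension builds the powers-of-two list, and `enumerate` supplies the index during the search so that no separate `index` call is needed. The names `target` and `message` are illustrative.

```python
X = 5
L = [2 ** i for i in range(7)]            # Builds [1, 2, 4, 8, 16, 32, 64].
target = 2 ** X

for i, p in enumerate(L):                 # enumerate yields (offset, item) pairs.
    if p == target:
        message = '%d was found at %d' % (target, i)
        break
else:                                     # Runs only if the loop never hit break.
    message = '%d not found' % X

print(message)
```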

Additional content appearing in this section has been removed.

See Section 14.9 for the exercises.

The basics. There's not much to this one, but notice that using print (and hence your function) is technically a polymorphic operation, which does the right thing for each type of object:
```
% python
>>> def func(x): print x
...
>>> func("spam")
spam
>>> func(42)
42
>>> func([1, 2, 3])
[1, 2, 3]
>>> func({'food': 'spam'})
{'food': 'spam'}
```
Arguments. Here's a sample solution. Remember that you have to use print to see results in the test calls, because a file isn't the same as code typed interactively; Python doesn't normally echo the results of expression statements in files.
```
def adder(x, y):
    return x + y
print adder(2, 3)
print adder('spam', 'eggs')
print adder(['a', 'b'], ['c', 'd'])
% python mod.py
5
spameggs
['a', 'b', 'c', 'd']
```
varargs. Two alternative adder functions are shown in the following file, adders.py. The hard part here is figuring out how to initialize an accumulator to an empty value of whatever type is passed in. The first solution uses manual type testing: it starts from zero if the first argument is an integer, and from an empty slice of the first argument (assumed to be a sequence) otherwise. The second solution uses the first argument to initialize and then scans items 2 and beyond, much like one of the min function variants shown in Chapter 13.
The second solution is better. Both of these assume all arguments are the same type and neither works on dictionaries; as we saw in Part II, + doesn't work on mixed types or dictionaries. We could add a type test and special code to add dictionaries too, but that's extra credit.
```
def adder1(*args):
    print 'adder1',
    if type(args[0]) == type(0):    # Integer?
        sum = 0                     # Init to zero.
    else:                           # else sequence:
        sum = args[0][:0]           # Use empty slice of arg1.
    for arg in args:
        sum = sum + arg
    return sum
def adder2(*args):
    print 'adder2',
    sum = args[0]               # Init to arg1.
    for next in args[1:]:
        sum = sum + next        # Add items 2..N.
    return sum
for func in (adder1, adder2):
    print func(2, 3, 4)
    print func('spam', 'eggs', 'toast')
    print func(['a', 'b'], ['c', 'd'], ['e', 'f'])
% 
```
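For the extra-credit case, one possible approach is a type test that merges dictionaries instead of using +. This is only a sketch for Python 3; the name `adder3` is hypothetical, and letting later arguments win on duplicate keys is an assumption, since the text does not say how conflicts should be resolved.

```python
def adder3(*args):
    """Hypothetical variant of adder2 that also accepts dictionaries."""
    first = args[0]
    if isinstance(first, dict):
        total = {}
        for arg in args:
            total.update(arg)         # Assumption: later keys win on conflicts.
        return total
    total = first                     # Init to arg1, as in adder2.
    for arg in args[1:]:
        total = total + arg           # Add items 2..N.
    return total

print(adder3(2, 3, 4))
print(adder3('spam', 'eggs', 'toast'))
print(adder3({'a': 1}, {'b': 2}, {'a': 3}))
```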

Additional content appearing in this section has been removed.

See Section 18.8 for the exercises.

Basics, import. This one is simpler than you may think. When you're done, your file and interaction should look close to the following code (file mymod.py); remember that Python can read a whole file into a string or lines list, and the len built-in returns the length of strings and lists:
```
def countLines(name):
    file = open(name, 'r')
    return len(file.readlines(  ))
def countChars(name):
    return len(open(name, 'r').read(  ))
def test(name):                                  # Or pass file object
    return countLines(name), countChars(name)    # Or return a dictionary
% python
>>> import mymod
>>> mymod.test('mymod.py')
(10, 291)
```
On Unix, you can verify your output with a wc command; on Windows, right-click on your file to view its properties. But note that your script may report fewer characters than Windows does: for portability, Python converts Windows \r\n line-end markers to \n, thereby dropping one byte (character) per line. To match byte counts with Windows exactly, you have to open in binary mode (rb) or add back the number of lines.
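The one-byte-per-line difference can be reproduced on any platform by writing Windows-style line ends explicitly. This sketch assumes Python 3, where text-mode reads translate \r\n to \n by default; the temporary file and its contents are made up for the demonstration.

```python
import os
import tempfile

# Write a small file with Windows-style \r\n line ends, byte for byte.
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'wb') as f:
    f.write(b'spam\r\neggs\r\n')

with open(path, 'r') as f:            # Text mode: \r\n is read back as \n.
    text_chars = len(f.read())
with open(path, 'rb') as f:           # Binary mode: bytes exactly as stored.
    raw_bytes = len(f.read())
with open(path, 'r') as f:
    line_count = len(f.readlines())
os.remove(path)

# The binary count equals the text count plus one byte per line.
print(text_chars, raw_bytes, line_count)
```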
Incidentally, to do the "ambitious" part (passing in a file object, so you only open the file once), you'll probably need to use the seek method of the built-in file object. We didn't cover it in the text, but it works just like C's fseek call (and calls it behind the scenes): seek resets the current position in the file to an offset passed in. After a seek, future input/output operations are relative to the new position. To rewind to the start of a file without closing and reopening, call file.seek(0); the file read methods all pick up at the current position in the file, so you need to rewind to reread. Here's what this tweak would look like:
```
def countLines(file):
    file.seek(0)                      # Rewind to start of file.
    return len(file.readlines(  ))
def countChars(file): 
    file.seek(0)                      # Ditto (rewind if needed)
    return len(file.read(  ))
def test(name):
    file = open(name, 'r')                       # Pass file object.
    return countLines(file), countChars(file)    # Open file only once.
>>> 
```
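To see why the rewind matters, the following sketch reads the same file object twice with and without `seek(0)`. It assumes Python 3 and uses an in-memory `io.StringIO` object as a stand-in for a real open file, since both keep a current position in the same way.

```python
import io

f = io.StringIO('line one\nline two\n')   # Stand-in for an open text file.

first = f.read()          # Reads everything; position is now at end of file.
second = f.read()         # Nothing left to read from the current position.
f.seek(0)                 # Rewind to the start.
third = f.read()          # Reads everything again.

print(repr(second))
print(third == first)
```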

Additional content appearing in this section has been removed.

See Section 23.5 for the exercises.

Inheritance. Here's the solution code for this exercise (file adder.py), along with some interactive tests. The __add__ overload has to appear only once, in the superclass, since it invokes type-specific add methods in subclasses.

class Adder:
    def add(self, x, y):
        print 'not implemented!'
    def __init__(self, start=[  ]):
        self.data = start
    def __add__(self, other):                # Or in subclasses?
        return self.add(self.data, other)    # Or return type?
class ListAdder(Adder):
    def add(self, x, y):
        return x + y
class DictAdder(Adder):
    def add(self, x, y):
        new = {  }
        for k in x.keys(  ): new[k] = x[k]
        for k in y.keys(  ): new[k] = y[k]
        return new
% python
>>> from adder import *
>>> x = Adder(  )
>>> x.add(1, 2)
not implemented!
>>> x = ListAdder(  )
>>> x.add([1], [2])
[1, 2]
>>> x = DictAdder(  )
>>> x.add({1:1}, {2:2})
{1: 1, 2: 2}
>>> x = Adder([1])
>>> x + [2]
not implemented!
>>>
>>> x = ListAdder([1])
>>> x + [2]
[1, 2]
>>> [2] + x
Traceback (innermost last):
  File "<stdin>", line 1, in ?
TypeError: __add__ nor __radd__ defined for these operands

Notice in the last test that you get an error for expressions where a class instance appears on the right of a +; if you want to fix this, use __radd__ methods as described in Section 21.4 in Chapter 21.
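A minimal sketch of the `__radd__` fix follows, written for Python 3. The class here is a simplified stand-in for ListAdder with no Adder superclass, and the `start=None` default is a deliberate departure from the listing above that avoids sharing one list between instances.

```python
class ListAdder:
    """Simplified stand-in for the ListAdder above."""
    def __init__(self, start=None):
        self.data = [] if start is None else start
    def __add__(self, other):             # Handles: instance + list
        return self.data + other
    def __radd__(self, other):            # Handles: list + instance
        return other + self.data

x = ListAdder([1])
print(x + [2])
print([2] + x)
```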

If you are saving a value in the instance anyhow, you might as well rewrite the add method to take just one argument, in the spirit of other examples in Part VI:

class Adder:
    def __init__(self, start=[  ]):
        self.data = start
    def __add__(self, other):        # Pass a single argument.
        return self.add(other)           # The left side is in self.
    def add(self, y):
        print 'not implemented!'
class ListAdder(Adder):
    def add(self, y):
        return self.data + y
class DictAdder(Adder):
    def add(self, y):
        pass  # Change me to use self.data instead of x.
x = ListAdder([1,2,3])
y = x + [4,5,6]
print y               # Prints [1, 2, 3, 4, 5, 6]
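One way the DictAdder placeholder might be filled in is sketched below for Python 3. This is a guess at a reasonable answer rather than the book's own solution; raising NotImplementedError and the `start=None` default are both changes from the listing above, made so the sketch is self-contained.

```python
class Adder:
    def __init__(self, start=None):
        self.data = {} if start is None else start
    def __add__(self, other):             # The left side is in self.
        return self.add(other)
    def add(self, y):
        raise NotImplementedError('subclass must define add')

class DictAdder(Adder):
    def add(self, y):
        new = dict(self.data)             # Copy, so self.data is not changed.
        new.update(y)                     # Keys in y override keys in self.data.
        return new

x = DictAdder({1: 1})
merged = x + {2: 2}
print(merged)
```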

Additional content appearing in this section has been removed.

See Section 26.6 for the exercises.

try/except. Our version of the oops function (file oops.py) follows. As for the noncoding questions, changing oops to raise KeyError instead of IndexError means that the exception won't be caught by the try handler (it "percolates" to the top level and triggers Python's default error message). The names KeyError and IndexError come from the outermost built-in names scope. Import __builtin__ and pass it as an argument to the dir function to see for yourself.
```
def oops(  ):
    raise IndexError
def doomed(  ):
    try:
        oops(  )
    except IndexError:
        print 'caught an index error!'
    else:
        print 'no error caught...'
if __name__ == '__main__': doomed(  )
% python oops.py
caught an index error!
```
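The KeyError variation can be tried as follows. This sketch assumes Python 3, where the built-in names module is called `builtins` rather than `__builtin__`; the outer try statement and the `outcome` variable are additions that keep the script from ending with a traceback.

```python
import builtins

def oops():
    raise KeyError('no such key')         # Changed from IndexError.

def doomed():
    try:
        oops()
    except IndexError:                    # Does not match KeyError.
        return 'caught an index error!'
    else:
        return 'no error caught...'

try:
    doomed()
    outcome = 'handled inside doomed'
except KeyError:
    outcome = 'KeyError percolated out of doomed'

print(outcome)
print('KeyError' in dir(builtins), 'IndexError' in dir(builtins))
```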

Exception objects and lists. Here's the way we extended this module for an exception of our own (here a string, at first):

MyError = 'hello'
def oops(  ):
    raise MyError, 'world'
def doomed(  ):
    try:
        oops(  )
    except IndexError:
        print 'caught an index error!'
    except MyError, data:
        print 'caught error:', MyError, data
    else:
        print 'no error caught...'
if __name__ == '__main__':
    doomed(  )
% python oops.py
caught error: hello world

To identify the exception with a class, we just changed the first part of the file to this, and saved it as oop_oops.py:

class MyError: pass
def oops(  ):
    raise MyError(  )
...rest unchanged...

Like all class exceptions, the instance comes back as the extra data; our error message now shows both the class and its instance (<...>).

% python oop_oops.py
caught error: __main__.MyError <__main__.MyError instance at 0x00867550>

Remember, to make this look nicer, you can define a __repr__ or __str__ method in your class to return a custom print string. See Chapter 21 for details.
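As a sketch of that suggestion in Python 3: the class below defines `__str__` so the instance prints readably. Two details are assumptions beyond the text: Python 3 requires exception classes to derive from BaseException, so MyError subclasses Exception here, and the `detail` attribute is a made-up name.

```python
class MyError(Exception):
    def __init__(self, detail):
        Exception.__init__(self, detail)
        self.detail = detail              # Hypothetical attribute name.
    def __str__(self):                    # Custom print string for the instance.
        return 'MyError: %s' % self.detail

try:
    raise MyError('world')
except MyError as data:
    shown = str(data)

print('caught error:', shown)
```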

Error handling. Here's one way to solve this one (file safe2.py). We did our tests in a file, rather than interactively, but the results are about the same.

import sys, traceback
def safe(entry, *args):
    try:
        apply(entry, args)                 # catch everything else
    except:
        traceback.print_exc(  )
        print 'Got', sys.exc_type, sys.exc_value
import oops
safe(oops.oops)
%
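The same idea can be sketched for Python 3, where `apply`, `sys.exc_type`, and `sys.exc_value` no longer exist. Here the call is written as `entry(*args)` and the exception details come from `sys.exc_info()`. The local `oops` function stands in for the oops module, and returning the exception details is an addition made only so the result can be inspected.

```python
import sys
import traceback

def oops():                               # Stand-in for oops.oops.
    raise IndexError('demo failure')

def safe(entry, *args):
    try:
        entry(*args)                      # Replaces apply(entry, args).
    except Exception:
        traceback.print_exc()             # Stack trace goes to stderr.
        exc_type, exc_value = sys.exc_info()[:2]
        print('Got', exc_type, exc_value)
        return exc_type, exc_value        # Returned for inspection only.
    return None

result = safe(oops)
```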

Additional content appearing in this section has been removed.

See Section 27.9 for the exercises.

Avoiding regular expressions. This program is long and tedious, but not especially complicated. See if you can understand how it works. Whether this is easier for you than regular expressions depends on many factors, such as your familiarity with regular expressions and your comfort with the functions in the string module. Use whichever type of programming works for you.

file = open('pepper.txt')
text = file.read(  )
paragraphs = text.split('\n\n')
def find_indices_for(big, small):
    indices = [  ]
    cum = 0
    while 1:
        index = big.find(small)
        if index == -1:
            return indices
        indices.append(index+cum)
        big = big[index+len(small):]
        cum = cum + index + len(small)
def fix_paragraphs_with_word(paragraphs, word):
    lenword = len(word)
    for par_no in range(len(paragraphs)):
        p = paragraphs[par_no]
        wordpositions = find_indices_for(p, word)
        if wordpositions == [  ]: return
        for start in wordpositions:
            # Look for 'pepper' ahead.
            indexpepper = p.find('pepper')
            if indexpepper == -1: return -1
            if p[start:indexpepper].strip(  ):
                # Something other than whitespace in between!
                continue
            where = indexpepper+len('pepper')
            if p[where:where+len('corn')] == 'corn':
                # It's immediately followed by 'corn'!
                continue
            if p.find('salad') < where:
                # It's not followed by 'salad'.
                continue
            # Finally! We get to do a change!
            p = p[:start] + 'bell' + p[start+lenword:]
            paragraphs[par_no] = p         # Change mutable argument!
fix_paragraphs_with_word(paragraphs, 'red')
fix_paragraphs_with_word(paragraphs, 'green')
for paragraph in paragraphs:
    print paragraph+'\n'

We won't repeat the output here; it's the same as that of the regular expression solution.

Wrapping a text file with a class. This one is surprisingly easy, if you understand classes and the

Additional content appearing in this section has been removed.