Read Programming Python Online

Authors: Mark Lutz

Tags: #COMPUTERS / Programming Languages / Python

Programming Python (162 page)

BOOK: Programming Python

[65] You may notice another difference in the response pages
produced by the form and an explicitly typed URL: for the form, the
value of the “filename” parameter at the end of the URL in the
response may contain URL escape codes for some characters in the
file path you typed. Browsers automatically translate some non-ASCII
characters into URL escapes (just like urllib.parse.quote). URL escapes were
discussed earlier in this chapter; we’ll see an example of this
automatic browser escaping at work in an upcoming screenshot.
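
As a rough illustration of the escaping this note mentions, the following sketch runs a file path through urllib.parse.quote and back again. The path is a made-up example, not one taken from the book, and a browser's own escaping rules may differ in detail from this function's.

```python
from urllib.parse import quote, unquote

# Hypothetical file path typed into a form field (an assumption for this demo).
filename = r"C:\temp\my file.txt"

# quote() replaces characters that are unsafe in URLs with %XX escapes.
escaped = quote(filename)

# unquote() reverses the escaping, as a server-side script would.
restored = unquote(escaped)

print(escaped)
print(restored == filename)
```

Characters such as the colon, backslash, and space come back as %XX codes, which is why the "filename" parameter in a response URL may look different from what was typed.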

Chapter 16. The PyMailCGI Server
“Things to Do When Visiting Chicago”

This chapter is the fifth in our survey of Python Internet
programming, and it continues Chapter 15’s discussion. There, we
explored the fundamentals of server-side Common
Gateway Interface (CGI) scripting in Python. Armed with that knowledge,
this chapter moves on to a larger case study that underscores advanced CGI
and server-side web scripting topics.

This chapter presents PyMailCGI, a “webmail” website for reading and sending email
that illustrates security concepts, hidden form fields, URL generation,
and more. Because this system is similar in spirit to the PyMailGUI
program we studied in Chapter 14, this
example also serves as a comparison of the web and nonweb application
models. This case study is founded on basic CGI scripting, but it
implements a complete website that does something more useful than
Chapter 15’s examples.

As usual in this book, this chapter splits its focus between
application-level details and Python programming concepts. For instance,
because this is a fairly large case study, it illustrates system design
concepts that are important in actual projects. It also says more about
CGI scripts in general: PyMailCGI expands on the notions of state
retention, security concerns, and encryption.

The system presented here is neither particularly flashy nor feature
rich as websites go (in fact, the initial cut of PyMailCGI was thrown
together during a layover at Chicago’s O’Hare airport). Alas, you will
find neither dancing bears nor blinking lights at this site. On the other
hand, it was written to serve a real purpose, speaks more to us about CGI
scripting, and hints at just how far Python server-side programs can take
us. As outlined at the start of this part of the book, there are
higher-level frameworks, systems, and tools that build upon ideas we will
apply here. For now, let’s have some fun with Python on the Web.

The PyMailCGI Website

In Chapter 14, we built a program called PyMailGUI that implements a
complete Python+tkinter email client GUI (if
you didn’t read that chapter, you may want to take a quick glance at it
now). Here, we’re going to do something of the same, but on the Web: the
system presented in this section, PyMailCGI, is a collection of CGI
scripts that implement a simple web-based interface for sending and
reading email in any browser. In effect, it is a webmail system:
though not as powerful as what may be
available from your Internet Service Provider (ISP), its scriptability
gives you control over its operation and future evolution.

Our goal in studying this system is partly to learn a few more CGI
tricks, partly to learn a bit about designing larger Python systems in
general, and partly to underscore the trade-offs between systems
implemented for the Web (the PyMailCGI server) and systems written to run
locally (the PyMailGUI client). This chapter hints at some of these
trade-offs along the way and returns to explore them in more depth after
the presentation of this system.

Implementation Overview

At the top level, PyMailCGI allows users to view incoming email with the
Post Office Protocol (POP) interface and to send new mail by Simple Mail
Transfer Protocol (SMTP). Users also have the option of replying to,
forwarding, or deleting an incoming email while viewing it. As
implemented, anyone can send email from a PyMailCGI site, but to view
your email, you generally have to install PyMailCGI on your own computer
or web server account, with your own mail server information (due to
security concerns described later).
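
To make the two protocols concrete, here is a minimal sketch that uses the standard library's poplib and smtplib directly. It is not PyMailCGI's code (the system itself goes through the mailtools package), and the function names, server names, and the limit parameter are assumptions made for illustration only.

```python
import poplib
import smtplib
from email.message import EmailMessage
from email.parser import BytesParser


def fetch_subjects(pop_host, user, password, limit=10):
    """Return the Subject headers of up to `limit` inbox messages via POP."""
    server = poplib.POP3(pop_host)
    try:
        server.user(user)
        server.pass_(password)
        count, _size = server.stat()
        subjects = []
        for number in range(1, min(count, limit) + 1):
            # TOP with 0 fetches headers only; lines arrive as bytes.
            _resp, lines, _octets = server.top(number, 0)
            headers = BytesParser().parsebytes(b"\r\n".join(lines),
                                               headersonly=True)
            subjects.append(headers.get("Subject", "(no subject)"))
        return subjects
    finally:
        server.quit()


def build_message(sender, recipient, subject, text):
    """Compose a simple text-only mail message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(text)
    return msg


def send_message(smtp_host, msg):
    """Deliver an already composed message through an SMTP server."""
    with smtplib.SMTP(smtp_host) as server:
        server.send_message(msg)
```

A call such as send_message("smtp.example.com", build_message(...)) would need a real SMTP server name in place of the placeholder; nothing here contacts a server until one of the network functions is called.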

Viewing and sending email sounds simple enough, and we’ve already
coded this a few times in this book. But the required interaction
involves a number of distinct web pages, each requiring a CGI script or
HTML file of its own. In fact, PyMailCGI is a fairly linear system:
in the most complex user interaction
scenario, there are six states (and hence six web pages) from start to
finish. Because each page is usually generated by a distinct file in the
CGI world, that also implies six source files.

Technically, PyMailCGI could also be described as a state machine,
though very little information is carried from state to state.
Scripts pass user and message
information to the next script in hidden form fields and query
parameters, but there are no client-side cookies or server-side
databases in the current version. Still, along the way we’ll encounter
situations where more advanced state retention tools could be an
advantage.
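
The two state-passing techniques named above can be sketched in a few lines. This is an illustrative assumption rather than PyMailCGI's own code: the helper names and the "user" and "mnum" field names are hypothetical.

```python
from html import escape
from urllib.parse import urlencode, urlsplit, parse_qs


def hidden_fields(state):
    """Render each state item as a hidden input for the next page's form."""
    return "\n".join(
        '<input type="hidden" name="%s" value="%s">'
        % (escape(name, quote=True), escape(str(value), quote=True))
        for name, value in state.items()
    )


def link_with_state(script, state):
    """Append the state to a script URL as escaped query parameters."""
    return script + "?" + urlencode(state)


# Hypothetical state: the user name and the number of the selected message.
state = {"user": "bob", "mnum": 3}
print(hidden_fields(state))
print(link_with_state("onViewListLink.py", state))
```

Either way, the next script recovers the values from its form inputs, so nothing has to be stored on the server between pages.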

To help keep track of how all of PyMailCGI’s source files fit into
the overall system, I jotted down the file in Example 16-1 before starting any real
programming. It informally sketches the user’s flow through the system
and the files invoked along the way. You can certainly use more formal
notations to describe the flow of control and information through states
such as web pages (e.g., dataflow diagrams), but for this simple
example, this file gets the job done.

Example 16-1. PP4E\Internet\Web\PyMailCgi\pageflow.txt

file or script                           creates
--------------                           -------
[pymailcgi.html]                         Root window
 => [onRootViewLink.py]                  Pop password window
     => [onViewPswdSubmit.py]            List window (loads all pop mail)
         => [onViewListLink.py]          View Window + pick=del|reply|fwd (fetch)
             => [onViewPageAction.py]    Edit window, or delete+confirm (del)
                 => [onEditPageSend.py]  Confirmation (sends smtp mail)
                     => back to root
 => [onRootSendLink.py]                  Edit Window
     => [onEditPageSend.py]              Confirmation (sends smtp mail)
         => back to root

This file simply lists all the source files in the system, using
=> and indentation to denote the scripts they trigger.

For instance, links on the pymailcgi.html root page invoke
onRootViewLink.py and onRootSendLink.py, both executable scripts. The
script onRootViewLink.py generates a password page, whose Submit button
in turn triggers onViewPswdSubmit.py, and so on. Notice that both the
view and the send actions can wind up triggering onEditPageSend.py to
send a new mail; view operations get there after the user chooses to
reply to or forward an incoming mail.
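
One informal way to check the page flow just described is to write it down as a small graph and walk it. The dictionary below is only a transcription of the page-flow file for illustration; it is not part of the system, and the function name is an assumption.

```python
# Each file maps to the scripts its page can trigger next.
PAGE_FLOW = {
    "pymailcgi.html": ["onRootViewLink.py", "onRootSendLink.py"],
    "onRootViewLink.py": ["onViewPswdSubmit.py"],
    "onViewPswdSubmit.py": ["onViewListLink.py"],
    "onViewListLink.py": ["onViewPageAction.py"],
    "onViewPageAction.py": ["onEditPageSend.py"],
    "onRootSendLink.py": ["onEditPageSend.py"],
    "onEditPageSend.py": [],  # "back to root" ends the interaction
}


def longest_path(page, flow=PAGE_FLOW):
    """Return the longest chain of files reachable from the given page."""
    best = []
    for next_page in flow[page]:
        candidate = longest_path(next_page, flow)
        if len(candidate) > len(best):
            best = candidate
    return [page] + best


for step in longest_path("pymailcgi.html"):
    print(step)
```

The longest chain runs from the root page through the view scripts to onEditPageSend.py, which matches the six states mentioned earlier for the most complex interaction.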

In a system such as this, CGI scripts make little sense in
isolation, so it’s a good idea to keep the overall page flow in mind;
refer to this file if you get lost. For additional context,
Figure 16-1 shows the overall contents of this site,
viewed as directory listings under Cygwin on Windows in a shell
window.

Figure 16-1. PyMailCGI contents

When you install this site, all the files you see here are uploaded
to a PyMailCgi subdirectory of your web directory on your server’s
machine. Besides the page-flow HTML and CGI
script files invoked by user interaction, PyMailCGI uses a handful of
utility modules:

commonhtml.py
    Provides a library of HTML tools

externs.py
    Isolates access to modules imported from other places

loadmail.py
    Encapsulates mailbox fetches for future expansion

secret.py
    Implements configurable password encryption

PyMailCGI also reuses parts of the mailtools package and the
mailconfig.py module we wrote in Chapter 13. The former is accessible
to imports from the PP4E package root, and the latter is largely
duplicated by a local version in the PyMailCgi directory so that it can
differ between PyMailGUI and PyMailCGI. The externs.py module is
intended to hide these modules’ actual locations, in case the install
structure varies on some machines.
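
The general idea of isolating import locations in one place can be sketched as follows. This is not the book's externs.py; the helper name is hypothetical, and a real install would supply its own directory paths.

```python
import importlib
import sys


def import_from(directory, module_name):
    """Import module_name after putting directory first on the search path.

    Client scripts call this helper (or import names from the module that
    holds it), so a changed install layout means editing one file only.
    """
    if directory not in sys.path:
        sys.path.insert(0, directory)
    importlib.invalidate_caches()  # notice files added since startup
    return importlib.import_module(module_name)
```

A script might then write something like mailconfig = import_from(some_directory, "mailconfig"), where some_directory is whatever location the install actually uses.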

In fact, this system again demonstrates the powers of code reuse in a
practical way. In this edition, it gets a great deal of logic for free
from the new mailtools package of Chapter 13 (message loading, sending,
deleting, parsing, composing, decoding and encoding, and attachments),
even though that package’s modules were originally developed for the
PyMailGUI program. When it came time to update PyMailCGI later, tools
for handling complex things such as attachments and message text
searches were already in place. See Chapter 13 for mailtools source
code.

As usual, PyMailCGI also uses a variety of standard library modules:
smtplib, poplib, email.*, cgi, urllib.*, and the like. Thanks to the
reuse of both custom and standard library code, this system achieves
much in a minimal amount of code. All told, PyMailCGI consists of just
846 lines of new code, including whitespace, comments, and the
top-level HTML file (see file linecounts.xls in this system’s source
directory for details; the prior edition’s version claimed to be some
835 new lines).

This compares favorably to the size of the PyMailGUI client-side
“desktop” program in Chapter 14, but most
of this difference owes to the vastly more limited functionality in
PyMailCGI—there are no local save files, no transfer thread overlap, no
message caching, no inbox synchronization tests or recovery, no
multiple-message selections, no raw mail text views, and so on.
Moreover, as the next section describes, PyMailCGI’s Unicode
policies are substantially more limited in this release,
and although arbitrary attachments can be viewed, sending binary and
some text attachments is not supported in the current version because of
a Python 3.1 issue.

In other words, PyMailCGI is really something of a prototype,
designed to illustrate web scripting and
system design concepts in this book, and serve as a springboard for
future work. As is, it’s nowhere near as far along the software
evolutionary scale as PyMailGUI. Still, we’ll see that PyMailCGI’s code
factoring and reuse of existing modules allow it to implement much in a
surprisingly small amount of
code.

New in This Fourth Edition (Version 3.0)

In this fourth edition, PyMailCGI has been ported to run under Python
3.X. In addition, this version inherits and employs a variety of new
features from the mailtools package, including
mail header decoding and encoding, main mail text encoding, the ability
to limit mail headers fetched, and more. Notably, there is new support
for Unicode and Internationalized character sets as follows:

  • For display, both a mail’s main text and its headers are
    decoded prior to viewing, per email, MIME, and Unicode standards;
    text is decoded per mail headers and headers are decoded per their
    content.

  • For sends, a mail’s main text, text attachments, and headers
    are all encoded per the same standards, using UTF-8 as the default
    encoding if required.

  • For replies and forwards, headers copied into the quoted
    message text are also decoded for display.
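
The header encoding and decoding steps listed above can be approximated with the standard library's email.header module. This is a simplified sketch with hypothetical helper names, not the mailtools code, which handles more cases (address headers, for example).

```python
from email.header import Header, decode_header, make_header


def encode_header_text(text, charset="utf-8"):
    """Return header text in a form that is safe to send.

    Pure ASCII is passed through; anything else is MIME-encoded,
    with UTF-8 as the default encoding.
    """
    try:
        text.encode("ascii")
        return text
    except UnicodeEncodeError:
        return Header(text, charset).encode()


def decode_header_text(raw):
    """Decode a possibly MIME-encoded header into plain Unicode text."""
    return str(make_header(decode_header(raw)))


raw = encode_header_text("Привет")   # a non-ASCII Subject, for example
print(raw)
print(decode_header_text(raw))
```

Decoding before display and encoding before sending keeps the scripts themselves working with ordinary Unicode strings.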

Note that this version relies upon web browsers’ ability to
display arbitrary kinds of Unicode text. It does not emit any sort of
“meta” tag to declare encodings in the HTML reply pages generated for
mail view and composition. For instance, a properly formed HTML document
can often declare its encoding with a “meta” tag in its head section
that names the page’s character set.

Such headers are omitted here. This is in part due to the fact
that the mail might have arbitrary and even mixed types of text among its
message and headers, which might also clash with encoding in the HTML of
the reply itself. Consider a mail index list page that displays headers
of multiple mails; because each mail’s Subject and From might be
encoded in a different character set (one Russian, one Chinese, and so
on), a single encoding declaration won’t suffice (though UTF-8’s
generality can often come to the rescue). Resolving such mixed character
set cases is left to the browser, which may ultimately require
assistance from the user in the form of encoding choices. Such displays
work in PyMailGUI because we pass decoded Unicode text to the tkinter
Text widget, which handles arbitrary Unicode code points well. In
PyMailCGI, we’re largely finessing this issue to keep this example
short.

Moreover, both text and binary attachments of fetched mails are
simply saved in binary form and opened by filename in browsers when
their links are clicked, relying again on browsers to do the right
thing. Text attachments for sends are also subject to the CGI upload
limitations described in the note just ahead. Beyond all this, Python
3.1 appears to have an issue printing some types of Unicode text to the
standard output stream in the CGI context, which necessitates a
workaround in the main utilities module here that opens stdout
in binary mode and writes text as
encoded bytes (see the code for more details).
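
The following sketch shows the general shape of such a workaround: text is encoded by hand and written to a binary stream. It is an assumption-laden illustration, not the book's code. In particular, the function name is hypothetical, and it uses sys.stdout.buffer as the binary stream rather than reopening stdout.

```python
import io
import sys


def write_reply(text, stream=None, encoding="utf-8"):
    """Write reply text to a binary stream as explicitly encoded bytes."""
    if stream is None:
        stream = sys.stdout.buffer   # binary layer beneath sys.stdout
    stream.write(text.encode(encoding))
    stream.flush()


# Demo on an in-memory stream instead of the real standard output.
demo = io.BytesIO()
write_reply("Content-type: text/html\n\n<p>Привет</p>\n", demo)
print(len(demo.getvalue()), "bytes written")
```

Because the bytes are produced by an explicit encode call, the result does not depend on whatever default encoding the web server's environment gives the standard output stream.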

This Unicode/i18n
support is substantially less rich than that in PyMailGUI.
However, given that we can’t prompt for encodings here, and given that
this book is running short on time and space in general, improving this
for cases and browsers where it might matter is left as a suggested
exercise.

For more on specific 3.0 fourth-edition changes made, see the
comments marked with “3.0” in the program code files listed ahead. In
addition, all the features added for the prior edition are still here,
as described in the next
section.

Limitation on Sending Attachments in This Edition

If you haven’t already done so, see CGI File Upload Limitations in 3.1.
In brief, in Python 3.1 both the cgi module and the email package’s
parser that it uses fail
with exceptions when requests submitted by web browsers include raw
binary data or incompatibly encoded text added for uploaded files.
Unfortunately, because this chapter’s PyMailCGI system relies on CGI
uploads for attachments, this limitation means that this system does
not currently support sending emails with binary email attachments
such as images and audio files. It did support this in the prior
edition under Python 2.X.

Such sent attachments still work in Chapter 14’s PyMailGUI desktop application,
simply because attachment file data can be read directly from local
files (using binary mode if required, and MIME encoding if needed for
inclusion in email). Because the PyMailCGI webmail system here relies
on CGI uploads to transfer attachments to the server as an extra first
step, though, it’s fully at the mercy of the currently broken cgi
module’s upload support. Coding a cgi replacement is far too
ambitious a goal for this book.
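
For comparison, reading an attachment straight from a local file, as a desktop client can, takes only a few lines. This sketch uses the EmailMessage API found in later Python 3 releases than the 3.1 version this edition targets; it is not the mailtools code, and the helper name is hypothetical.

```python
import mimetypes
from email.message import EmailMessage
from pathlib import Path


def attach_file(msg, path):
    """Attach a local file to msg, reading its data in binary mode."""
    path = Path(path)
    ctype, encoding = mimetypes.guess_type(path.name)
    if ctype is None or encoding is not None:
        ctype = "application/octet-stream"   # unknown or compressed: generic
    maintype, subtype = ctype.split("/", 1)
    data = path.read_bytes()                 # no CGI upload step involved
    # add_attachment applies the MIME transfer encoding (base64 for bytes).
    msg.add_attachment(data, maintype=maintype, subtype=subtype,
                       filename=path.name)
    return msg
```

The file's bytes never pass through a form upload here, which is why the limitation described in this note does not affect local clients.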

A fix is expected for this in the future, and may be present by
the time you read these words. Being based on Python 3.1, though, this
edition’s PyMailCGI simply cannot support sending such attachments,
though they can still be freely viewed in mails fetched. In fact,
although this edition’s PyMailCGI inherits some new features from
mailtools such as i18n header
decoding and encoding, this attachment send limitation is severe
enough to preclude expanding this system’s feature set to the same
degree as this edition’s PyMailGUI. For example, Unicode policies are
simple here, if not naive.

It’s possible that some client-side scripting techniques such as
AJAX may be able to transfer attachment files independently, and thus
avoid CGI uploads altogether. However, such techniques would also
require deploying frameworks and technologies outside this book’s
scope, would imply a
radically different and more complex program structure, and should
probably not be necessitated by a regression in Python 3.X in any
event. A rewrite (PyMailRIA?) will have to await a final verdict on
Python 3.X CGI support fixes.
