Read Programming Python Online

Authors: Mark Lutz

Tags: #COMPUTERS / Programming Languages / Python

Programming Python (118 page)

BOOK: Programming Python
Downloading Remote Trees

It is possible to extend this remote tree-cleaner to also download a remote
tree with subdirectories: rather than deleting, as you walk the remote
tree simply create a local directory to match each remote one, and download
nondirectory files. We’ll leave this final step as a suggested exercise,
though, partly because its dependence on the format produced by server
directory listings makes it difficult to make robust, and partly because
this use case is less common for me: in practice, I am more likely to
maintain a site on my PC and upload to the server than to download a tree.

If you do wish to experiment with a recursive download, though, be
sure to consult the script Tools\Scripts\ftpmirror.py in Python’s install or
source tree for hints. That script attempts to download a remote
directory tree by FTP, and allows for various directory listing formats
which we’ll skip here in the interest of space. For our purposes, it’s
time to move on to the next protocol on our tour: Internet email.

Processing Internet Email

Some of the other most common, higher-level Internet protocols have to do
with reading and sending email messages: POP and IMAP for fetching email from
servers, SMTP for sending new messages, and other formalisms such as
RFC822 for specifying email message content and format. You don’t
normally need to know about such acronyms when using common email tools,
but internally, programs like Microsoft Outlook and webmail systems
generally talk to POP and SMTP servers to do your bidding.

Like FTP, email ultimately consists of formatted commands and byte
streams shipped over sockets and ports (port 110 for POP; 25 for SMTP).
Regardless of the nature of its content and attachments, an email message
is little more than a string of bytes sent and received through sockets.
But also like FTP, Python has standard library modules to simplify all
aspects of email processing:

  • poplib and imaplib for fetching email

  • smtplib for sending email

  • The email module package for parsing email and constructing email
These modules are related: for nontrivial messages, we typically use
email to parse mail text which has been fetched with poplib, and use
email to compose mail text to be sent with smtplib. The email package
also handles tasks such as address parsing, date and time formatting,
attachment formatting and extraction, and encoding and decoding of email
content (e.g., uuencode, Base64). Additional modules handle more specific
tasks (e.g., mimetypes to map filenames to and from content types).
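
As a small illustration of the helper tasks just listed, the following sketch calls a few standard library utilities directly. The filename and the address are made-up sample values, not data from this book’s accounts.

```python
import mimetypes
from email.utils import parseaddr, formataddr, formatdate

# map a filename to a content type (and a transfer encoding, if any)
content_type, encoding = mimetypes.guess_type('spam.txt')
print(content_type, encoding)                 # text/plain None

# split a display address into (real name, email address), then rebuild it
name, addr = parseaddr('Mark Lutz <lutz@example.com>')   # sample address
rebuilt = formataddr((name, addr))
print(name, '|', addr, '|', rebuilt)

# produce a date string suitable for a message's Date header
date_header = formatdate(localtime=True)
print(date_header)
```

We’ll rely on the email package itself for most of this work later; the point here is only that each task maps to an existing call.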

In the next few sections, we explore the POP and SMTP interfaces for
fetching and sending email from and to servers, and the email package
interfaces for parsing and composing email message text. Other email
interfaces in Python are analogous and are documented in the Python
library reference manual.[51]

Unicode in Python 3.X and Email Tools

In the prior sections of this chapter, we studied how Unicode encodings
can impact scripts using Python’s ftplib FTP tools in some depth, because
it illustrates the implications of Python 3.X’s Unicode string model for
real-world programming. In short:

  • All binary mode transfers should open local output and input
    files in binary mode (modes wb and rb).

  • Text-mode downloads should open local output files in text
    mode with explicit encoding names (mode w, with an encoding
    argument that defaults to latin1 within ftplib itself).

  • Text-mode uploads should open local input files in binary mode
    (mode rb).

The prior sections describe why these rules are in force. The last
two points here differ for scripts written originally for Python 2.X. As
you might expect, given that the underlying sockets transfer byte
strings today, the email story is somewhat convoluted for Unicode in
Python 3.X as well. As a brief preview:

Fetching

The poplib module returns fetched email text in bytes string form.
Command text sent to the server is encoded per UTF8 internally, but
replies are returned as raw binary bytes and not decoded into str text.

Sending

The smtplib module accepts email content to send as str strings.
Internally, message text passed in str form is encoded to binary bytes
for transmission using the ascii encoding scheme. Passing an already
encoded bytes string to the send call may allow more explicit control.

Composing

The email package produces Unicode str strings containing plain text when
generating full email text for sending with smtplib and accepts optional
encoding specifications for messages and their parts, which it applies
according to email standard rules. Message headers may also be
encoded per email, MIME, and Unicode conventions.

Parsing

The email package in 3.1 currently requires raw email byte strings of the
type fetched with poplib to be decoded into Unicode str strings as
appropriate before it can be passed in to be parsed into a message
object. This pre-parse decoding might be done by a default, user
preference, mail headers inspection, or intelligent guess. Because
this requirement raises difficult issues for package clients, it
may be dropped in a future version of email and Python.

Navigating

The email package returns most message components as str strings, though
parts content decoded by Base64 and other email encoding schemes may be
returned as bytes strings, parts fetched without such decoding may be
str or bytes, and some str string parts are internally encoded to bytes
with scheme raw-unicode-escape before processing. Message headers may be
decoded by the package on request as well.
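
The five steps above can be traced end to end without a server. The following is a minimal sketch, assuming a simple ASCII-only message: the raw lines are sample data shaped like a poplib fetch, the addresses are invented, and the utf8 decoding choice is just one possible default. Later Python releases also added email.message_from_bytes, which is not used here because this section describes the str-based parsing rule.

```python
import email
from email.message import Message

# Fetching: a list of bytes lines, as poplib would return (sample data only)
raw_lines = [b'From: lutz@example.com',
             b'To: reader@example.com',
             b'Subject: Hello',
             b'',
             b'Spam and eggs.']

# Parsing: decode bytes to str first, then hand the text to the email package
fetch_encoding = 'utf8'                  # assumed default; could be a user setting
text = b'\n'.join(raw_lines).decode(fetch_encoding)
parsed = email.message_from_string(text)

# Navigating: headers and simple text payloads come back as str
subject = parsed['Subject']
body = parsed.get_payload()

# Composing: build a message object and generate its full text as str
reply = Message()
reply['From'] = 'reader@example.com'
reply['To'] = parsed['From']
reply['Subject'] = 'Re: ' + subject
reply.set_payload('Thanks for the spam.')
reply_text = reply.as_string()

# Sending: str text is encoded as ASCII for transmission, so this call
# would raise UnicodeEncodeError if reply_text held non-ASCII characters
wire_bytes = reply_text.encode('ascii')
print(wire_bytes)
```

Messages with non-ASCII content or attachments need the email package’s encoding support before the final step; this sketch deliberately avoids that case.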

If you’re migrating email scripts (or your mindset) from 2.X,
you’ll need to treat email text fetched from a server as byte strings,
and decode it to str before passing it along for parsing; scripts that
send or compose email are generally unaffected (and this may be the
majority of Python email-aware scripts), though content may have to be
treated specially if it may be returned as byte strings.

This is the story in Python 3.1, which is of course prone to
change over time. We’ll see how these email constraints translate into
code as we move along in this section. Suffice it to say, the text on
the Internet is not as simple as it used to be, though it probably
shouldn’t have been anyhow.

[51] IMAP, or Internet Message Access Protocol, was designed as an
alternative to POP, but it is still not as widely available today, and
so it is not presented in this text. For instance, major commercial
providers used for this book’s examples provide only POP (or
web-based) access to email. See the Python library manual for IMAP
server interface details. Python used to have an rfc822 module as well,
but it’s been subsumed by the email package in 3.X.

POP: Fetching Email

I confess: up until just before 2000, I took a lowest-common-denominator
approach to email. I preferred to check my messages by Telnetting to my
ISP and using a simple command-line email interface. Of
course, that’s not ideal for mail with attachments, pictures, and the
like, but its portability was staggering: because Telnet runs on almost any
machine with a network link, I was able to check my mail quickly and
easily from anywhere on the planet. Given that I make my living traveling
around the world teaching Python classes, this wild accessibility was a
big win.

As with website maintenance, times have changed on this front.
Somewhere along the way, most ISPs began offering web-based email access
with similar portability and dropped Telnet altogether. When my ISP took
away Telnet access, however, they also took away one of my main email
access methods. Luckily, Python came to the rescue again—by writing email
access scripts in Python, I could still read and send email from any
machine in the world that has Python and an Internet connection. Python
can be as portable a solution as Telnet, but much more powerful.

Moreover, I can still use these scripts as an alternative to tools
suggested by the ISP. Besides my not being fond of delegating control to
commercial products of large companies, closed email tools impose choices
on users that are not always ideal and may sometimes fail altogether. In
many ways, the motivation for coding Python email scripts is the same as
it was for the larger GUIs in Chapter 11: the scriptability of Python
programs can be a decided advantage.

For example, Microsoft Outlook historically and by default has
preferred to download mail to your PC and delete it from the mail server
as soon as you access it. This keeps your email box small (and your ISP
happy), but it isn’t exactly friendly to people who travel and use
multiple machines along the way—once accessed, you cannot get to a prior
email from any machine except the one to which it was initially
downloaded. Worse, the web-based email interfaces offered by my ISPs have
at times gone offline completely, leaving me cut off from email (and
usually at the worst possible time).

The next two scripts represent one first-cut solution to such
portability and reliability constraints (we’ll see others in this and
later chapters). The first, popmail.py, is a simple mail reader tool,
which downloads and prints the contents of each email in an email
account. This script is admittedly primitive, but it lets you read your
email on any machine with Python and sockets; moreover, it leaves your
email intact on the server, and isn’t susceptible to webmail outages. The
second, smtpmail.py, is a one-shot script for writing and sending a new
email message that is as portable as Python itself.

Later in this chapter, we’ll implement an interactive console-based
email client (pymail), and later in this book we’ll code a full-blown GUI
email tool (PyMailGUI) and a web-based email program of our own
(PyMailCGI). For now, we’ll start with the basics.

Mail Configuration Module

Before we get to the scripts, let’s first take a look at a common module
they import and use. The module in Example 13-17 is used to configure
email parameters appropriately for a particular user. It’s simply a
collection of assignments to variables used by mail programs that appear
in this book; each major mail client has its own version, to allow
content to vary. Isolating these configuration settings in this single
module makes it easy to configure the book’s email programs for a
particular user, without having to edit actual program logic code.

If you want to use any of this book’s email programs to do mail
processing of your own, be sure to change its assignments to reflect
your servers, account usernames, and so on (as shown, they refer to
email accounts used for developing this book). Not all scripts use all
of these settings; we’ll revisit this module in later examples to
explain more of them.

Note that some ISPs may require that you be connected directly to
their systems in order to use their SMTP servers to send mail. For
example, when connected directly by dial-up in the past, I could use my
ISP’s server directly, but when connected via broadband, I had to route
requests through a cable Internet provider. You may need to adjust these
settings to match your configuration; see your ISP to obtain the
required POP and SMTP servers. Also, some SMTP servers check domain name
validity in addresses, and may require an authenticating login step—see
the SMTP section later in this chapter for interface details.

Example 13-17. PP4E\Internet\Email\mailconfig.py

"""
user configuration settings for various email programs (pymail/mailtools version);
email scripts get their server names and other email config options from this
module: change me to reflect your server names and mail preferences;
"""
#------------------------------------------------------------------------------
# (required for load, delete: all) POP3 email server machine, user
#------------------------------------------------------------------------------
popservername = 'pop.secureserver.net'
popusername = '[email protected]'
#------------------------------------------------------------------------------
# (required for send: all) SMTP email server machine name
# see Python smtpd module for a SMTP server class to run locally;
#------------------------------------------------------------------------------
smtpservername = 'smtpout.secureserver.net'
#------------------------------------------------------------------------------
# (optional: all) personal information used by clients to fill in mail if set;
# signature -- can be a triple-quoted block, ignored if empty string;
# address -- used for initial value of "From" field if not empty,
# no longer tries to guess From for replies: this had varying success;
#------------------------------------------------------------------------------
myaddress = '[email protected]'
mysignature = ('Thanks,\n'
'--Mark Lutz (http://learning-python.com, http://rmi.net/~lutz)')
#------------------------------------------------------------------------------
# (optional: mailtools) may be required for send; SMTP user/password if
# authenticated; set user to None or '' if no login/authentication is
# required; set pswd to name of a file holding your SMTP password, or
# an empty string to force programs to ask (in a console, or GUI);
#------------------------------------------------------------------------------
smtpuser = None # per your ISP
smtppasswdfile = '' # set to '' to be asked
#------------------------------------------------------------------------------
# (optional: mailtools) name of local one-line text file with your pop
# password; if empty or file cannot be read, pswd is requested when first
# connecting; pswd not encrypted: leave this empty on shared machines;
#------------------------------------------------------------------------------
poppasswdfile = r'c:\temp\pymailgui.txt' # set to '' to be asked
#------------------------------------------------------------------------------
# (required: mailtools) local file where sent messages are saved by some clients;
#------------------------------------------------------------------------------
sentmailfile = r'.\sentmail.txt' # . means in current working dir
#------------------------------------------------------------------------------
# (required: pymail, pymail2) local file where pymail saves pop mail on request;
#------------------------------------------------------------------------------
savemailfile = r'c:\temp\savemail.txt' # not used in PyMailGUI: dialog
#------------------------------------------------------------------------------
# (required: pymail, mailtools) fetchEncoding is the Unicode encoding used to
# decode fetched full message bytes, and to encode and decode message text if
# stored in text-mode save files; see Chapter 13 for details: this is a limited
# and temporary approach to Unicode encodings until a new bytes-friendly email
# package is developed; headersEncodeTo is for sent headers: see chapter13;
#------------------------------------------------------------------------------
fetchEncoding = 'utf8' # 4E: how to decode and store message text (or latin1?)
headersEncodeTo = None # 4E: how to encode non-ASCII headers sent (None=utf8)
#------------------------------------------------------------------------------
# (optional: mailtools) the maximum number of mail headers or messages to
# download on each load request; given this setting N, mailtools fetches at
# most N of the most recently arrived mails; older mails outside this set are
# not fetched from the server, but are returned as empty/dummy emails; if this
# is assigned to None (or 0), loads will have no such limit; use this if you
# have very many mails in your inbox, and your Internet or mail server speed
# makes full loads too slow to be practical; some clients also load only
# newly-arrived emails, but this setting is independent of that feature;
#------------------------------------------------------------------------------
fetchlimit = 25 # 4E: maximum number headers/emails to fetch on loads
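
The comments for poppasswdfile describe a fallback: read a one-line password file if one is named and readable, otherwise ask the user. A minimal sketch of that logic follows; load_password is a hypothetical helper written for this illustration, not a function from the book’s mailtools package.

```python
import getpass

def load_password(passwd_file, prompt='Password? ', ask=getpass.getpass):
    """
    hypothetical helper: return the first line of passwd_file, or ask the
    user if the filename is empty or the file cannot be read
    """
    if passwd_file:
        try:
            with open(passwd_file) as f:
                return f.readline().rstrip('\n')
        except OSError:
            pass                       # fall through to the interactive prompt
    return ask(prompt)
```

A client could call this as load_password(mailconfig.poppasswdfile); the ask parameter exists only so that a GUI or a test can substitute its own input routine.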

POP Mail Reader Script

On to reading email in Python: the script in Example 13-18 employs
Python’s standard poplib module, an implementation of the client-side
interface to POP (the Post Office Protocol). POP is a well-defined and
widely available way to fetch email from servers over sockets. This
script connects to a POP server to implement a simple yet portable email
download and display tool.

Example 13-18. PP4E\Internet\Email\popmail.py

#!/usr/local/bin/python
"""
##############################################################################
use the Python POP3 mail interface module to view your POP email account
messages; this is just a simple listing--see pymail.py for a client with
more user interaction features, and smtpmail.py for a script which sends
mail; POP is used to retrieve mail, and runs on a socket using port number
110 on the server machine, but Python's poplib hides all protocol details;
to send mail, use the smtplib module (or os.popen('mail...')). see also:
imaplib module for IMAP alternative, PyMailGUI/PyMailCGI for more features;
##############################################################################
"""
import poplib, getpass, sys, mailconfig

mailserver = mailconfig.popservername      # ex: 'pop.rmi.net'
mailuser   = mailconfig.popusername        # ex: 'lutz'
mailpasswd = getpass.getpass('Password for %s?' % mailserver)

print('Connecting...')
server = poplib.POP3(mailserver)
server.user(mailuser)                      # connect, log in to mail server
server.pass_(mailpasswd)                   # pass is a reserved word

try:
    print(server.getwelcome())             # print returned greeting message
    msgCount, msgBytes = server.stat()
    print('There are', msgCount, 'mail messages in', msgBytes, 'bytes')
    print(server.list())
    print('-' * 80)
    input('[Press Enter key]')

    for i in range(msgCount):
        hdr, message, octets = server.retr(i+1)    # octets is byte count
        for line in message: print(line.decode())  # retrieve, print all mail
        print('-' * 80)                            # mail text is bytes in 3.x
        if i < msgCount - 1:
            input('[Press Enter key]')             # mail box locked till quit
finally:                                           # make sure we unlock mbox
    server.quit()                                  # else locked till timeout
print('Bye.')

Though primitive, this script illustrates the basics of reading
email in Python. To establish a connection to an email server, we start
by making an instance of the poplib.POP3 object, passing in the email
server machine’s name as a string:

server = poplib.POP3(mailserver)

If this call doesn’t raise an exception, we’re connected (by
socket) to the POP server listening on POP port number 110 at the
machine where our email account lives.

The next thing we need to do before fetching messages is tell the
server our username and password; notice that the password method is
called pass_. Without the trailing underscore, pass would name a
reserved word and trigger a syntax error:

server.user(mailuser)                      # connect, log in to mail server
server.pass_(mailpasswd) # pass is a reserved word
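
The connect and login steps can be wrapped in one function that also handles a rejected login. This is a sketch under stated assumptions: connect is a hypothetical helper name, and the 30-second timeout is an arbitrary choice, not a value the script above uses.

```python
import poplib

def connect(host, user, password, port=poplib.POP3_PORT, timeout=30):
    """
    hypothetical helper: connect and log in, returning the server object;
    network failures surface as OSError from the POP3 constructor
    """
    server = poplib.POP3(host, port, timeout)
    try:
        server.user(user)
        server.pass_(password)         # raises poplib.error_proto if rejected
    except poplib.error_proto:
        server.close()                 # drop the socket; there is no session to quit
        raise
    return server
```

The default port comes from poplib.POP3_PORT, the same port 110 that the script relies on implicitly.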

To keep things simple and relatively secure, this script always
asks for the account password interactively; the getpass module we met
in the FTP section of this chapter is used to input but not display a
password string typed by the user.

Once we’ve told the server our username and password, we’re free
to fetch mailbox information with the stat method (number of messages,
total bytes among all messages) and fetch the full text of a particular
message with the retr method (pass the message number; they start at 1).
The full text includes all headers, followed by a blank line, followed by
the mail’s text and any attached parts. The retr call sends back a tuple
that includes a list of line strings (bytes strings in 3.X) representing
the content of the mail:

msgCount, msgBytes = server.stat()
hdr, message, octets = server.retr(i+1) # octets is byte count
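
To make the shape of the retr result concrete, here is a small sketch that joins the line list into one decoded string and splits it at the first blank line. Both function names are hypothetical, the tuple is sample data rather than a real server reply, and utf8 is an assumed encoding.

```python
def message_text(retr_result, encoding='utf8'):
    """join the bytes lines of a retr-style tuple into one decoded string"""
    response, lines, octets = retr_result
    return b'\n'.join(lines).decode(encoding)

def split_headers(text):
    """split full mail text into (headers, body) at the first blank line"""
    headers, sep, body = text.partition('\n\n')
    return headers, body

# sample data shaped like a retr result; the byte count is illustrative only
sample = (b'+OK',
          [b'From: lutz@example.com', b'Subject: test', b'', b'Hello world'],
          61)

headers, body = split_headers(message_text(sample))
print(headers)
print(body)
```

For real messages the email package does this splitting, and much more, for us; the manual version only shows what the raw data looks like.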

We close the email server connection by calling the POP object’s
quit method:

server.quit()                              # else locked till timeout

Notice that this call appears inside the finally clause of a try
statement that wraps the bulk of the script. To minimize complications
associated with changes, POP servers lock your email inbox between the
time you first connect and the time you close your connection (or until
an arbitrary, system-defined timeout expires). Because the POP quit
method also unlocks the mailbox, it’s crucial that we do this before
exiting, whether an exception is raised during email processing or not.
By wrapping the action in a try/finally statement, we guarantee that the
script calls quit on exit to unlock the mailbox to make it accessible to
other processes (e.g., delivery of incoming email).
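
The guarantee is easy to observe without a real mailbox. In the sketch below, FakeServer is a hypothetical stand-in for a poplib.POP3 object that fails on its second message; the locked flag simply models the server-side lock described above.

```python
class FakeServer:
    """hypothetical stand-in for a poplib.POP3 object, for demonstration only"""
    def __init__(self):
        self.locked = True                       # models the mailbox lock
    def stat(self):
        return (2, 100)                          # message count, total bytes
    def retr(self, num):
        if num == 2:
            raise ValueError('simulated failure while processing mail')
        return (b'+OK', [b'Subject: one', b'', b'text'], 20)
    def quit(self):
        self.locked = False                      # quit releases the lock

def count_lines(server):
    """fetch every message, always releasing the mailbox on the way out"""
    total = 0
    try:
        msg_count, msg_bytes = server.stat()
        for i in range(msg_count):
            resp, lines, octets = server.retr(i + 1)
            total += len(lines)
    finally:
        server.quit()                            # runs on success and on error
    return total

server = FakeServer()
try:
    count_lines(server)
except ValueError as exc:
    print('failed:', exc)
print('mailbox still locked?', server.locked)
```

The exception still propagates to the caller, but only after quit has run, which is the behavior the popmail script depends on.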
