Read Programming Python Online

Authors: Mark Lutz

Tags: #COMPUTERS / Programming Languages / Python

Programming Python (120 page)

Sending Email at the Interactive Prompt

So where are we in the Internet abstraction model now? With all
this email fetching and sending going on, it’s easy to lose the forest
for the trees. Keep in mind that because mail is transferred over
sockets (remember sockets?), they are at the root of all this activity.
All email read and written ultimately consists of formatted bytes
shipped over sockets between computers on the Net. As we’ve seen,
though, the POP and SMTP interfaces in Python hide all the details.
Moreover, the scripts we’ve begun writing even hide the Python
interfaces and provide higher-level interactive tools.

Both the popmail and smtpmail scripts provide portable email tools
but aren’t quite what we’d expect in terms of usability these days.
Later in this chapter, we’ll use what we’ve seen thus far to implement a
more interactive, console-based mail tool. In the next chapter, we’ll
also code a tkinter email GUI, and then we’ll go on to build a web-based
interface in a later chapter. All of these tools, though, vary primarily
in terms of user interface only; each ultimately employs the Python mail
transfer modules we’ve met here to transfer mail message text over the
Internet with sockets.

Before we move on, one more SMTP note: just as for reading mail,
we can use the Python interactive prompt as our email sending client,
too, if we type calls manually. The following, for example, sends a
message through my ISP’s SMTP server to two recipient addresses assumed
to be part of a mail list:

C:\...\PP4E\Internet\Email> python
>>> from smtplib import SMTP
>>> conn = SMTP('smtpout.secureserver.net')
>>> conn.sendmail(
... '[email protected]',                        # true sender
... ['[email protected]', '[email protected]'],    # true recipients
... """From: [email protected]
... To: maillist
... Subject: test interactive smtplib
...
... testing 1 2 3...
... """)
{}
>>> conn.quit()                                 # quit() required, Date added
(221, b'Closing connection. Good bye.')

We’ll verify receipt of this message in a later email client
program; the “To” recipient shows up as “maillist” in email clients—a
completely valid use case for header manipulation. In fact, you can
achieve the same effect with the smtpmail-noTo script by separating recipient
addresses at the “To?” prompt with a semicolon (e.g., [email protected]; [email protected])
and typing the email list’s name in the “To:” header line. Mail clients
that support mailing lists automate such steps.

Sending mail interactively this way is a bit tricky to get right,
though—header lines are governed by standards: the blank line after the
subject line is required and significant, for instance, and Date is
omitted altogether (one is added for us). Furthermore, mail formatting
gets much more complex as we start writing messages with attachments. In
practice, the email package in the standard library is generally used to
construct emails before shipping them off with smtplib. The package
lets us build mails by assigning headers and attaching and possibly
encoding parts, and creates a correctly formatted mail text. To learn
how, let’s move on to the next section.
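
As a minimal sketch of that division of labor, the following builds the mail text with the email package and leaves delivery to smtplib. The helper names compose and send, the addresses, and the idea of passing the server name as an argument are placeholders introduced here for illustration; they are not taken from this book's scripts. The send function is defined but deliberately not called, since it needs a real SMTP server.

```python
from email.message import Message
from email.utils import formatdate
import smtplib

def compose(sender, to_header, subject, body):
    """Return full mail text: header lines, a blank line, then the body."""
    msg = Message()
    msg['From'] = sender
    msg['To'] = to_header            # display header only, e.g. a list name
    msg['Subject'] = subject
    msg['Date'] = formatdate()       # date string for the current time
    msg.set_payload(body)
    return msg.as_string()

def send(server, sender, recipients, text):
    """Ship already formatted text; recipients are the true delivery addresses."""
    with smtplib.SMTP(server) as conn:       # connection is closed on exit
        return conn.sendmail(sender, recipients, text)

# Hypothetical addresses; nothing is sent here
text = compose('sender@example.com', 'maillist',
               'test interactive smtplib', 'testing 1 2 3...\n')
print(text)
```

Note that the “To” header and the recipient list passed to sendmail are independent, which is what makes the mailing-list trick shown earlier possible.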

[52] We all know by now that such junk mail is usually referred to
as spam, but not everyone knows that this name is a reference to a
Monty Python skit in which a restaurant’s customers find it
difficult to hear the reading of menu options over a group of
Vikings singing an increasingly loud chorus of “spam, spam, spam…”.
Hence the tie-in to junk email. Spam is used in Python program
examples as a sort of generic variable name, though it also pays
homage to the skit.

email: Parsing and Composing Mail Content

The second edition of this book used a handful of standard library
modules (rfc822, StringIO, and more) to parse the contents of
messages, and simple text processing to compose them. Additionally, that
edition included a section on extracting and decoding attached parts of a
message using modules such as mhlib, mimetools, and base64.

In the third edition, those tools were still available, but were,
frankly, a bit clumsy and error-prone. Parsing attachments from messages,
for example, was tricky, and composing even basic messages was tedious (in
fact, an early printing of the prior edition contained a potential bug,
because it omitted one \n character in a string formatting operation).
Adding attachments to sent messages wasn’t
even attempted, due to the complexity of the formatting involved. Most of
these tools are gone completely in Python 3.X as I write this fourth
edition, partly because of their complexity, and partly because they’ve
been made obsolete.

Luckily, things are much simpler today. After the second edition,
Python sprouted a new email package: a powerful
collection of tools that automate most of the work behind parsing and
composing email messages. This package gives us an object-based message
interface and handles all the textual message structure details, both
analyzing and creating it. Not only does this eliminate a whole class of
potential bugs, it also promotes more advanced mail processing.

Things like attachments, for instance, become accessible to mere
mortals (and authors with limited book real estate). In fact, an entire
original section on manual attachment parsing and decoding was deleted in
the third edition, because it’s essentially automatic with email. The new
package parses and constructs headers and attachments; generates correct
email text; decodes and encodes Base64, quoted-printable, and uuencoded
data; and much more.

We won’t cover the email package
in its entirety in this book; it is well documented in Python’s library
manual. Our goal here is to explore some example usage code, which you can
study in conjunction with the manuals. But to help get you started, let’s
begin with a quick overview. In a nutshell, the email package is based
around the Message object it provides:

Parsing mail

A mail’s full text, fetched from poplib or imaplib, is parsed into a new
Message object, with an API for accessing
its components. In the object, mail headers become dictionary-like
keys, and components become a “payload” that can be walked with a
generator interface (more on payloads in a moment).

Creating mail

New mails are composed by creating a new Message object, using an API to
attach headers and parts, and asking the object for its print
representation: a correctly formatted mail message text, ready to be
passed to the smtplib module for
delivery. Headers are added by key assignment and attachments by
method calls.

In other words, the Message
object is used both for accessing existing messages and for creating new
ones from scratch. In both cases, email
can automatically handle details like content encodings (e.g., attached
binary images can be treated as text with Base64 encoding and decoding),
content types, and more.

Message Objects

Since the email package’s Message
object is at the heart of its API, you need a cursory
understanding of its form to get started. In short, it is designed to
reflect the structure of a formatted email message. Each Message
consists of three main pieces of information:

Type

A content type (plain text, HTML text, JPEG image, and so
on), encoded as a MIME main type and a subtype. For instance,
“text/html” means the main type is text and the subtype is HTML (a
web page); “image/jpeg” means a JPEG photo. A “multipart/mixed”
type means there are nested parts within the message.

Headers

A dictionary-like mapping interface, with one key per mail
header (From, To, and so on). This interface supports almost all
of the usual dictionary operations, and headers may be fetched or
set by normal key indexing.

Content

A “payload,” which represents the mail’s content. This can
be either a string (bytes or str) for simple messages, or a
list of additional Message objects for multipart
container messages with attached or alternative parts. For some
oddball types, the payload may be a Python None object.
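
To make the three pieces concrete, here is a small sketch that sets and then reads back each of them on a hand-built Message. The header values are made up for illustration.

```python
from email.message import Message

msg = Message()
msg['From'] = 'sender@example.com'        # headers: set by key assignment
msg['Subject'] = 'three pieces'
msg['Content-Type'] = 'text/html'         # type: main type text, subtype html
msg.set_payload('<p>hello</p>')           # content: a str payload

print(msg.get_content_type())             # full MIME type
print(msg.get_content_maintype(), msg.get_content_subtype())
print(msg.keys())                         # header names, in the order added
print(msg['subject'])                     # header lookup ignores case
print(type(msg.get_payload()).__name__)

empty = Message()                         # nothing assigned yet
print(empty.get_payload())                # payload starts out as None
```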

The MIME type of a Message is key to understanding its content.
For example, mails with attached images may have a main top-level
Message (type multipart/mixed), with three more Message
objects in its payload: one for its main text (type text/plain), followed
by two of type image for the photos (type image/jpeg). The photo parts
may be encoded
for transmission as text with Base64 or another scheme; the encoding
type, as well as the original image filename, are specified in the
part’s headers.
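
The structure just described can be sketched as follows. The image data here is a run of placeholder bytes rather than a real photo, and the filenames are invented; passing the subtype explicitly means the class does not have to guess it from the data.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.image import MIMEImage

fake_jpeg = bytes(range(256))                      # placeholder, not a real photo

top = MIMEMultipart()                              # subtype defaults to mixed
top['Subject'] = 'photos'
top.attach(MIMEText('Two photos attached.\n'))     # main text part
for name in ('one.jpg', 'two.jpg'):
    part = MIMEImage(fake_jpeg, _subtype='jpeg')   # Base64 is applied by default
    part.add_header('Content-Disposition', 'attachment', filename=name)
    top.attach(part)

for part in top.get_payload():                     # a list of Message objects
    print(part.get_content_type(),
          part['Content-Transfer-Encoding'],
          part.get_filename())
```

Asking an image part for get_payload(decode=True) reverses the Base64 step and hands back the original bytes.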

Similarly, mails that include both simple text and an HTML
alternative will have two nested Message objects in their payload, of type
plain text (text/plain) and HTML text (text/html), along with a main root
Message of type multipart/alternative. Your mail client
decides which part to display, often based on your preferences.
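
As a sketch of this arrangement, the following builds a two-rendition message and picks one part to display. The pick function is a hypothetical stand-in for a mail client's preference logic, not part of the email package.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

root = MIMEMultipart('alternative')                # root: multipart/alternative
root['Subject'] = 'two renditions'
root.attach(MIMEText('Hello, world\n', 'plain'))   # plain text rendition
root.attach(MIMEText('<p>Hello, <b>world</b></p>\n', 'html'))

def pick(message, preferred='text/html'):
    """Return the first part of the preferred type, else the first part."""
    parts = message.get_payload()
    for part in parts:
        if part.get_content_type() == preferred:
            return part
    return parts[0]

print(root.get_content_type())
print(pick(root).get_content_type())
print(pick(root, 'text/plain').get_payload())
```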

Simpler messages may have just a root Message of type text/plain or
text/html, representing the entire message
body. The payload for such mails is a simple string. They may also have
no explicitly given type at all, which generally defaults to
text/plain. Some single-part messages are text/html, with no text/plain
alternative; they require a web
browser or other HTML viewer (or a very keen-eyed user).
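
The default is easy to confirm with a short sketch. The mail text below is invented and carries no Content-Type header line at all.

```python
from email.parser import Parser

raw = 'From: a@example.com\nSubject: no type given\n\nJust text.\n'
msg = Parser().parsestr(raw)

print(msg['Content-Type'])          # the header is absent
print(msg.get_content_type())       # the type reported for the message
print(msg.is_multipart())
print(repr(msg.get_payload()))      # a simple string payload
```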

Other combinations are possible, including some types that are not
commonly seen in practice, such as message/delivery-status. Most messages
have a main text part, though it is not required, and it may be nested in a
multipart or other construct.

In all cases, an email message is a simple, linear string, but
these message structures are automatically detected when mail text is
parsed and are created by your method calls when new messages are
composed. For instance, when creating messages, the message attach
method adds parts for multipart mails, and set_payload sets the entire
payload to a string for simple mails.

Message objects also have
assorted properties (e.g., the filename of an attachment), and they
provide a convenient walk generator
method, which returns the next Message in the payload each time through
in a for loop or other iteration context.
Because the walker yields the root Message object first (i.e., self),
single-part messages don’t have to be
handled as a special case; a nonmultipart message is effectively a
Message with a single item in its payload: itself.
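
A brief sketch shows why no special case is needed. The part_types helper is a name made up for this illustration; it applies the same loop to a single-part and a multipart message.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

def part_types(message):
    """List the MIME type of every Message reached by walk, root first."""
    return [part.get_content_type() for part in message.walk()]

simple = MIMEText('just one part\n')
multi = MIMEMultipart()
multi.attach(MIMEText('body\n'))
multi.attach(MIMEText('<p>body</p>\n', 'html'))

print(part_types(simple))
print(part_types(multi))
print(next(simple.walk()) is simple)     # the first item walked is the root
```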

Ultimately, the Message object
structure closely mirrors the way mails are formatted as text. Special
header lines in the mail’s text give its type (e.g., plain text or
multipart), as well as the separator used between the content of nested
parts. Since the underlying textual details are automated by the
email package, both when parsing and
when composing, we won’t go into further formatting details here.

If you are interested in seeing how this translates to real
emails, a great way to learn mail structure is by inspecting the full
raw text of messages displayed by email clients you already use, as
we’ll see with some we meet in this book. In fact, we’ve already seen a
few—see the raw text printed by our earlier POP email scripts for simple
mail text examples. For more on the Message object, and email in general,
consult the email package’s entry in Python’s library
manual. We’re skipping details such as its available encoders and MIME
object classes here in the interest of space.

Beyond the email package, the
Python library includes other tools for mail-related processing. For
instance, mimetypes maps a filename to and from a MIME type:

mimetypes.guess_type(filename)

Maps a filename to a MIME type. Name spam.txt maps to text/plain.

mimetypes.guess_extension(contype)

Maps a MIME type to a filename extension. Type text/html maps to .html.

We also used the mimetypes
module earlier in this chapter to guess FTP transfer modes from
filenames (see Example 13-10), as well as in Chapter 6, where we used
it to guess a media player for a filename (see the examples there,
including playfile.py, Example 6-23). For email, these can
come in handy when attaching files to a new message (guess_type) and
saving parsed attachments that do not provide a filename
(guess_extension). In fact, this module’s
source code is a fairly complete reference to MIME types. See the
library manual for more on these
tools.
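
Here is a rough sketch of both email-related uses. The helper names split_type and save_name, the generic binary fallback type, and the part-numbering scheme are assumptions made for this example; the exact extension returned for a type can vary with the Python version and platform.

```python
import mimetypes

def split_type(filename, default='application/octet-stream'):
    """Guess (maintype, subtype) for a file to attach, with a generic fallback."""
    contype, encoding = mimetypes.guess_type(filename)
    if contype is None or encoding is not None:    # unknown type, or compressed
        contype = default
    return tuple(contype.split('/', 1))

def save_name(part_number, contype, filename=None):
    """Pick a filename for a parsed part that may not provide one."""
    if filename:
        return filename
    ext = mimetypes.guess_extension(contype) or '.bin'
    return 'part-%03d%s' % (part_number, ext)

print(split_type('spam.txt'))
print(split_type('mystery.nosuchext'))
print(save_name(1, 'text/html'))
print(save_name(2, 'image/jpeg', 'photo.jpg'))
```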

Basic email Package Interfaces in Action

Although we can’t
provide an exhaustive reference here, let’s step through a
simple interactive session to illustrate the fundamentals of email
processing. To compose the full text of a message (to be delivered with
smtplib, for instance), make a Message, assign
headers to its keys, and set its payload to the message body. Converting
to a string yields the mail text. This process is substantially simpler
and less error-prone than the manual text operations we used earlier in
Example 13-19 to build mail as strings:

>>> from email.message import Message
>>> m = Message()
>>> m['from'] = 'Jane Doe '
>>> m['to'] = '[email protected]'
>>> m.set_payload('The owls are not what they seem...')
>>>
>>> s = str(m)
>>> print(s)
from: Jane Doe
to: [email protected]

The owls are not what they seem...

Parsing a message’s text (like the kind you obtain with poplib) is
similarly simple, and essentially the inverse: we get back a
Message object from the text, with keys for
headers and a payload for the body:

>>> s                                   # same as in prior interaction
'from: Jane Doe \nto: [email protected]\n\nThe owls are not...'
>>> from email.parser import Parser
>>> x = Parser().parsestr(s)
>>> x
<email.message.Message object at 0x...>
>>>
>>> x['From']
'Jane Doe '
>>> x.get_payload()
'The owls are not what they seem...'
>>> x.items()
[('from', 'Jane Doe '), ('to', '[email protected]')]
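
Mail fetched with poplib arrives as a list of bytes lines rather than one str. As a hedged sketch, the lines can be joined and handed to email.message_from_bytes, a convenience function available in Python 3.2 and later. The sample lines below are invented stand-ins for a real fetch.

```python
import email

# Hypothetical sample of fetched mail lines: bytes, with line ends stripped
lines = [b'From: a@example.com',
         b'To: b@example.com',
         b'Subject: bytes in, Message out',
         b'',
         b'The owls are not what they seem...']

msg = email.message_from_bytes(b'\n'.join(lines))

for name, value in msg.items():         # headers in their original order
    print(name, '=>', value)
print(msg.get_payload())
```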

So far this isn’t much different from the older and
now-defunct rfc822 module, but as
we’ll see in a moment, things get more interesting when there is more
than one part. For simple messages like this one, the message
walk generator treats it as a single-part
mail, of type plain text:

>>> for part in x.walk():
...     print(part.get_content_type())
...     print(part.get_payload())
...
text/plain
The owls are not what they seem...

Handling multipart messages

Making a mail with attachments is a little
more work, but not much: we just make a root Message and attach nested
Message objects created from the MIME type
object that corresponds to the type of data we’re attaching. The
MIMEText class, for instance, is a subclass of Message, which is
tailored for text parts, and knows how to generate the right types of
header information when printed. MIMEImage and MIMEAudio
similarly customize Message for
images and audio, and also know how to apply Base64 and other MIME
encodings to binary data. The root message is where we store the main
headers of the mail, and we attach parts here, instead of setting the
entire payload; the payload is a list now, not a string.
MIMEMultipart is a Message that provides the extra header
protocol we need for the root:

>>> from email.mime.multipart import MIMEMultipart     # Message subclasses
>>> from email.mime.text import MIMEText               # with extra headers+logic
>>>
>>> top = MIMEMultipart()                              # root Message object
>>> top['from'] = 'Art '                               # subtype default=mixed
>>> top['to'] = '[email protected]'
>>>
>>> sub1 = MIMEText('nice red uniforms...\n')          # part Message attachments
>>> sub2 = MIMEText(open('data.txt').read())
>>> sub2.add_header('Content-Disposition', 'attachment', filename='data.txt')
>>> top.attach(sub1)
>>> top.attach(sub2)

When we ask for the text, a correctly formatted full mail text
is returned, separators and all, ready to be sent with
smtplib. That is quite a trick, if you’ve ever tried
this by hand:

>>> text = top.as_string()              # or do: str(top) or print(top)
>>> print(text)
Content-Type: multipart/mixed; boundary="===============1574823535=="
MIME-Version: 1.0
from: Art
to: [email protected]

--===============1574823535==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit

nice red uniforms...

--===============1574823535==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="data.txt"

line1
line2
line3

--===============1574823535==--

If we are sent this message and retrieve it via
poplib, parsing its full text yields a Message object
just like the one we built to send. The message
walk generator allows us to step through
each part, fetching their types and payloads:

>>> text                                # same as in prior interaction
'Content-Type: multipart/mixed; boundary="===============1574823535=="\nMIME-Ver...'
>>> from email.parser import Parser
>>> msg = Parser().parsestr(text)
>>> msg['from']
'Art '
>>> for part in msg.walk():
...     print(part.get_content_type())
...     print(part.get_payload())
...     print()
...
multipart/mixed
[<email.message.Message object at 0x...>, <email.message.Message object at 0x...>]

text/plain
nice red uniforms...


text/plain
line1
line2
line3
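
Going one step further, a client would typically pull named attachments out of the parsed message. The sketch below rebuilds a small stand-in for the mail above and extracts its one attachment; the attachments generator is a hypothetical helper written for this example, not part of the email package.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.parser import Parser

def attachments(message):
    """Yield (filename, decoded bytes) for each part that names a file."""
    for part in message.walk():
        filename = part.get_filename()
        if filename:                     # the root and the main text name no file
            yield filename, part.get_payload(decode=True)

# Build and reparse a small stand-in for a fetched mail
top = MIMEMultipart()
top.attach(MIMEText('nice red uniforms...\n'))
sub = MIMEText('line1\nline2\nline3\n')
sub.add_header('Content-Disposition', 'attachment', filename='data.txt')
top.attach(sub)
msg = Parser().parsestr(top.as_string())

for filename, data in attachments(msg):
    print(filename, data)
```

Passing decode=True asks for the payload as bytes with any transfer encoding undone, which is the form we would write to a file.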

Multipart alternative messages (with text and HTML renditions of
the same message) can be composed and parsed in similar fashion.
Because email clients are able to
parse and compose messages with a simple object-based API, they are
freed to focus on the user interface instead of text processing.
