Read Programming Python Online

Authors: Mark Lutz

Tags: #COMPUTERS / Programming Languages / Python

Programming Python (32 page)

BOOK: Programming Python
Output stream buffering revisited: Deadlocks and flushes

The two processes of the prior section’s example engage in a simple
dialog, but it’s already enough to illustrate some of the dangers
lurking in cross-program communications. First of all, notice that both
programs need to write to stderr to display a message; their stdout
streams are tied to the other program’s input stream. Because processes
share file descriptors, stderr is the same in both parent and child, so
status messages show up in the same place.

More subtly, note that both parent and child call sys.stdout.flush
after they print text to the output stream. Input requests on pipes
normally block the caller if no data is available, but it seems that
this shouldn’t be a problem in our example because there are as many
writes as there are reads on the other side of the pipe. By default,
though, sys.stdout is buffered in this context, so the printed text may
not actually be transmitted until some time in the future (when the
output buffers fill up). In fact, if the flush calls are not made, both
processes may get stuck on some platforms waiting for input from the
other: input that is sitting in a buffer and is never flushed out over
the pipe. They wind up in a deadlock state, both blocked on input calls
waiting for events that never occur.

Technically, by default stdout is just line-buffered when connected to
a terminal, but it is fully buffered when connected to other devices
such as files, sockets, and the pipes used here. This is why you see a
script’s printed text in a shell window immediately as it is produced,
but not until the process exits or its buffer fills when its output
stream is connected to something else.
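
The terminal-versus-pipe distinction can be observed directly. The
following is a minimal sketch, not one of the book's examples: it starts
a child Python process with its stdout tied to a pipe and has the child
report what it sees. The names CHILD and probe_piped_child are made up
for this illustration, and subprocess.run stands in for the spawn
function used in this chapter.

```python
import subprocess
import sys

# Hypothetical one-line child: reports whether its stdout is a terminal
# and whether the text layer is in line-buffered mode.
CHILD = 'import sys; print(sys.stdout.isatty(), sys.stdout.line_buffering)'

def probe_piped_child():
    """Run CHILD with stdout tied to a pipe and return its two reports."""
    result = subprocess.run([sys.executable, '-c', CHILD],
                            stdout=subprocess.PIPE,
                            universal_newlines=True)
    return result.stdout.split()

if __name__ == '__main__':
    isatty, line_buffered = probe_piped_child()
    print('child stdout is a terminal:', isatty)
    print('child stdout line-buffered:', line_buffered)
```

Because the child is writing to a pipe here, it should report that its
stdout is not a terminal; typing the same one-liner at a shell prompt
would normally report the opposite.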

This output buffering is really a function of the system libraries used
to access pipes, not of the pipes themselves (pipes do queue up output
data, but they never hide it from readers!). In fact, it appears to
occur in this example only because we copy the pipe’s information over
to sys.stdout, a built-in file object that uses stream buffering by
default. However, such anomalies can also occur when using other
cross-process tools.

In general terms, if your programs engage in a two-way dialog
like this, there are a variety of ways to avoid buffering-related
deadlock problems:

  • Flushes: As demonstrated in Examples 5-22 and 5-23, manually
    flushing output pipe streams by calling the file object flush
    method is an easy way to force buffers to be cleared. Use
    sys.stdout.flush for the output stream used by print.

  • Arguments: As introduced earlier in this chapter, the -u Python
    command-line flag turns off full buffering for the sys.stdout
    stream in Python programs. Setting your PYTHONUNBUFFERED
    environment variable to a nonempty value is equivalent to passing
    this flag but applies to every program run.

  • Open modes: It’s possible to use pipes themselves in unbuffered
    mode. Either use low-level os module calls to read and write pipe
    descriptors directly, or pass a buffer size argument of 0 (for
    unbuffered) or 1 (for line-buffered) to os.fdopen to disable
    buffering in the file object used to wrap the descriptor. You can
    use open arguments the same way to control buffering for output to
    fifo files (described in the next section). Note that in Python
    3.X, fully unbuffered mode is allowed only for binary mode files,
    not text.

  • Command pipes: As mentioned earlier in this chapter, you can
    similarly specify buffering mode arguments for command-line pipes
    when they are created by os.popen and subprocess.Popen, but this
    pertains to the caller’s end of the pipe, not those of the spawned
    program. Hence it cannot prevent delayed outputs from the latter,
    but can be used for text sent to another program’s input pipe.

  • Sockets: As we’ll see later, the socket.makefile call accepts a
    similar buffering mode argument for sockets (described later in
    this chapter and book), but in Python 3.X this call requires
    buffering for text-mode access and appears to not support
    line-buffered mode (more on this in Chapter 12).

  • Tools: For more complex tasks, we can also use higher-level tools
    that essentially fool a program into believing it is connected to
    a terminal. These address programs not written in Python, for
    which neither manual flush calls nor -u is an option. See More on
    Stream Buffering: pty and Pexpect.
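
The effect of the buffering argument described under "Open modes" can be
checked inside a single process. The sketch below is an illustration
under stated assumptions rather than code from the book: it wraps the
write end of an os.pipe with os.fdopen in a given mode, writes one line,
and uses select to ask whether the reader can already see the data. The
helper names pipe_has_data and demo are invented here, and the use of
select on pipe descriptors assumes a Unix-like platform.

```python
import os
import select

def pipe_has_data(fd):
    """Return True if a read on fd would not block (Unix-like platforms)."""
    readable, _, _ = select.select([fd], [], [], 0)
    return bool(readable)

def demo(mode, buffering, data):
    """Write data through a wrapper opened with the given mode and buffering.

    Returns a (before_flush, after_flush) pair telling whether the data
    had reached the pipe at each point.
    """
    readfd, writefd = os.pipe()
    writer = os.fdopen(writefd, mode, buffering)
    writer.write(data)
    before = pipe_has_data(readfd)     # visible without any flush?
    writer.flush()
    after = pipe_has_data(readfd)      # visible once flushed?
    writer.close()
    os.close(readfd)
    return before, after

if __name__ == '__main__':
    print('text, default buffering:', demo('w', -1, 'spam\n'))
    print('text, line-buffered:    ', demo('w', 1, 'spam\n'))
    print('binary, unbuffered:     ', demo('wb', 0, b'spam\n'))
```

With default buffering the line stays in the file object until the
flush call; in the line-buffered and unbuffered cases it is available to
the reader as soon as it is written.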

Threads can avoid blocking a main GUI, too, but really just delegate
the problem (the spawned thread will still be deadlocked). Of the
options listed, the first two (manual flushes and command-line
arguments) are often the simplest solutions. In fact, because it is so
useful, the second technique listed above merits a few more words. Try
this: comment out all the sys.stdout.flush calls in Examples 5-22 and
5-23 (the files pipes.py and pipes-testchild.py) and change the
parent’s spawn call in pipes.py to this (i.e., add a -u command-line
argument):

spawn('python', '-u', 'pipes-testchild.py', 'spam')

Then start the program with a command line like this: python -u
pipes.py. It will work as it did with the manual stdout flush calls,
because stdout will be operating in unbuffered mode in both parent and
child.
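
For readers without the earlier example files at hand, the same idea
can be sketched with subprocess.Popen in place of the book's spawn
function. Everything named here (CHILD, dialog, the message text) is
hypothetical. The child prints without flushing and then waits for
input; the -u flag is what lets its first line reach the parent before
the child blocks.

```python
import subprocess
import sys

# Hypothetical child script: prints, then waits for a line from the parent.
CHILD = r'''
import sys
print('ready')                      # no flush call here
line = sys.stdin.readline()         # wait for the parent's message
print('child got ' + line.strip())
'''

def dialog(message='spam'):
    """Hold a two-step conversation with a child started with -u."""
    proc = subprocess.Popen([sys.executable, '-u', '-c', CHILD],
                            stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                            universal_newlines=True)
    with proc:                      # closes the pipes and waits on exit
        greeting = proc.stdout.readline()
        proc.stdin.write(message + '\n')
        proc.stdin.flush()          # the parent's own end is still buffered
        reply = proc.stdout.readline()
    return greeting.strip(), reply.strip()

if __name__ == '__main__':
    print(dialog())
```

Note that -u affects only the child; the parent still flushes its own
write to the child's input pipe, since that end is an ordinary buffered
file object.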

We’ll revisit the effects of unbuffered output streams in Chapter 10,
where we’ll code a simple GUI that displays the output of a non-GUI
program by reading it over both a nonblocking socket and a pipe in a
thread. We’ll explore the topic again in more depth in Chapter 12,
where we will redirect standard streams to sockets in more general
ways. Deadlock in general, though, is a bigger problem than we have
space to address fully here. On the other hand, if you know enough that
you want to do IPC in Python, you’re probably already a veteran of the
deadlock wars.

Anonymous pipes allow related tasks to communicate but are not
directly suited for independently launched programs. To allow the
latter group to converse, we need to move on to the next section and
explore devices that have broader visibility.

More on Stream Buffering: pty and Pexpect

On Unix-like platforms, you may also be able to use the Python pty
standard library module to force another program’s standard output to
be unbuffered, especially if it’s not a Python program and you cannot
change its code.

Technically, default buffering for stdout in other programs is
determined outside Python by whether the underlying file descriptor
refers to a terminal. This occurs in the stdio file system library and
cannot be controlled by the spawning program. In general, output to
terminals is line buffered, and output to nonterminals (including
files, pipes, and sockets) is fully buffered. This policy is used for
efficiency. Files and streams created within a Python script follow
the same defaults, but you can specify buffering policies in Python’s
file creation tools.

The pty module essentially fools the spawned program into thinking it
is connected to a terminal so that only one line is buffered for
stdout. The net effect is that each newline flushes the prior line,
which is typical of interactive programs and what you need if you wish
to grab each piece of the printed output as it is produced.
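
As a rough sketch of this technique, and assuming a Unix-like platform
where the pty module is available, the code below attaches a child's
stdout to a pseudo-terminal created with pty.openpty and reads what the
child prints from the master side. The function name run_on_pty is
invented for this illustration, and the error handling is deliberately
minimal.

```python
import os
import pty
import subprocess
import sys

def run_on_pty(argv):
    """Run argv with stdout and stderr on a pseudo-terminal; return its output."""
    master, slave = pty.openpty()
    proc = subprocess.Popen(argv, stdout=slave, stderr=slave)
    os.close(slave)                    # parent keeps only the master end
    chunks = []
    while True:
        try:
            data = os.read(master, 1024)
        except OSError:                # Linux reports an error once the child is gone
            break
        if not data:                   # other systems may signal plain end-of-file
            break
        chunks.append(data)
    proc.wait()
    os.close(master)
    return b''.join(chunks).decode()

if __name__ == '__main__':
    probe = 'import sys; print(sys.stdout.isatty())'
    print(run_on_pty([sys.executable, '-c', probe]))
```

The child in this usage example is a Python one-liner only for
convenience; the same function could be pointed at a non-Python
program, which is the case where this approach is most useful.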

Note, however, that the pty module is not required for this role when
spawning Python scripts with pipes: simply use the -u Python
command-line flag, pass line-buffered mode arguments to file creation
tools, or manually call sys.stdout.flush() in the spawned program. The
pty module is also not available on all Python platforms today (most
notably, it runs on Cygwin but not the standard Windows Python).

The Pexpect package, a pure-Python equivalent of the Unix expect
program, uses pty to provide additional functionality and to handle
interactions that bypass standard streams (e.g., password inputs). See
the Python library manual for more on pty, and search the Web for
Pexpect.

Named Pipes (Fifos)

On some platforms, it is also possible to create a long-lived pipe that
exists as a real named file in the filesystem. Such files are called
named pipes (or, sometimes, fifos) because they behave just like the
pipes created by the previous section’s programs. Because fifos are
associated with a real file on your computer, though, they are external
to any particular program: they do not rely on memory shared between
tasks, and so they can be used as an IPC mechanism for threads,
processes, and independently launched programs.

Once a named pipe file is created, clients open it by name and
read and write data using normal file operations. Fifos are
unidirectional streams. In typical operation, a server program reads
data from the fifo, and one or more client programs write data to it. In
addition, a set of two fifos can be used to implement bidirectional
communication just as we did for anonymous pipes in the prior
section.
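
The two-fifo arrangement can be sketched as follows. This is an
illustration only, assuming a platform with os.mkfifo; the function
names, the fifo filenames, and the uppercase "service" are all invented
here, and a thread stands in for what would normally be a separate
server program. Both sides open the request fifo first and the reply
fifo second, so neither open call waits forever for the other side.

```python
import os
import tempfile
import threading

def server(request_path, reply_path):
    """Read lines from the request fifo and send them back uppercased."""
    with open(request_path) as requests, open(reply_path, 'w') as replies:
        for line in requests:          # ends when the client closes its end
            replies.write(line.upper())
            replies.flush()            # push each reply out right away

def client(request_path, reply_path, words):
    """Send each word as a request and collect the one-line replies."""
    with open(request_path, 'w') as requests, open(reply_path) as replies:
        answers = []
        for word in words:
            requests.write(word + '\n')
            requests.flush()           # otherwise the server never sees it
            answers.append(replies.readline().rstrip('\n'))
        return answers

def round_trip(words):
    """Create two fifos in a temporary directory and run one exchange."""
    with tempfile.TemporaryDirectory() as tmpdir:
        request_path = os.path.join(tmpdir, 'requests')
        reply_path = os.path.join(tmpdir, 'replies')
        os.mkfifo(request_path)
        os.mkfifo(reply_path)
        worker = threading.Thread(target=server,
                                  args=(request_path, reply_path))
        worker.start()
        answers = client(request_path, reply_path, words)
        worker.join()
        return answers

if __name__ == '__main__':
    print(round_trip(['spam', 'eggs']))
```

The flush calls on both sides matter for the same reason discussed
earlier in this chapter: without them, each side could block on a read
while the other side's text sits in a file object's buffer.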

Because fifos reside in the filesystem, they are longer-lived than
in-process anonymous pipes and can be accessed by programs started
independently. The unnamed, in-process pipe examples thus far depend on
the fact that file descriptors (including pipes) are copied to child
processes’ memory. That makes it difficult to use anonymous pipes to
connect programs started independently. With fifos, pipes are accessed
instead by a filename visible to all programs running on the computer,
regardless of any parent/child process relationships. In fact, like
normal files, fifos typically outlive the programs that access them.
Unlike normal files, though, the operating system synchronizes fifo
access, making them ideal for IPC.

Because of these distinctions, fifo pipes are better suited as
general IPC mechanisms for independent client and server programs. For
instance, a perpetually running server program may create and listen for
requests on a fifo that can be accessed later by arbitrary clients not
forked by the server. In a sense, fifos are an alternative to the socket
port interface we’ll meet in the next section. Unlike sockets, though,
fifos do not directly support remote network connections, are not
available in standard Windows Python today, and are accessed using the
standard file interface instead of the more specialized socket port
numbers and calls we’ll study later.

Named pipe basics

In Python, named pipe files are created with the os.mkfifo call, which
is available today on Unix-like platforms, including Cygwin’s Python on
Windows, but is not currently available in standard Windows Python.
This call creates only the external file, though; to send and receive
data through a fifo, it must be opened and processed as if it were a
standard file.
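
As a small sketch of the creation step on its own, the code below makes
a fifo and then confirms from its file mode that the new filesystem
entry really is a named pipe. The helper name make_fifo and the
temporary path are assumptions made for this illustration; os.mkfifo,
os.stat, and stat.S_ISFIFO are the standard library calls involved.

```python
import os
import stat
import tempfile

def make_fifo(path):
    """Create a fifo at path unless something already exists there.

    Returns True if the entry at path is a named pipe.
    """
    if not os.path.exists(path):
        os.mkfifo(path)                        # creates only the file entry
    return stat.S_ISFIFO(os.stat(path).st_mode)

if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, 'demo-fifo')   # hypothetical name
        print(path, 'is a fifo:', make_fifo(path))
```

No data moves at this point; nothing is read or written until some
program opens the fifo by name.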

To illustrate, Example 5-24 is a derivation of the pipe2.py script
listed in Example 5-20, but rewritten here to use fifos rather than
anonymous pipes. Much like pipe2.py, this script opens the fifo using
os.open in the child for low-level byte string access, but with the
open built-in in the parent to treat the pipe as text; in general,
either end may use either technique to treat the pipe’s data as bytes
or text.

Example 5-24. PP4E\System\Processes\pipefifo.py

"""
named pipes; os.mkfifo is not available on Windows (without Cygwin);
there is no reason to fork here, since fifo file pipes are external
to processes--shared fds in parent/child processes are irrelevant;
"""

import os, time, sys
fifoname = '/tmp/pipefifo'                       # must open same name

def child():
    pipeout = os.open(fifoname, os.O_WRONLY)     # open fifo pipe file as fd
    zzz = 0
    while True:
        time.sleep(zzz)
        msg = ('Spam %03d\n' % zzz).encode()     # binary as opened here
        os.write(pipeout, msg)
        zzz = (zzz+1) % 5

def parent():
    pipein = open(fifoname, 'r')                 # open fifo as text file object
    while True:
        line = pipein.readline()[:-1]            # blocks until data sent
        print('Parent %d got "%s" at %s' % (os.getpid(), line, time.time()))

if __name__ == '__main__':
    if not os.path.exists(fifoname):
        os.mkfifo(fifoname)                      # create a named pipe file
    if len(sys.argv) == 1:
        parent()                                 # run as parent if no args
    else:                                        # else run as child process
        child()

Because the fifo exists independently of both parent and child,
there’s no reason to fork here. The child may be started independently
of the parent as long as it opens a fifo file by the same name. Here,
for instance, on Cygwin the parent is started in one shell window and
then the child is started in another. Messages start appearing in the
parent window only after the child is started and begins writing
messages onto the fifo file:

[C:\...\PP4E\System\Processes]$ python pipefifo.py            # parent window
Parent 8324 got "Spam 000" at 1268003696.07
Parent 8324 got "Spam 001" at 1268003697.06
Parent 8324 got "Spam 002" at 1268003699.07
Parent 8324 got "Spam 003" at 1268003702.08
Parent 8324 got "Spam 004" at 1268003706.09
Parent 8324 got "Spam 000" at 1268003706.09
Parent 8324 got "Spam 001" at 1268003707.11
Parent 8324 got "Spam 002" at 1268003709.12
Parent 8324 got "Spam 003" at 1268003712.13
Parent 8324 got "Spam 004" at 1268003716.14
Parent 8324 got "Spam 000" at 1268003716.14
Parent 8324 got "Spam 001" at 1268003717.15
...etc: Ctrl-C to exit...

[C:\...\PP4E\System\Processes]$ file /tmp/pipefifo            # child window
/tmp/pipefifo: fifo (named pipe)

[C:\...\PP4E\System\Processes]$ python pipefifo.py -child
...Ctrl-C to exit...

Named pipe use cases

By mapping communication points to a file system entity accessible to
all programs run on a machine, fifos can address a broad range of IPC
goals on platforms where they are supported. For instance, although
this section’s example runs independent programs, named pipes can also
be used as an IPC device by both in-process threads and directly forked
related processes, much as we saw for anonymous pipes earlier.
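
The in-process thread case can be sketched in a few lines. This is a
hedged illustration rather than code from the book: the names writer
and exchange and the fifo filename are invented, and the sketch assumes
a platform that supports os.mkfifo. A writer thread and the main thread
each open the same fifo by name; each open call blocks until the other
end has been opened too.

```python
import os
import tempfile
import threading

def writer(path, messages):
    """Send each message as one line over the fifo at path."""
    with open(path, 'w') as fifo:      # blocks until a reader opens the fifo
        for msg in messages:
            fifo.write(msg + '\n')
    # leaving the with block flushes the text and closes the writer's end

def exchange(messages):
    """Pass messages from a writer thread to the main thread over a fifo."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, 'demo-fifo')   # hypothetical name
        os.mkfifo(path)
        thread = threading.Thread(target=writer, args=(path, messages))
        thread.start()
        with open(path) as fifo:       # blocks until the writer opens its end
            received = [line.rstrip('\n') for line in fifo]
        thread.join()
        return received

if __name__ == '__main__':
    print(exchange(['Spam 000', 'Spam 001', 'Spam 002']))
```

The reader's loop ends when the writer closes its end of the fifo, which
the reader sees as end-of-file, just as with an ordinary file.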

By also supporting unrelated programs, though, fifo files are more
widely applicable to general client/server models. For example, named
pipes can make the GUI and command-line debugger integration I
described earlier for anonymous pipes even more flexible: by using fifo
files to connect the GUI to the non-GUI debugger’s streams, the GUI
could be started independently when needed.

Sockets provide similar functionality but also buy us both inherent
network awareness and broader portability to Windows, as the next
section explains.
