Read Programming Python Online

Authors: Mark Lutz

Tags: #COMPUTERS / Programming Languages / Python

Programming Python (116 page)

BOOK: Programming Python
Refactoring Uploads and Downloads for Reuse

The directory upload and download scripts of the prior two
sections work as advertised and, apart from the mimetypes logic,
were the only FTP examples that were included in the second
edition of this book. If you look at
these two scripts long enough, though, their similarities will pop out
at you eventually. In fact, they are largely the same—they use identical
code to configure transfer parameters, connect to the FTP server, and
determine file type. The exact details have been lost to time, but some
of this code was certainly copied from one file to the other.

Although such redundancy isn’t a cause for alarm if we never plan
on changing these scripts, it can be a killer in software projects in
general. When you have two copies of identical bits of code, not only is
there a danger of them becoming out of sync over time (you’ll lose
uniformity in user interface and behavior), but you also effectively
double your effort when it comes time to change code that appears in
both places. Unless you’re a big fan of extra work, it pays to avoid
redundancy wherever possible.

This redundancy is especially glaring when we look at the complex
code that uses mimetypes to determine file types. Repeating magic
like this in more than one place is almost
always a bad idea—not only do we have to remember how it works every
time we need the same utility, but it is a recipe for errors.
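
One way to keep such logic in a single place is to wrap it in a small
helper that every script imports. The following is only a sketch: the
name is_text_kind is an assumption made for illustration, not the
book’s function, and it relies solely on the standard library’s
mimetypes.guess_type call.

```python
from mimetypes import guess_type

def is_text_kind(filename):
    """Return True if filename looks like uncompressed text (hypothetical helper)."""
    mimetype, encoding = guess_type(filename, strict=False)  # allow nonstandard types
    maintype = (mimetype or '?/?').split('/')[0]             # '?' when type is unknown
    return maintype == 'text' and encoding is None           # compressed text is binary

if __name__ == '__main__':
    for name in ('f.html', 'f.jpeg', 'f.txt.gz', 'f.no-such-ext'):
        print(name, '=>', 'text' if is_text_kind(name) else 'binary')
```

With the decision in one function, a change to the rule (for example,
treating another extension as text) would need to be made only once.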

Refactoring with functions

As originally coded, our download and upload scripts comprise top-level
script code that relies on global variables. Such a structure is
difficult to reuse—code runs immediately on imports, and it’s
difficult to generalize for varying contexts. Worse, it’s difficult to
maintain—when you program by cut-and-paste of existing code, you
increase the cost of future changes every time you click the Paste
button.
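
The import problem can be seen in miniature in the sketch below. It is
not code from the book: describe_transfer and its parameters are
hypothetical names chosen for illustration. The logic lives in a
function that receives its inputs as arguments rather than reading
globals, and the script-only behavior sits under the standard __name__
test, so importing the file runs nothing.

```python
import sys

def describe_transfer(site, user, localdir='.'):
    """Build a summary line from arguments instead of reading globals."""
    return 'transfer for %s on %s using %s' % (user, site, localdir)

if __name__ == '__main__':
    # runs only when the file is executed as a script, not when imported
    localdir = sys.argv[1] if len(sys.argv) > 1 else '.'
    print(describe_transfer('home.rmi.net', 'lutz', localdir))
```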

To demonstrate how we might do better, Example 13-12 shows one way to
refactor (reorganize) the download script. Wrapping its parts in
functions makes them reusable in other modules, including our upload
program.

Example 13-12. PP4E\Internet\Ftp\Mirror\downloadflat_modular.py

#!/bin/env python
"""
##############################################################################
use FTP to copy (download) all files from a remote site and directory
to a directory on the local machine; this version works the same, but has
been refactored to wrap up its code in functions that can be reused by the
uploader, and possibly other programs in the future - else code redundancy,
which may make the two diverge over time, and can double maintenance costs.
##############################################################################
"""

import os, sys, ftplib
from getpass import getpass
from mimetypes import guess_type, add_type

defaultSite = 'home.rmi.net'
defaultRdir = '.'
defaultUser = 'lutz'

def configTransfer(site=defaultSite, rdir=defaultRdir, user=defaultUser):
    """
    get upload or download parameters
    uses a class due to the large number
    """
    class cf: pass
    cf.nonpassive = False                 # passive FTP on by default in 2.1+
    cf.remotesite = site                  # transfer to/from this site
    cf.remotedir  = rdir                  # and this dir ('.' means acct root)
    cf.remoteuser = user
    cf.localdir   = (len(sys.argv) > 1 and sys.argv[1]) or '.'
    cf.cleanall   = input('Clean target directory first? ')[:1] in ['y','Y']
    cf.remotepass = getpass(
                    'Password for %s on %s:' % (cf.remoteuser, cf.remotesite))
    return cf

def isTextKind(remotename, trace=True):
    """
    use mimetype to guess if filename means text or binary
    for 'f.html', guess is ('text/html', None): text
    for 'f.jpeg' guess is ('image/jpeg', None): binary
    for 'f.txt.gz' guess is ('text/plain', 'gzip'): binary
    for unknowns, guess may be (None, None): binary
    mimetype can also guess name from type: see PyMailGUI
    """
    add_type('text/x-python-win', '.pyw')                       # not in tables
    mimetype, encoding = guess_type(remotename, strict=False)   # allow extras
    mimetype = mimetype or '?/?'                                # type unknown?
    maintype = mimetype.split('/')[0]                           # get first part
    if trace: print(maintype, encoding or '')
    return maintype == 'text' and encoding is None              # not compressed

def connectFtp(cf):
    print('connecting...')
    connection = ftplib.FTP(cf.remotesite)           # connect to FTP site
    connection.login(cf.remoteuser, cf.remotepass)   # log in as user/password
    connection.cwd(cf.remotedir)                     # cd to directory to xfer
    if cf.nonpassive:                                # force active mode FTP
        connection.set_pasv(False)                   # most servers do passive
    return connection

def cleanLocals(cf):
    """
    try to delete all local files first to remove garbage
    """
    if cf.cleanall:
        for localname in os.listdir(cf.localdir):    # local dirlisting
            try:                                     # local file delete
                print('deleting local', localname)
                os.remove(os.path.join(cf.localdir, localname))
            except:
                print('cannot delete local', localname)

def downloadAll(cf, connection):
    """
    download all files from remote site/dir per cf config
    ftp nlst() gives files list, dir() gives full details
    """
    remotefiles = connection.nlst()                  # nlst is remote listing
    for remotename in remotefiles:
        if remotename in ('.', '..'): continue
        localpath = os.path.join(cf.localdir, remotename)
        print('downloading', remotename, 'to', localpath, 'as', end=' ')
        if isTextKind(remotename):
            # use text mode xfer
            localfile = open(localpath, 'w', encoding=connection.encoding)
            def callback(line): localfile.write(line + '\n')
            connection.retrlines('RETR ' + remotename, callback)
        else:
            # use binary mode xfer
            localfile = open(localpath, 'wb')
            connection.retrbinary('RETR ' + remotename, localfile.write)
        localfile.close()
    connection.quit()
    print('Done:', len(remotefiles), 'files downloaded.')

if __name__ == '__main__':
    cf = configTransfer()
    conn = connectFtp(cf)
    cleanLocals(cf)                                  # don't delete if can't connect
    downloadAll(cf, conn)

Compare this version with the original. This script, and every
other in this section, runs the same as the original flat download and
upload programs. Although we haven’t changed its behavior, we’ve
modified the script’s software structure radically: its code is now a
set of tools that can be imported and reused in other programs.
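
Because the functions read their settings only from attributes of the
cf object, a client could in principle supply its own configuration
instead of calling the interactive configTransfer. The following is a
minimal sketch under that assumption: make_config is a hypothetical
helper, the host and login values are placeholders, and the standard
library’s types.SimpleNamespace stands in for the listing’s empty
class cf.

```python
from types import SimpleNamespace

def make_config(site, user, password, rdir='.', localdir='.', cleanall=False):
    """Hypothetical non-interactive stand-in for configTransfer: no prompts."""
    return SimpleNamespace(
        nonpassive=False,        # same attribute names the listing's functions read
        remotesite=site,
        remotedir=rdir,
        remoteuser=user,
        remotepass=password,
        localdir=localdir,
        cleanall=cleanall)

if __name__ == '__main__':
    cf = make_config('ftp.example.com', 'anonymous', 'guest@example.com')
    print(cf.remotesite, cf.remoteuser, cf.localdir, cf.cleanall)
```

An object built this way could then be passed to connectFtp and
downloadAll after importing them from the module, assuming those
functions are used exactly as listed.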

The refactored upload program in Example 13-13, for instance, is now
noticeably simpler, and the code it shares with the download script
only needs to be changed in one place if it ever requires
improvement.

Example 13-13. PP4E\Internet\Ftp\Mirror\uploadflat_modular.py

#!/bin/env python
"""
##############################################################################
use FTP to upload all files from a local dir to a remote site/directory;
this version reuses downloader's functions, to avoid code redundancy;
##############################################################################
"""

import os
from downloadflat_modular import configTransfer, connectFtp, isTextKind

def cleanRemotes(cf, connection):
    """
    try to delete all remote files first to remove garbage
    """
    if cf.cleanall:
        for remotename in connection.nlst():         # remote dir listing
            try:                                     # remote file delete
                print('deleting remote', remotename) # skips . and .. exc
                connection.delete(remotename)
            except:
                print('cannot delete remote', remotename)

def uploadAll(cf, connection):
    """
    upload all files to remote site/dir per cf config
    listdir() strips dir path, any failure ends script
    """
    localfiles = os.listdir(cf.localdir)             # listdir is local listing
    for localname in localfiles:
        localpath = os.path.join(cf.localdir, localname)
        print('uploading', localpath, 'to', localname, 'as', end=' ')
        if isTextKind(localname):
            # use text mode xfer
            localfile = open(localpath, 'rb')
            connection.storlines('STOR ' + localname, localfile)
        else:
            # use binary mode xfer
            localfile = open(localpath, 'rb')
            connection.storbinary('STOR ' + localname, localfile)
        localfile.close()
    connection.quit()
    print('Done:', len(localfiles), 'files uploaded.')

if __name__ == '__main__':
    cf = configTransfer(site='learning-python.com', rdir='books', user='lutz')
    conn = connectFtp(cf)
    cleanRemotes(cf, conn)
    uploadAll(cf, conn)

Not only is the upload script simpler now because it reuses
common code, but it will also inherit any changes made in the download
module. For instance, the isTextKind function was later augmented with
code that adds the .pyw extension to mimetypes tables (this file type
is not recognized by default); because it is a shared function, the
change is automatically picked up in the upload program, too.
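
The effect of that registration can be checked in isolation. This
short sketch uses only the two mimetypes calls that appear in the
listing; spam.pyw is a made-up filename used for illustration.

```python
from mimetypes import add_type, guess_type

add_type('text/x-python-win', '.pyw')               # register the extension
mimetype, encoding = guess_type('spam.pyw', strict=False)
print(mimetype, encoding)                           # main type is now 'text'
```

Once the extension is registered, the main type is text and there is
no compression encoding, so a check like isTextKind would classify the
file as text.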

This script and the one it imports achieve the same goals as the
originals, but changing them for easier code maintenance is a big deal
in the real world of software development. The following, for example,
downloads the site from one server and uploads to another:

C:\...\PP4E\Internet\Ftp\Mirror> python downloadflat_modular.py test
Clean target directory first?
Password for lutz on home.rmi.net:
connecting...
downloading 2004-longmont-classes.html to test\2004-longmont-classes.html as text
...lines omitted...
downloading relo-feb010-index.html to test\relo-feb010-index.html as text
Done: 297 files downloaded.

C:\...\PP4E\Internet\Ftp\Mirror> python uploadflat_modular.py test
Clean target directory first?
Password for lutz on learning-python.com:
connecting...
uploading test\2004-longmont-classes.html to 2004-longmont-classes.html as text
...lines omitted...
uploading test\zopeoutline.htm to zopeoutline.htm as text
Done: 297 files uploaded.

Refactoring with classes

The function-based approach of the last two examples addresses the
redundancy issue, but the scripts are perhaps clumsier than they need
to be. For instance, their cf configuration options object provides a
namespace that replaces global variables and breaks cross-file
dependencies. Once we start making objects to model namespaces,
though, Python’s OOP support tends to be a more natural structure for
our code. As one last twist, Example 13-14 refactors the FTP code one
more time in order to leverage Python’s class feature.

Example 13-14. PP4E\Internet\Ftp\Mirror\ftptools.py

#!/bin/env python
"""
##############################################################################
use FTP to download or upload all files in a single directory from/to a
remote site and directory; this version has been refactored to use classes
and OOP for namespace and a natural structure; we could also structure this
as a download superclass, and an upload subclass which redefines the clean
and transfer methods, but then there is no easy way for another client to
invoke both an upload and download; for the uploadall variant and possibly
others, also make single file upload/download code in orig loops methods;
##############################################################################
"""

import os, sys, ftplib
from getpass import getpass
from mimetypes import guess_type, add_type

# defaults for all clients
dfltSite = 'home.rmi.net'
dfltRdir = '.'
dfltUser = 'lutz'

class FtpTools:

    # allow these 3 to be redefined
    def getlocaldir(self):
        return (len(sys.argv) > 1 and sys.argv[1]) or '.'

    def getcleanall(self):
        return input('Clean target dir first?')[:1] in ['y','Y']

    def getpassword(self):
        return getpass(
               'Password for %s on %s:' % (self.remoteuser, self.remotesite))

    def configTransfer(self, site=dfltSite, rdir=dfltRdir, user=dfltUser):
        """
        get upload or download parameters
        from module defaults, args, inputs, cmdline
        anonymous ftp: user='anonymous' pass=emailaddr
        """
        self.nonpassive = False             # passive FTP on by default in 2.1+
        self.remotesite = site              # transfer to/from this site
        self.remotedir  = rdir              # and this dir ('.' means acct root)
        self.remoteuser = user
        self.localdir   = self.getlocaldir()
        self.cleanall   = self.getcleanall()
        self.remotepass = self.getpassword()

    def isTextKind(self, remotename, trace=True):
        """
        use mimetypes to guess if filename means text or binary
        for 'f.html', guess is ('text/html', None): text
        for 'f.jpeg' guess is ('image/jpeg', None): binary
        for 'f.txt.gz' guess is ('text/plain', 'gzip'): binary
        for unknowns, guess may be (None, None): binary
        mimetypes can also guess name from type: see PyMailGUI
        """
        add_type('text/x-python-win', '.pyw')                      # not in tables
        mimetype, encoding = guess_type(remotename, strict=False)  # allow extras
        mimetype = mimetype or '?/?'                               # type unknown?
        maintype = mimetype.split('/')[0]                          # get 1st part
        if trace: print(maintype, encoding or '')
        return maintype == 'text' and encoding is None             # not compressed

    def connectFtp(self):
        print('connecting...')
        connection = ftplib.FTP(self.remotesite)            # connect to FTP site
        connection.login(self.remoteuser, self.remotepass)  # log in as user/pswd
        connection.cwd(self.remotedir)                      # cd to dir to xfer
        if self.nonpassive:                                 # force active mode
            connection.set_pasv(False)                      # most do passive
        self.connection = connection

    def cleanLocals(self):
        """
        try to delete all local files first to remove garbage
        """
        if self.cleanall:
            for localname in os.listdir(self.localdir):     # local dirlisting
                try:                                        # local file delete
                    print('deleting local', localname)
                    os.remove(os.path.join(self.localdir, localname))
                except:
                    print('cannot delete local', localname)

    def cleanRemotes(self):
        """
        try to delete all remote files first to remove garbage
        """
        if self.cleanall:
            for remotename in self.connection.nlst():       # remote dir listing
                try:                                        # remote file delete
                    print('deleting remote', remotename)
                    self.connection.delete(remotename)
                except:
                    print('cannot delete remote', remotename)

    def downloadOne(self, remotename, localpath):
        """
        download one file by FTP in text or binary mode
        local name need not be same as remote name
        """
        if self.isTextKind(remotename):
            localfile = open(localpath, 'w', encoding=self.connection.encoding)
            def callback(line): localfile.write(line + '\n')
            self.connection.retrlines('RETR ' + remotename, callback)
        else:
            localfile = open(localpath, 'wb')
            self.connection.retrbinary('RETR ' + remotename, localfile.write)
        localfile.close()

    def uploadOne(self, localname, localpath, remotename):
        """
        upload one file by FTP in text or binary mode
        remote name need not be same as local name
        """
        if self.isTextKind(localname):
            localfile = open(localpath, 'rb')
            self.connection.storlines('STOR ' + remotename, localfile)
        else:
            localfile = open(localpath, 'rb')
            self.connection.storbinary('STOR ' + remotename, localfile)
        localfile.close()

    def downloadDir(self):
        """
        download all files from remote site/dir per config
        ftp nlst() gives files list, dir() gives full details
        """
        remotefiles = self.connection.nlst()         # nlst is remote listing
        for remotename in remotefiles:
            if remotename in ('.', '..'): continue
            localpath = os.path.join(self.localdir, remotename)
            print('downloading', remotename, 'to', localpath, 'as', end=' ')
            self.downloadOne(remotename, localpath)
        print('Done:', len(remotefiles), 'files downloaded.')

    def uploadDir(self):
        """
        upload all files to remote site/dir per config
        listdir() strips dir path, any failure ends script
        """
        localfiles = os.listdir(self.localdir)       # listdir is local listing
        for localname in localfiles:
            localpath = os.path.join(self.localdir, localname)
            print('uploading', localpath, 'to', localname, 'as', end=' ')
            self.uploadOne(localname, localpath, localname)
        print('Done:', len(localfiles), 'files uploaded.')

    def run(self, cleanTarget=lambda:None, transferAct=lambda:None):
        """
        run a complete FTP session
        default clean and transfer are no-ops
        don't delete if can't connect to server
        """
        self.connectFtp()
        cleanTarget()
        transferAct()
        self.connection.quit()

if __name__ == '__main__':
    ftp = FtpTools()
    xfermode = 'download'
    if len(sys.argv) > 1:
        xfermode = sys.argv.pop(1)                   # get+del 2nd arg
    if xfermode == 'download':
        ftp.configTransfer()
        ftp.run(cleanTarget=ftp.cleanLocals, transferAct=ftp.downloadDir)
    elif xfermode == 'upload':
        ftp.configTransfer(site='learning-python.com', rdir='books', user='lutz')
        ftp.run(cleanTarget=ftp.cleanRemotes, transferAct=ftp.uploadDir)
    else:
        print('Usage: ftptools.py ["download" | "upload"] [localdir]')

In fact, this last mutation combines uploads and downloads into
a single file, because they are so closely related. As before, common
code is factored into methods to avoid redundancy. New here, the
instance object itself becomes a natural namespace for storing
configuration options (they become self attributes). Study this
example’s code for more details of the restructuring applied.
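
The way the run method accepts its clean and transfer steps can be
tried without a server. The sketch below is an assumption-laden
stand-in, not the book’s class: Session and its connect, clean, and
transfer methods are hypothetical names that only record what was
called. It illustrates that bound methods passed as arguments already
carry their instance, so run can invoke them with no arguments.

```python
class Session:
    """Hypothetical stand-in for FtpTools: records steps instead of using FTP."""
    def __init__(self):
        self.log = []

    def connect(self):
        self.log.append('connect')

    def clean(self):
        self.log.append('clean')

    def transfer(self):
        self.log.append('transfer')

    def run(self, cleanTarget=lambda: None, transferAct=lambda: None):
        self.connect()               # connect first, so a failure here skips the clean
        cleanTarget()                # bound methods already carry their instance
        transferAct()
        self.log.append('quit')

if __name__ == '__main__':
    s = Session()
    s.run(cleanTarget=s.clean, transferAct=s.transfer)
    print(s.log)
```

Calling run with no arguments falls back on the no-op defaults, which
in this sketch leaves only the connect and quit steps in the log.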

Again, this revision runs the same as our original site download
and upload scripts; see its self-test code at the end for usage
details, and pass in a command-line argument to specify “download” or
“upload.” We haven’t changed what it does; we’ve refactored it for
maintainability and reuse:

C:\...\PP4E\Internet\Ftp\Mirror> ftptools.py download test
Clean target dir first?
Password for lutz on home.rmi.net:
connecting...
downloading 2004-longmont-classes.html to test\2004-longmont-classes.html as text
...lines omitted...
downloading relo-feb010-index.html to test\relo-feb010-index.html as text
Done: 297 files downloaded.

C:\...\PP4E\Internet\Ftp\Mirror> ftptools.py upload test
Clean target dir first?
Password for lutz on learning-python.com:
connecting...
uploading test\2004-longmont-classes.html to 2004-longmont-classes.html as text
...lines omitted...
uploading test\zopeoutline.htm to zopeoutline.htm as text
Done: 297 files uploaded.

Although this file can still be run as a command-line script
like this, its class is really now a package of FTP tools that can be
mixed into other programs and reused. Because its code is wrapped in
a class, it can be easily customized by redefining its methods: its
configuration calls, such as getlocaldir, for example, may be
redefined in subclasses for custom scenarios.
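
As a rough illustration of that kind of customization, the sketch
below overrides two configuration hooks in a subclass. To stay
self-contained it does not import the book’s module: ConfigBase is a
hypothetical stand-in that mirrors two of the FtpTools hooks, and
BatchConfig and the directory name mirror are likewise invented for
the example.

```python
import sys

class ConfigBase:
    """Hypothetical stand-in that mirrors two of FtpTools' configuration hooks."""
    def getlocaldir(self):
        return (len(sys.argv) > 1 and sys.argv[1]) or '.'

    def getcleanall(self):
        return input('Clean target dir first?')[:1] in ['y', 'Y']

    def configTransfer(self):
        self.localdir = self.getlocaldir()     # hooks are looked up on self,
        self.cleanall = self.getcleanall()     # so subclass versions win

class BatchConfig(ConfigBase):
    """Non-interactive variant: fixed directory, never clean, no prompts."""
    def getlocaldir(self):
        return 'mirror'

    def getcleanall(self):
        return False

if __name__ == '__main__':
    cfg = BatchConfig()
    cfg.configTransfer()
    print(cfg.localdir, cfg.cleanall)
```

The inherited configTransfer is unchanged; only the hooks it calls
differ, which is why the subclass never prompts.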

Perhaps most importantly, using classes optimizes code
reusability. Clients of this file can both upload and download
directories by simply subclassing or embedding an instance of this
class and calling its methods. To see one example of how, let’s move
on to the next section.
