This client-server model of programming is very powerful and
adaptable. It is powerful because it makes giant, centralized servers
available to large numbers of remote, widely distributed users. It is
adaptable because we don't need to send software to everyone's computer to
make a change to the centralized service.
Essentially, every client-server application involves a client
application program, a server application, and a protocol for
communication between the two processes. In most cases, these protocols
are part of the popular and enduring suite of internetworking protocols
based on TCP/IP. For more information on TCP/IP, see Internetworking
with TCP/IP [Comer95].
The essence of TCP/IP is a multi-layered view of the world. This
view separates the mechanics of operating a simple Local Area Network
(LAN) from the interconnection between networks, called internetworking.
The lowest level of network services is provided by mechanisms
like Ethernet (see the IEEE 802.3 standards), which cover the wiring between
computers. The Ethernet standards include things like 10BaseT (for
twisted pairs of thin wires) and 10Base2 (for thicker coaxial cabling).
Network services may also be wireless, using the IEEE 802.11 standards.
In all cases, though, these network services provide for simple naming
of devices and moving bits from device to device. These services are
limited by having to know the hardware name of the receiving device,
usually called the MAC address. When you buy a new network card for your
computer, you change your computer's hardware name.
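As a small illustration of the hardware name, the following sketch prints the MAC address that Python can find for this computer. The helper name format_mac is hypothetical, introduced only for this example. Note that uuid.getnode() falls back to a random 48-bit number when no hardware address can be found, so the output is not guaranteed to match a real network card.

```python
import uuid


def format_mac(node: int) -> str:
    """Render a 48-bit hardware address as six colon-separated hex bytes."""
    raw = node.to_bytes(6, "big")
    return ":".join(f"{byte:02x}" for byte in raw)


# uuid.getnode() returns the hardware address as a 48-bit integer.
# If no address can be determined, it returns a random 48-bit number.
print(format_mac(uuid.getnode()))
```

Running this on two different machines on the same LAN should print two different values, which is what lets the low-level network deliver bits to the right device.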
The TCP/IP standards put several layers of control on top of these
data-passing mechanisms. While these additional layers allow
interconnection between networks, they also provide a standard library
for using all of the various kinds of network hardware that are
available. First, the Internet Protocol (IP) standard specifies
addresses that are independent of the underlying hardware. IP also
breaks messages into packets and reassembles the packets, so that it is
independent of any network limitations on transmission lengths. The IP
standard specifies how to handle errors. Additionally, the IP standard
specifies how to route packets among networks, allowing packets to pass
over bridges and routers between networks. Finally, IP provides a formal
Network Interface Layer to divorce IP and all higher level protocols
from the mechanics of the actual network.
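The routing idea can be sketched with the standard ipaddress module: because an IP address names a network as well as a device, a sender can decide whether a packet stays on the local network or must go to a router. The network 192.168.1.0/24 and the function name next_hop are assumptions made up for this example; real routing tables hold many entries and are managed by the operating system.

```python
import ipaddress

# Hypothetical local network, assumed only for this illustration.
LOCAL_NET = ipaddress.ip_network("192.168.1.0/24")


def next_hop(destination: str) -> str:
    """Decide, very roughly, where a packet for destination goes next."""
    address = ipaddress.ip_address(destination)
    if address in LOCAL_NET:
        return "deliver on local network"
    return "forward to router"


print(next_hop("192.168.1.20"))
print(next_hop("10.0.0.5"))
```

Nothing in this decision depends on MAC addresses or on the kind of cabling in use, which is the hardware independence described above.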
The Transmission Control Protocol (TCP) relies on IP. It
provides a reliable stream of bytes from one application process to
another. It does this by breaking the data into packets and using IP to
route those packets from source to receiver. It also uses IP to send
status information and to retransmit lost or corrupted packets. TCP keeps
complete control so that the bytes that are sent are received exactly
once and in the correct order.
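The "exactly once and in the correct order" guarantee can be pictured with a toy model. The sketch below is not how TCP is implemented (there are no acknowledgements, windows, or checksums here); it only assumes that each chunk carries its byte offset as a sequence number, which is enough to discard duplicates and restore order. The function names to_segments and reassemble are hypothetical.

```python
import random


def to_segments(data: bytes, size: int):
    """Split data into (sequence_number, chunk) pairs; the number is the byte offset."""
    return [(offset, data[offset:offset + size])
            for offset in range(0, len(data), size)]


def reassemble(segments):
    """Rebuild the byte stream: keep one copy per offset, join in offset order."""
    unique = {}
    for seq, chunk in segments:
        unique.setdefault(seq, chunk)  # a repeated offset is a duplicate; ignore it
    return b"".join(unique[seq] for seq in sorted(unique))


message = b"bytes arrive exactly once and in order"
segments = to_segments(message, 8)
arrived = segments + segments[:2]  # simulate two retransmitted duplicates
random.shuffle(arrived)            # simulate out-of-order delivery
print(reassemble(arrived) == message)
```

The application on the receiving side sees only the reassembled bytes; the shuffling and duplication stay hidden below it.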
Many applications, in turn, depend on the TCP/IP protocol
capabilities. The Hypertext Transfer Protocol (HTTP), used to view a
web page, works by creating a TCP/IP connection (called a socket)
between browser and web server. A request
is sent from browser to web server. The web server responds to the
browser request. When the web page content is complete, the socket is
closed and the socket connection can be discarded.
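The request and response cycle can be sketched end to end on one machine. This example assumes Python 3, where the standard http.server module supplies a small web server; the class HelloHandler, the function fetch, and the reply text are hypothetical names chosen for the illustration. The client side uses a bare socket so that each step (connect, send the request, read the response, close) is visible.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a short text page."""

    def do_GET(self):
        body = b"hello from the server\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep the example quiet


def fetch(host: str, port: int, path: str = "/") -> bytes:
    """Open a socket, send one HTTP request, and read until the server closes."""
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # an empty read means the server closed the socket
                break
            chunks.append(data)
    return b"".join(chunks)


# Port 0 asks the operating system for any free port on the loopback address.
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
response = fetch("127.0.0.1", server.server_port)
server.shutdown()
server.server_close()
print(response.decode("ascii"))
```

The response contains a status line, some headers, a blank line, and then the page content. Once the content has been sent, the server closes its end and the client's socket is discarded.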
Python provides a number of complete client protocols that are
built on TCP/IP in the following modules: urllib, httplib, ftplib,
gopherlib, poplib, imaplib, nntplib, smtplib and telnetlib. Each
of these exploits one or more protocols in the TCP/IP family, including
HTTP, FTP, GOPHER, POP, IMAP, NNTP, SMTP and Telnet. The urllib and
urllib2 modules make use of multiple protocols, including HTTP and FTP,
which are commonly provided by web servers.
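One way to see how a single module can serve several protocols is to look at the URL itself: the scheme at the front names the protocol to use. The sketch below assumes Python 3, where the urllib and urllib2 modules were reorganized into the urllib package (urllib.request, urllib.parse) and httplib became http.client. The function name describe and the example.com URLs are hypothetical.

```python
from urllib.parse import urlsplit


def describe(url: str):
    """Return the protocol scheme, host name, and path named by a URL."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.path


for url in ("http://www.example.com/index.html",
            "ftp://ftp.example.com/pub/readme.txt"):
    print(describe(url))
```

A function such as urllib.request.urlopen makes a similar split internally and then hands the request to the handler for that scheme.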
We'll start with the high-level protocols: HTTP and how it
serves web pages for people. We'll also look at using it to create a web
service. Then we'll look at lower-level protocols like FTP.
After that, we'll look at how Python deals with the low-level socket
abstraction for network communications. Finally, we'll look at some
higher-level modules that depend on sockets implicitly.