Socket-level programming isn't our first choice for solving
client-server problems. Sockets are nicely supported by Python, however,
giving us a way to create a new protocol when the vast collection of
existing internetworking protocols is inadequate.
Client-server applications include a client-side program, a
server, a connection and a protocol for communication between the two
processes. One of the most popular and enduring suites of client-server
protocols is based on the Internetworking protocol: TCP/IP. For more
information on TCP/IP, see Internetworking with TCP/IP [Comer95].
All of the TCP/IP protocols are based on the basic
socket. A socket is a handy metaphor for the way
that the Transmission Control Protocol (TCP) reliably moves a stream of
bytes between two processes.
The socket module includes a number of
functions to create and connect sockets. Once connected, a socket
behaves essentially like a file: it can be read from and written to.
When we are finished with a socket, we can close it, releasing the
network resources that were tied up by our processing.
When a client application communicates with a server, the client
does three things: it establishes the connection, it sends the request
and it reads the reply from the server. For some client-server
relationships, like a database server, there may be multiple requests
and replies. For other client-server requests, for example, the HTTP
protocol, a single request may involve a number of replies.
To establish a connection, the client needs two basic facts
about the server: the IP address and a port number. The IP address
identifies the specific computer (or host) that will handle the
request. The port number identifies the application program that will
process the request on that host. A typical host will respond to
requests on numerous ports. The port numbers prevent requests from
being sent to the wrong application program. Port numbers are defined
by several standards. Examples include FTP (port 21) and HTTP (port
80).
A client program makes requests to a server by using the
following outline of processing.
- Develop the server's address. Fundamentally, a server's address is a
  32-bit IPv4 host number and a 16-bit port number. Since these are
  difficult to manage, a variety of coding schemes are used. In Python,
  an address is a 2-tuple with a string and a number. The string
  represents the IP address in dotted notation ("194.109.137.226") or
  as a domain name ("www.python.org"); the number is the port number
  from 0 to 65535.
- Create a socket and connect it to this address. This is a series of
  function calls to the socket module. When this is complete, the
  socket is connected to the remote IP address and port and the server
  has accepted the connection.
- Send the request. Many of the standard TCP/IP protocols expect the
  commands to be sent as strings of text, terminated with the \n
  character. Often a Python file object is created from the socket so
  that the complete set of file method functions for reading and
  writing is available.
- Read the reply. Many of the standard protocols will respond with a
  3-digit numeric code indicating the status of the request. We'll
  review some common variations on these codes, below.
Developing an Address. An IP address is numeric. However, the Internet provides
domain names, via the Domain Name System (DNS).
This permits useful text names to be associated with numeric IP
addresses. We're more used to names like "www.python.org".
DNS resolves this to an IP address. The socket module provides
functions for DNS name resolution.
The most common operation in developing an address is decoding a
host name to create the numeric IP address. The socket module
provides several functions for working with host names and IP addresses.
- gethostname() → string
  Returns the current host name.
- gethostbyname(host) → address
  Returns the IP address (a string of the form '255.255.255.255') for
  a host.
- gethostbyaddr(address) → (name, aliaslist, addresslist)
  Return the true host name, a list of aliases, and a list of IP
  addresses, for a host. The host argument is a string giving a host
  name or IP number.
- getservbyname(servicename, protocolname) → integer
  Return a port number from a service name and protocol name. The
  protocol name should be 'tcp' or 'udp'.
Typically, the socket.gethostbyname function is used to develop the
IP address of a specific server name. It does this by making a DNS
inquiry to transform the host name into an IP address.
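As a minimal sketch of developing an address, the following turns a host name and port number into the 2-tuple that the socket functions expect. It uses Python 3 syntax (the listings in this section use Python 2), and make_address is a hypothetical helper name, not part of the socket module. The loopback address is used so that no real DNS lookup is needed.

```python
import socket

def make_address(host, port):
    """Return an (ip_string, port) tuple usable as a socket address.

    make_address is a hypothetical helper, not part of the socket module.
    """
    if not 0 <= port <= 65535:
        raise ValueError("port must be in the range 0 to 65535")
    # gethostbyname resolves a host name through DNS; a dotted IPv4
    # string is returned unchanged.
    ip = socket.gethostbyname(host)
    return (ip, port)

print(make_address("127.0.0.1", 7000))
```

A call such as make_address("www.python.org", 80) would perform a real DNS inquiry, so its result depends on the network.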
Port Numbers. The port number is usually defined by your application. For
instance, the FTP application uses port number 21. Port numbers from
0 to 1023 are assigned by the RFC 1700 standard and are called the
well known ports. Port numbers from 1024 to
49151 are available to be registered for use by specific
applications. The Internet Assigned Numbers Authority (IANA) tracks
these assigned port numbers. See https://www.iana.org/assignments/port-numbers.
You can use the private port numbers, from 49152 to 65535, without
fear of running into any conflicts. Port numbers above 1024 may
conflict with installed software on your host, but are generally
safe.
Port numbers below 1024 are restricted so that only privileged
programs can use them. This means that you must have root or
administrator access to run a program which provides services on one
of these ports. Consequently, many application programs which are not
run by root, but run by ordinary users, will use port numbers starting
with 1024.
It is very common to use ports from 8000 and above for services
that don't require root or administrator privileges to run.
Technically, port 8000 has a defined use, and that use has nothing to
do with HTTP. Port 8008 and 8080 are the official alternatives to port
80, used for developing web applications. However, port 8000 is often
used for web applications.
The usual approach is to have a standard port number for your
application, but allow users to override this in the event of
conflicts. This can be a command-line parameter or it can be in a
configuration file.
Generally, a client program must accept an IP address as a
command-line parameter. A network is a dynamic thing: computers are
brought online and offline constantly. A "hard-wired" IP address is an
inexcusable mistake.
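One way to follow this advice is sketched below using the standard argparse module, in Python 3 syntax. The option names, the DEFAULT_PORT constant and the parse_args helper are assumptions made for illustration; the default of 7000 simply matches the port used by the examples in this section.

```python
import argparse

# Hypothetical standard port for our application.
DEFAULT_PORT = 7000

def parse_args(argv=None):
    """Parse the server host (required) and port (optional override)."""
    parser = argparse.ArgumentParser(description="Example client")
    parser.add_argument("host", help="server host name or IP address")
    parser.add_argument("-p", "--port", type=int, default=DEFAULT_PORT,
                        help="server port (default: %(default)s)")
    return parser.parse_args(argv)

# Simulate the command line "client --port 8001 localhost".
options = parse_args(["--port", "8001", "localhost"])
address = (options.host, options.port)
print(address)
```

Passing an explicit list to parse_args keeps the sketch testable; a real program would call parse_args() with no argument so that sys.argv is used.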
Create and Connect a Socket. A socket is one end of a network connection. Data passes
bidirectionally through a socket between client and server. The
socket module defines the SocketType, which is the class for all
sockets. The socket function creates a socket object.
- socket(family, type, [protocol]) → SocketType
  Open a socket of the given type. The family argument specifies the
  address family; it is normally socket.AF_INET. The type argument
  specifies whether this is a TCP/IP stream (socket.SOCK_STREAM) or
  UDP/IP datagram (socket.SOCK_DGRAM) socket. The protocol argument is
  not used for standard TCP/IP or UDP/IP.
A SocketType object has a number of
method functions. Some of these are relevant for server-side
processing and some for client-side processing. The client side method
functions for establishing a connection include the following. In each
definition, the variable s is a socket object.
- s.connect(address)
  Connect the socket to a remote address; the address is usually a
  (host address, port #) tuple. In the event of a problem, this will
  raise an exception.
- s.connect_ex(address) → integer
  Connect the socket to a remote address; the address is usually a
  (host address, port #) tuple. This will return an error code instead
  of raising an exception. A value of 0 means success.
- s.fileno() → integer
  Return underlying file descriptor, usable by the select module or
  the os.read and os.write functions.
- s.getpeername() → address
  Return the remote address bound to this socket; not supported on all
  platforms.
- s.getsockname() → address
  Return the local address bound to this socket.
- s.getsockopt(level, opt, [buflen]) → string
  Get socket options. See the UNIX man pages for more information. The
  level is usually SOL_SOCKET. The option names all begin with SO_ and
  are defined in the module. You will have to use the struct module to
  decode results.
- s.setblocking(flag)
  Set or clear the blocking I/O flag.
- s.setsockopt(level, opt, value)
  Set socket options. See the UNIX man pages for more information. The
  level is usually SOL_SOCKET. The option names all begin with SO_ and
  are defined in the module. You will have to use the struct module to
  encode parameters.
- s.shutdown(how)
  Shut down traffic on this socket. If how is 0, receives are
  disallowed; if how is 1, sends are disallowed. Usually this is 2 to
  disallow both reads and writes. Generally, this should be done
  before the close.
- s.close()
  Close the socket. It's usually best to use the shutdown method
  before closing the socket.
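The connection methods above can be tried out with a short sketch, in Python 3 syntax. To keep it self-contained, it creates a throwaway listening socket on the loopback interface (an assumption of this example; the server-side calls it uses are covered later in this section) and then connects a client socket to it.

```python
import socket

# A throwaway listener so the example needs no external server.
# Port 0 asks the operating system to pick any free port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
address = listener.getsockname()      # the (host, port) actually in use

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
result = client.connect_ex(address)   # 0 means success
peer = client.getpeername()           # remote end: the listener's address
local = client.getsockname()          # local end: chosen by the system

# SHUT_RDWR is the named constant for disallowing both reads and writes.
client.shutdown(socket.SHUT_RDWR)
client.close()
listener.close()
print(result, peer == address)
```

Using connect_ex rather than connect lets the program test the return code instead of catching an exception.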
Sending the Request and Receiving the Reply. Sending requests and processing replies is done by writing to
the socket and reading data from the socket. Often, the response
processing is done by reading the file object
that is created by a socket's makefile method.
Since the value returned by makefile is a
conventional file, the readlines and writelines
methods can be used on this file object.
A SocketType object has a number of
method functions. Some of these are relevant for server-side
processing and some for client-side processing. The client side method
functions for sending (and receiving) data include the following. In
each definition, the variable s is a socket object.
- s.recv(bufsize, [flags]) → string
  Receive data, limited by bufsize. flags are MSG_OOB (read
  out-of-band data) or MSG_PEEK (examine the data without consuming
  it; a subsequent recv will read the data again).
- s.recvfrom(bufsize, [flags]) → (string, address)
  Receive data and sender's address; arguments are the same as recv.
- s.send(string, [flags]) → integer
  Send data to a connected socket. The MSG_OOB flag is supported for
  sending out-of-band data. Return value is the number of bytes
  actually sent.
- s.sendto(string, [flags,] address) → integer
  Send data to a given address, using an unconnected socket. The flags
  option is the same as send. Return value is the number of bytes
  actually sent.
- s.makefile(mode, [bufsize]) → file
  Return a file object corresponding to this socket. The mode and
  bufsize options are the same as used in the built in file function.
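The following sketch, in Python 3 syntax, exercises send, recv and makefile. It relies on socket.socketpair, which returns two sockets that are already connected to each other; this stands in for a real client-server connection and is an assumption of the example, not something the text above requires. Note that Python 3 sockets carry bytes rather than text strings.

```python
import socket

# Two already-connected sockets standing in for client and server.
client, server = socket.socketpair()

sent = client.send(b"Greetings\n")    # returns the number of bytes sent
request = server.recv(1024)           # read at most 1024 bytes

# Wrap each end in a file object for line-oriented reading and writing.
wfile = server.makefile("wb")
wfile.write(b"Heard: " + request)
wfile.flush()                         # buffered writes need a flush
rfile = client.makefile("rb")
reply = rfile.readline()

rfile.close()
wfile.close()
client.close()
server.close()
print(sent, reply)
```

For a short message like this one, a single send and a single recv are enough. A longer message may need repeated calls, because send may transmit only part of the data and recv may return only part of what was sent.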
Example. The following examples show a simple client application using
the socket module.
This is the Client class definition.
    #!/usr/bin/env python
    import socket

    class Client( object ):
        rbufsize= -1   # default buffering for reads
        wbufsize= 0    # unbuffered writes
        def __init__( self, address=('localhost',7000) ):
            self.server=socket.socket( socket.AF_INET, socket.SOCK_STREAM )
            self.server.connect( address )
            # file objects for reading from and writing to the socket
            self.rfile = self.server.makefile('rb', self.rbufsize)
            self.wfile = self.server.makefile('wb', self.wbufsize)
        def makeRequest( self, text ):
            """send a message and get a 1-line reply"""
            self.wfile.write( text + '\n' )
            data= self.rfile.read()
            self.server.close()
            return data

    print "Connecting to Echo Server"
    c= Client()
    response= c.makeRequest( "Greetings" )
    print repr(response)
    print "Finished"
A Client object is initialized with a
specific server address. The host ("localhost") and
port number (7000) are default values in the class
__init__ function. The address of "localhost"
is handy for testing a client and a
server on your PC. First the socket is created, then it is connected to an
address. If no exceptions are raised, then an input and output file
are created to use this socket.
The makeRequest function sends a message and
then reads the reply.
When a server program starts, it creates a socket on which it
listens for requests. The server has a three-step response to a
client. First, it accepts the connection, then it reads and processes
the client's request. Finally, it sends a reply to the client. For
some client-server relationships, like a database server, there may be
multiple requests and replies. Since database requests may take a long
time to process, the server must be multi-threaded in order to handle
concurrent requests. In the case of HTTP, a single request will lead
to multiple replies.
A server program handles requests from a client by using the
following outline of processing.
- Create a Listener Socket. A listener socket is waiting for client
  connection requests.
- Accept a Client Connection. When a client attempts a connection, the
  socket's accept method will return a "daughter" socket connected to
  the client. This daughter socket is used for all subsequent
  processing.
- Read the request. Many of the standard TCP/IP protocols expect the
  commands to be sent as strings of text, terminated with the \n
  character. Often a Python file object is created from the socket so
  that the complete set of file method functions for reading and
  writing is available.
- Send the reply. Many of the standard protocols will respond with a
  3-digit numeric code indicating the status of the request. We'll
  review some common variations on these codes, below.
Create and Listen on a Socket. The following methods are relevant when creating server-side
sockets. These server side method functions are used for
establishing the public socket that is waiting for client
connections. In each definition, the variable s is a socket object.
- s.bind(address)
  Bind the socket to a local address tuple of (IP address, port
  number). This tuple is the address and port that will be used by
  clients to connect with this server. Generally, the first part of
  the tuple is simply "" to indicate that this server uses the address
  of the computer on which it is running.
- s.listen(queueSize)
  Start listening for incoming connections; queueSize specifies the
  number of queued connections.
- s.accept() → (socket, address)
  Accept a client connection, returning a socket connected to the
  client and client address.
Once the socket connection has been accepted, processing is a
simple matter of reading and writing on the daughter socket.
We won't show an example of writing a server program using
simple sockets. The best way to make use of server-side sockets is to
use the SocketServer module.
Practical Server Programs with SocketServer
Generally, we use the SocketServer module
for simple socket processing. Usually, we create a
TCPServer using this module. This can simplify
the processing of requests and replies. The SocketServer module, for
example, is the basis for the SimpleHTTPServer
(see the section called “Web Servers and the HTTP protocol”) and
SimpleXMLRPCServer
(see the section called “Web Services: The xmlrpclib
Module”) modules.
Much of server-side processing is encapsulated in two classes of
the SocketServer module. You will subclass the
StreamRequestHandler class to process TCP/IP
requests. This subclass will include the methods that do the essential
work of the program.
You will then create an instance of the
TCPServer class and give it your
RequestHandler subclass. The instance of
TCPServer will manage the public socket, and
all of the basic processing. For each connection, it will create an
instance of your subclass of
StreamRequestHandler to handle the connection.
Define a RequestHandler. Defining a handler is done by creating a subclass of
StreamRequestHandler or BaseRequestHandler and adding a
handle method function. The BaseRequestHandler defines a simple framework
that TCPServer can use when data is received
on a socket.
Generally, we use a subclass of
StreamRequestHandler. This class has methods
that create files from the socket. This allows the
handle method function to simply read and write
files. Specifically, the superclass will assure that the variables
self.rfile and self.wfile are available.
For example, the echo service runs on port 7. The echo service
simply reads the data provided in the socket, and echoes it back to
the sender. Many Linux boxes have this service enabled by default. We
can build the basic echo handler by creating a subclass of
StreamRequestHandler.
    #!/usr/bin/env python
    """My Echo"""
    import SocketServer

    class EchoHandler( SocketServer.StreamRequestHandler ):
        def handle(self):
            # self.request is the socket connected to the client
            input= self.request.recv(1024)
            print "Input: %r" % ( input, )
            self.request.send("Heard: %r\n" % ( input, ) )

    server= SocketServer.TCPServer( ("",7000), EchoHandler )
    print "Starting Server"
    server.serve_forever()
This class can be used by a TCPServer
instance to handle requests. In this, the
TCPServer instance named server
creates an instance of EchoHandler
each time a connection is made on
port 7000. The derived socket is given to the handler instance, as the
instance variable self.request.
A more sophisticated handler might decode input commands and
perform unique processing for each command. For example, if we were
building an on-line Roulette server, there might be three basic
commands: a place bet command, a show status command and a spin the
wheel command. There might be additional commands to join a table,
chat with other players, perform credit checks, etc.
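A command-decoding handler of this kind might be sketched as follows. This uses Python 3, where the module is named socketserver, and runs the server in a background thread on a system-chosen port so the example can act as its own client. The command names (BET, STATUS, SPIN), the reply texts, and the do_ method naming scheme are all assumptions made for illustration.

```python
import socket
import socketserver   # named SocketServer in Python 2
import threading

class CommandHandler(socketserver.StreamRequestHandler):
    """Decode one-line commands; BET, STATUS and SPIN are hypothetical."""

    def handle(self):
        for raw in self.rfile:                  # one request per line
            words = raw.decode("ascii", "replace").split()
            if not words:
                continue
            # Look up a method named after the command word.
            method = getattr(self, "do_" + words[0].lower(), None)
            if method is None:
                reply = "500 unknown command"
            else:
                reply = method(words[1:])
            self.wfile.write((reply + "\n").encode("ascii", "replace"))

    def do_bet(self, args):
        return "200 bet placed: " + " ".join(args)

    def do_status(self, args):
        return "200 no spins yet"

    def do_spin(self, args):
        return "200 wheel spun"

def ask(address, text):
    """Send one command line and return the one-line reply."""
    with socket.create_connection(address) as conn:
        conn.sendall((text + "\n").encode("ascii"))
        with conn.makefile("r", encoding="ascii") as rfile:
            return rfile.readline().rstrip("\n")

# Port 0 lets the operating system choose a free port.
server = socketserver.TCPServer(("127.0.0.1", 0), CommandHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
bet_reply = ask(server.server_address, "BET red 10")
bad_reply = ask(server.server_address, "DANCE")
server.shutdown()
server.server_close()
print(bet_reply)
print(bad_reply)
```

Dispatching on a method name keeps each command's processing in its own method; adding a command means adding one more do_ method.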
Methods of TCPServer. In order to process requests, there are two methods of a
TCPServer that are of interest. In the
following examples the TCPServer instance is
the variable s.
- s.handle_request()
  Handle a single request: wait for input, create the handler object
  to process the request.
- s.serve_forever()
  Handle requests in an infinite loop. Runs until the loop is broken
  with an exception.
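To illustrate the difference, here is a sketch in Python 3 syntax (where the module is named socketserver) that serves exactly one connection with handle_request. The OneLineEcho handler and the use of a background thread are assumptions of this example; the thread is only there so that a single script can play both server and client.

```python
import socket
import socketserver   # named SocketServer in Python 2
import threading

class OneLineEcho(socketserver.StreamRequestHandler):
    """Hypothetical handler: read one line and echo it back."""
    def handle(self):
        line = self.rfile.readline()
        self.wfile.write(b"Heard: " + line)

# Port 0 lets the operating system choose a free port.
server = socketserver.TCPServer(("127.0.0.1", 0), OneLineEcho)

# handle_request blocks until one connection has been served, so it
# runs in a background thread while this thread acts as the client.
worker = threading.Thread(target=server.handle_request)
worker.start()

with socket.create_connection(server.server_address) as conn:
    conn.sendall(b"Greetings\n")
    with conn.makefile("rb") as rfile:
        reply = rfile.readline()

worker.join()          # returns once the single request is finished
server.server_close()  # release the listening socket
print(reply)
```

A server meant to run indefinitely would call serve_forever instead, as the echo server above does.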
Generally, basic web services do almost everything we need; and
they do this kind of thing in a simple and standard way. Using sockets
is done either to invent something new or to cope with something very
old. Generally, using web services is a better choice than inventing
your own protocol.
If you can't, for some reason, make suitable use of web
services, here are some lessons gleaned from reading the
Internetworking Requests for Comments (RFCs).
Many protocols involve a request-reply conversational style. The
client connects to the server and makes requests. The server replies
to each request. Some protocols (for example, FTP) may involve a long
conversation. Other protocols (for example, HTTP) involve a single
request and (sometimes) a single reply. Many web sites leverage HTTP's
ability to send multiple replies, but some web sites send a single,
tidy response.
Many of the Internet standard requests are short 1- to
4-character commands. The syntax is kept intentionally very simple,
using spaces for delimiters. Complex syntax with optional clauses and
sophisticated punctuation is often an aid for people. In most web
protocols, a sequence of simple commands is used instead of a single,
complex statement.
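Because the syntax is this simple, decoding a request takes very little code. The sketch below (Python 3 syntax) splits a request line into a command word and its arguments. The parse_command name and the sample commands are assumptions made for illustration.

```python
def parse_command(line):
    """Split a request line into an upper-case verb and its arguments.

    parse_command is a hypothetical helper for illustration.
    """
    words = line.strip().split()      # spaces are the only delimiters
    if not words:
        raise ValueError("empty command")
    return words[0].upper(), words[1:]

print(parse_command("RETR notes.txt\r\n"))
print(parse_command("quit\n"))
```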
The responses are often 3-digit numbers plus explanatory
comments. The application depends on the 3-digit number. The
explanatory comments can be written to a log or displayed for a human
user. The status numbers are often coded as follows:
- 1yz
  Preliminary reply, more replies will follow.
- 2yz
  Completed.
- 3yz
  More information required. This is typically the start of a dialog.
- 4yz
  Request not completed; trying again makes sense. This is a transient
  problem like a deadlock, timeout, or file system problem.
- 5yz
  Request not completed because it's in error; trying again doesn't
  make sense. This is a syntax problem or other error with the request.
The middle digit within the response provides some additional
information.
- x0z
  The response message is syntax-related.
- x1z
  The response message is informational.
- x2z
  The response message is about the connection.
- x3z
  The response message is about accounting or authentication.
- x5z
  The response message is file-system related.
These codes allow a program to specify multi-part replies using
1yz codes. The status of a client-server dialog
is managed with 3yz codes that request additional
information. 4yz codes are problems that might
get fixed. 5yz codes are problems that can never
be fixed (the request doesn't make sense, has illegal options,
etc.)
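A client can act on these conventions mechanically. The following sketch (Python 3 syntax) maps the first and middle digits of a status code to the meanings listed above. The function names and the short label strings are assumptions made for illustration.

```python
# First digit: overall outcome of the request.
SEVERITY = {
    "1": "preliminary",                 # more replies will follow
    "2": "completed",
    "3": "more information required",
    "4": "transient failure",           # trying again makes sense
    "5": "permanent failure",           # trying again doesn't make sense
}
# Middle digit: what the response is about.
TOPIC = {
    "0": "syntax",
    "1": "information",
    "2": "connection",
    "3": "accounting or authentication",
    "5": "file system",
}

def classify(code):
    """Return (severity, topic) for a 3-digit status code string."""
    if len(code) != 3 or not code.isdigit():
        raise ValueError("expected a 3-digit code, got %r" % (code,))
    return SEVERITY.get(code[0], "unknown"), TOPIC.get(code[1], "unspecified")

def should_retry(code):
    """Only 4yz replies describe problems that might get fixed."""
    return classify(code)[0] == "transient failure"

print(classify("331"))
print(should_retry("450"), should_retry("550"))
```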
Note that protocols like FTP (RFC 959) provide a useful
convention for handling multi-line replies: the first line has a
hyphen (-) after the status number to indicate that additional
lines follow; each subsequent line is indented. The final line
repeats the status number. This rule allows us to detect the first of
many lines, and absorb all lines until the matching status number is
read.
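This rule can be sketched as a small reader function, shown here in Python 3 syntax. The read_reply name is hypothetical, and the sample reply text is made up for illustration; an io.StringIO object stands in for the file created by a socket's makefile method.

```python
import io

def read_reply(rfile):
    """Read one reply from a text file object; return (code, lines)."""
    first = rfile.readline()
    if not first:
        raise EOFError("connection closed before a reply arrived")
    first = first.rstrip("\r\n")
    code, lines = first[:3], [first[4:]]
    if first[3:4] == "-":                  # hyphen: more lines follow
        while True:
            line = rfile.readline()
            if not line:
                raise EOFError("connection closed inside a reply")
            line = line.rstrip("\r\n")
            # The final line repeats the code, followed by a space.
            if line[:3] == code and line[3:4] == " ":
                lines.append(line[4:])
                break
            lines.append(line.strip())
    return code, lines

sample = io.StringIO(
    "123-First line\r\n"
    " Second line\r\n"
    "  234 A line beginning with numbers\r\n"
    "123 The last line\r\n"
    "200 OK\r\n"
)
print(read_reply(sample))
print(read_reply(sample))
```

The indentation of the continuation lines is what keeps a line such as "234 A line beginning with numbers" from being mistaken for the start of a new reply.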