Cookies are easier to understand when you see a real program
use them. So we will examine a CGI program that displays
two forms, and that stores the information they return by calling the
cookie server. Here is the first form:
The ACTION attribute specifies the next
form in the series as a query string. The filename is relative to
the document root directory.
The string "-*Cookie*-" will be replaced by a random cookie
identifier when this form is parsed by the CGI program. This cookie
is used to uniquely identify the form information.
Here is the second form in the series. It should be stored
in a file named location.html because that
name was specified in the ACTION attribute of
the first form.
Since this is the last form in the series, no query information
is passed to the program.
We will do something unusual in this example by not looking
at the program that handles these forms right away. Instead,
we will examine the cookie server--the continuously running program
that maintains state for CGI programs. Then, we will return to the
program that parses the forms--the cookie client--and see how it interacts
with the server.
Here I will show a general purpose server for CGI programs
running on the local system. Each CGI program is a cookie client.
When it connects, this server enters a long loop accepting commands,
as we will see in a moment. Please note that this is not a CGI script.
Instead, it provides a data storage service for CGI scripts.
#!/usr/local/bin/perl
require "sockets.pl";
srand (time|$$);
The srand
function sets the random number seed. A bitwise OR of the current
time and the process identification number (PID) gives a seed that
differs from one run of the server to the next.
$HTTP_server = "128.197.27.7";
This variable holds the IP address of the HTTP server from
which the CGI scripts will connect to this server.
It is used to prevent CGI programs running on other HTTP
servers on the Web from communicating with this server.
$separator = "\034";
$expire_time = 15 * 60;
The expire_time variable sets the time
(in seconds) for which a cookie is valid. In this case, a cookie
is valid for 15 minutes.
%DATA = ();
$max_cookies = 10;
$no_cookies = 0;
The DATA associative array is
used to hold
the form information. The max_cookies variable
sets the limit for the number of cookies that can be active at one
time. And the no_cookies variable is a counter
that keeps track of the number of active cookies.
$error = 500;
$success = 200;
These two variables hold the status codes for error and success,
respectively.
$port = 5000;
&listen_to_port (SOCKET, $port) || die "Cannot create socket.", "\n";
The listen_to_port
function is part of the socket library. It "listens" on the specified
port for possible connections. In this case, port number 5000 is
used. However, if you do not know what port to set the server on,
you can ask the socket library to do it for you:
( ($port) = &listen_to_port (SOCKET) ) || die "Cannot create socket.", "\n";
print "The Cookie Server is running on port number: $port", "\n";
If the listen_to_port function is called
in this manner (with one argument), an unused port is selected. You
will then have to modify the cookie client (see the next section)
to reflect the correct port number. Or, you can ask your system
administrator to create an entry in the /etc/services
file for the cookie server, after which the client can simply use
the name "cookie" to refer to the server.
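Once such an entry exists, a program can look up the port by name with Perl's built-in getservbyname function. The following is a minimal sketch, not part of the socket library: the lookup_port subroutine is a hypothetical helper, and falling back to port 5000 when no entry is found is an assumption made for this example.

```perl
use strict;
use warnings;

# Look up the port for a named TCP service, falling back to a default
# when the services database has no entry for it.
sub lookup_port {
    my ($service, $default) = @_;
    my @entry = getservbyname($service, 'tcp');
    # In list context the port number is the third element returned.
    return @entry ? $entry[2] : $default;
}

my $port = lookup_port('cookie', 5000);
print "Cookie server port: $port\n";
```

Both the server and the client could call the same helper, so that the port number lives in one place.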
while (1) {
( ($ip_name, $ip_address) = &accept_connection (COOKIE, SOCKET) )
|| die "Could not accept connection.", "\n";
This starts an infinite loop that continually accepts connections.
When a connection is established, a new socket handle, COOKIE,
is created to deal with it, while the original file handle, SOCKET,
goes back to accept more connections. The accept_connection
subroutine returns the IP name and address of the remote host. In
our case, this will always point to the address of the HTTP
server, because the CGI program (or the client) is being executed
from that server.
This cookie server, as implemented, can only "talk" to one
connection at a time. All other connections are queued up, and handled
in the order in which they are received. (Later on, we'll discuss
how to implement a server that can handle multiple connections simultaneously.)
select (COOKIE);
$cookie = undef;
The default output file handle is set to COOKIE.
The cookie variable is used to hold the current
cookie identifier.
if ($ip_address ne $HTTP_server) {
&print_status ($error, "You are not allowed to connect to server.");
If the IP address of the remote host does not match the address
of the HTTP server, the connection is coming
from a host somewhere else. We do not want servers running on other
hosts connecting to this server and storing information, which could
result in a massive system overload! However, you can set this up
so that all machines within your domain can access this server to
store information.
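One way to admit a group of machines rather than a single HTTP server is to compare the connecting address against a list of permitted address prefixes. This is only a sketch under stated assumptions: the is_allowed subroutine and the @allowed_prefixes list are hypothetical names, and the prefix shown is simply the leading part of the example address used earlier.

```perl
use strict;
use warnings;

# Hypothetical list of address prefixes that may connect.
my @allowed_prefixes = ('128.197.');

# Return 1 if the address begins with any allowed prefix, 0 otherwise.
sub is_allowed {
    my ($address) = @_;
    foreach my $prefix (@allowed_prefixes) {
        return 1 if index($address, $prefix) == 0;
    }
    return 0;
}

print is_allowed('128.197.27.7') ? "allowed\n" : "refused\n";
print is_allowed('10.0.0.1')     ? "allowed\n" : "refused\n";
```

Keeping the trailing dot in each prefix prevents an address such as 128.1970.1.1 from matching by accident.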
} else {
&print_status ($success, "Welcome from $ip_name ($ip_address)");
A welcome message is displayed if the connection is coming
from the right place (our HTTP server). The print_status
subroutine simply outputs the status number and the message to the
currently selected output handle, which is now the COOKIE socket.
while (<COOKIE>) {
s/[\000-\037]//g;
s/^\s*(.*)\b\s*/$1/;
The while loop accepts input from the socket continuously.
All control characters, as well as leading and trailing spaces,
are removed from the input. This server accepts the following commands:
new remote-address
cookie cookie-identifier remote-address
key = value
list
delete
We will discuss each of these in a moment.
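As a rough preview of how a line of input maps to one of these commands, here is a small stand-alone sketch. The classify_command subroutine is a hypothetical helper, not part of the server, and its patterns are slightly stricter than the server's (they require whitespace between a command and its arguments).

```perl
use strict;
use warnings;

# Strip control characters and surrounding whitespace, then name the command.
sub classify_command {
    my ($line) = @_;
    $line =~ s/[\000-\037]//g;    # remove control characters, including "\n"
    $line =~ s/^\s+|\s+$//g;      # trim leading and trailing whitespace
    return 'new'    if $line =~ /^new\s+\S+$/;
    return 'cookie' if $line =~ /^cookie\s+\S+\s+\S+$/;
    return 'store'  if $line =~ /^\w+\s*=\s*.*$/;
    return 'list'   if $line eq 'list';
    return 'delete' if $line eq 'delete';
    return 'invalid';
}

foreach my $input ("new www.test.net\r\n", "name = Joe Test\n", "list\n", "bogus\n") {
    print classify_command($input), "\n";
}
```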
if ( ($remote_address) = /^new\s*(\S+)$/) {
The new
command creates a new and unique cookie and outputs it to the socket.
The remote address of the host that is connected to the HTTP
server should be passed as an argument to this command. This makes
it difficult for intruders to break the server, as you will see
in a minute. Here is an example of how this command is used, and
its typical output (the first line is the client's command, the second is the server's reply):
new www.test.net
200: 13fGK7KIlZSF2
The status along with a unique cookie identifier is output.
The client should parse this line, get the cookie, and insert it
in the form, either as a query or a hidden variable.
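Parsing such a reply on the client side might look like the following sketch. The parse_status_line subroutine is a hypothetical name introduced for illustration; it assumes only the "status: message" layout shown above.

```perl
use strict;
use warnings;

# Split a server reply such as "200: 13fGK7KIlZSF2" into status and message.
# Returns an empty list if the line does not look like a status reply.
sub parse_status_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;                      # drop the line ending, if any
    return $line =~ /^(\d{3}): (.*)$/ ? ($1, $2) : ();
}

my ($status, $message) = parse_status_line("200: 13fGK7KIlZSF2\n");
print "status=$status cookie=$message\n" if defined $status && $status == 200;
```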
if ($cookie) {
&print_status ($error,
"You already have a cookie!");
If the cookie variable is defined, an
error message is displayed. This would only occur if you try to
call the new command multiple times in the
same session.
} else {
if ($no_cookies >= $max_cookies) {
&print_status ($error,
"Cookie limit reached.");
} else {
do {
$cookie = &generate_new_cookie
($remote_address);
} until (!$DATA{$cookie});
If a cookie is not defined for this session, and the number
of cookies is not over the pre-defined limit, the generate_new_cookie
subroutine is called to create a unique cookie.
$no_cookies++;
$DATA{$cookie} = join("::", $remote_address,
$cookie, time);
&print_status ($success, $cookie);
}
}
Once a cookie is successfully created, the counter is incremented,
and a new key is inserted into the DATA associative
array. The value for this key is a string containing the remote
address (so we can check against it later), the cookie, and the
time (for expiration purposes).
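The record format can be tried in isolation. The sketch below uses made-up sample values (the timestamp in particular is arbitrary) to show how the three fields are joined and later split apart again.

```perl
use strict;
use warnings;

my %DATA;
my ($address, $cookie, $created) = ('www.test.net', '13fGK7KIlZSF2', 800000000);

# Store the record under the cookie itself, as the server does.
$DATA{$cookie} = join('::', $address, $cookie, $created);

# Later, split the record back into its three fields.
my ($old_address, $stored_cookie, $stored_time) = split(/::/, $DATA{$cookie});
print "address=$old_address cookie=$stored_cookie created=$stored_time\n";
```

Note that this layout relies on "::" never appearing inside a field; an address that itself contains "::", such as an abbreviated IPv6 address, would not survive the split.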
} elsif ( ($check_cookie, $remote_address) =
/^cookie\s*(\S+)\s*(\S+)/) {
The cookie command sets the cookie for
the session. Once you set a cookie, you can store information, list
the stored information, and delete the cookie. The cookie command
is generally used once you have obtained a valid cookie with the new
command. Here is a typical cookie command:
cookie 13fGK7KIlZSF2 www.test.net
200: Cookie 13fGK7KIlZSF2 set.
The server will return a status indicating either success
or failure. If you try to set a cookie that does not exist, you
will get the following error message:
cookie 6bseVEbhf74 www.test.net
500: Cookie does not exist.
And if the IP address is not the same as the one that was
used when creating the cookie, this is what is displayed:
cookie 13fGK7KIlZSF2 www.joe.net
500: Incorrect IP address.
The program continues:
if ($cookie) {
&print_status ($error, "You already specified a cookie.");
If the cookie command is specified multiple
times in a session, an error message is output.
} else {
if ($DATA{$check_cookie}) {
($old_address) = split(/::/, $DATA{$check_cookie});
if ($old_address ne $remote_address) {
&print_status ($error, "Incorrect IP address.");
} else {
$cookie = $check_cookie;
&print_status ($success, "Cookie $cookie set.");
}
} else {
&print_status ($error, "Cookie does not exist.");
}
}
If the cookie exists, the specified address is compared to
the original IP address. If everything is valid, the cookie
variable will contain the cookie.
} elsif ( ($variable, $value) = /^(\w+)\s*=\s*(.*)$/) {
The regular expression checks for a statement that contains
a key and a value that is used to store the information.
Here is a sample session where two variables are stored:
cookie 13fGK7KIlZSF2 www.test.net
200: Cookie 13fGK7KIlZSF2 set.
name = Joe Test
200: name=Joe Test
organization = Test Net
200: organization=Test Net
The server is strict, and allows only variable names composed
of alphanumeric characters and the underscore (A-Z, a-z, 0-9, _).
if ($cookie) {
$key = join ($separator, $cookie, $variable);
$DATA{$key} = $value;
&print_status ($success, "$variable=$value");
} else {
&print_status ($error, "You must specify a cookie.");
}
The variable name is concatenated with the cookie and the
separator to create the key for the associative array.
} elsif (/^list$/) {
if ($cookie) {
foreach $key (keys %DATA) {
$string = join ("", $cookie, $separator);
if ( ($variable) = $key =~ /^\Q$string\E(.*)$/) {
&print_status ($success, "$variable=$DATA{$key}");
}
}
print ".", "\n";
} else {
&print_status ($error, "You don't have a cookie yet.");
}
The
list
command displays all of the stored information by iterating through
the DATA associative array. Only keys that contain
the separator are output. In other words, the initial key containing
the cookie, the remote address, and the time is not displayed. Here
is the output from a list command:
cookie 13fGK7KIlZSF2 www.test.net
200: Cookie 13fGK7KIlZSF2 set.
list
200: name=Joe Test
200: organization=Test Net
.
The data ends with the "." character, so that the client can
stop reading at that point and an infinite loop is not created.
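The selection of keys can also be sketched without a regular expression, by testing for a literal prefix made of the cookie and the separator. The variables_for subroutine and the sample data are hypothetical; only the key layout is taken from the server.

```perl
use strict;
use warnings;

my $separator = "\034";
my %DATA = (
    '13fGK7KIlZSF2'                         => 'www.test.net::13fGK7KIlZSF2::800000000',
    "13fGK7KIlZSF2${separator}name"         => 'Joe Test',
    "13fGK7KIlZSF2${separator}organization" => 'Test Net',
    "zzOtherCookie${separator}name"         => 'Someone Else',
);

# Collect the variables stored under one cookie by comparing key prefixes.
sub variables_for {
    my ($data, $cookie) = @_;
    my $prefix = $cookie . $separator;
    my %found;
    foreach my $key (keys %$data) {
        next unless index($key, $prefix) == 0;   # literal prefix test, no regex
        $found{ substr($key, length $prefix) } = $data->{$key};
    }
    return %found;
}

my %vars = variables_for(\%DATA, '13fGK7KIlZSF2');
print "$_=$vars{$_}\n" foreach sort keys %vars;
print ".\n";
```

A literal comparison sidesteps any characters in the cookie, such as ".", that have a special meaning inside a pattern.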
} elsif (/^delete$/) {
if ($cookie) {
&remove_cookie ($cookie);
&print_status ($success, "Cookie $cookie deleted.");
} else {
&print_status ($error, "Select a cookie to delete.");
}
The delete
command removes the cookie from its internal database. The remove_cookie
subroutine is called to remove all information associated with the
cookie. Here is an example that shows the effect of the delete
command:
cookie 13fGK7KIlZSF2 www.test.net
200: Cookie 13fGK7KIlZSF2 set.
list
200: name=Joe Test
200: organization=Test Net
.
delete
200: Cookie 13fGK7KIlZSF2 deleted.
list
.
The program continues:
} elsif (/^(exit|quit)$/) {
$cookie = undef;
&print_status ($success, "Bye.");
last;
The exit and quit
commands are used to exit from the server. The cookie
variable is cleared. This is very important! If it is not cleared,
the server will incorrectly assume that a cookie is already set
when a new connection is established. This can be dangerous, as
the new session can see the variables stored by the previous connection
by executing the list command.
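When matching whole command words such as these, keep in mind that alternation binds more loosely than the anchors in a Perl regular expression, so the alternatives need to be grouped. The following sketch contrasts the two forms; loose_match and is_exit_command are hypothetical names used only here.

```perl
use strict;
use warnings;

# Ungrouped: read as "starts with exit" OR "ends with quit".
sub loose_match {
    my ($line) = @_;
    return $line =~ /^exit|quit$/ ? 1 : 0;
}

# Grouped: the whole line must be exactly "exit" or "quit".
sub is_exit_command {
    my ($line) = @_;
    return $line =~ /^(exit|quit)$/ ? 1 : 0;
}

foreach my $input ('exit', 'quit', 'exit now', 'do not quit') {
    printf "%-12s loose=%d grouped=%d\n",
        $input, loose_match($input), is_exit_command($input);
}
```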
} elsif (!/^\s*$/) {
&print_status ($error, "Invalid command.");
}
}
}
An error message is output if the specified command is not
among the ones listed.
&close_connection (COOKIE);
&expire_old_cookies();
}
exit(0);
The
connection between the server and the client is closed. The expire_old_cookies
subroutine removes any cookies (and the information associated with
them) that have expired. In practice, cookies are not removed at the
exact moment they expire; they are checked (and removed) each time a
connection terminates.
The print_status subroutine simply displays
a status and the message.
sub print_status
{
local ($status, $message) = @_;
print $status, ": ", $message, "\n";
}
The generate_new_cookie subroutine generates a random
cookie by using the crypt function to
hash a string that is based on the current time and the remote
address; the caller repeats the call until the result is unique.
The algorithm used in creating a cookie is arbitrary; you
can use just about any algorithm to generate random cookies.
sub generate_new_cookie
{
local ($remote) = @_;
local ($random, $temp_address, $cookie_string, $new_cookie);
$random = rand (time);
($temp_address = $remote) =~ s/\.//g;
$cookie_string = join ("", $temp_address, time) / $random;
$new_cookie = crypt ($cookie_string, $random);
return ($new_cookie);
}
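Since the choice of algorithm is arbitrary, here is one possible alternative, offered only as an illustration and not as the method used by this server. It assumes the Digest::MD5 module is available (it ships with modern Perl 5 distributions); make_cookie is a hypothetical name.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Build an identifier by hashing the remote address, the time, the process
# ID, and a random number. This is an illustrative alternative, not the
# algorithm used by the server in this chapter.
sub make_cookie {
    my ($remote) = @_;
    return md5_hex(join(':', $remote, time, $$, rand()));
}

my %seen;
my $cookie;
do {
    $cookie = make_cookie('www.test.net');
} while ($seen{$cookie});       # retry on the (unlikely) chance of a repeat
$seen{$cookie} = 1;
print "$cookie\n";
```

Because rand is not a cryptographic generator, identifiers made this way are unlikely to collide by accident but should not be assumed to be unguessable.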
The expire_old_cookies subroutine removes cookies after a
pre-defined period of time. The foreach loop iterates through the
associative array, searching for keys that do not contain the separator
(i.e., the original key). For each original key, the sum of the
creation time and the expiration time (in seconds) is compared with
the current time. If the cookie has expired, the remove_cookie
subroutine is called to delete the cookie.
sub expire_old_cookies
{
local ($current_time, $key, $cookie_time);
$current_time = time;
foreach $key (keys %DATA) {
if ($key !~ /$separator/) {
$cookie_time = (split(/::/, $DATA{$key}))[2];
if ( $current_time >= ($cookie_time + $expire_time) ) {
&remove_cookie ($key);
}
}
}
}
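To see the expiration arithmetic at work without waiting 15 minutes, the check can be written so that the current time is passed in. This is a sketch only: expired_cookies is a hypothetical subroutine, and the timestamps are small made-up numbers.

```perl
use strict;
use warnings;

my $separator   = "\034";
my $expire_time = 15 * 60;

# Return the cookies whose records are older than $expire_time seconds.
# $now is passed in so the check can be exercised with fixed times.
sub expired_cookies {
    my ($data, $now) = @_;
    my @expired;
    foreach my $key (sort keys %$data) {
        next if index($key, $separator) >= 0;      # skip stored variables
        my $created = (split /::/, $data->{$key})[2];
        push @expired, $key if $now >= $created + $expire_time;
    }
    return @expired;
}

my %DATA = (
    'oldcookie'                 => 'www.test.net::oldcookie::1000',
    "oldcookie${separator}name" => 'Joe Test',
    'newcookie'                 => 'www.test.net::newcookie::1800',
);
print join(', ', expired_cookies(\%DATA, 2000)), "\n";
```

With a current time of 2000, only the cookie created at time 1000 has passed its 900-second lifetime.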
The remove_cookie subroutine deletes the cookie:
sub remove_cookie
{
local ($cookie_key) = @_;
local ($key, $exact_cookie);
$exact_cookie = (split(/::/, $DATA{$cookie_key}))[1];
foreach $key (keys %DATA) {
if ($key =~ /\Q$exact_cookie\E/) {
delete $DATA{$key};
}
}
$no_cookies--;
}
The loop iterates through the array, searches for all keys
that contain the cookie identifier, and deletes them. The counter
is decremented when a cookie is removed.
Now, let's look at the CGI program that communicates with
this server to keep state.
Let's
review what a cookie client is, and what it needs from a server.
A client is a CGI program that has to run many times for each user
(usually because it displays multiple forms and is invoked each
time by each form). The program needs to open a connection to the
cookie server, create a cookie, and store information in it. The
information stored for one form is retrieved later when the user
submits another form.
#!/usr/local/bin/perl
require "sockets.pl";
$webmaster = "Shishir Gundavaram (shishir\@bu\.edu)";
$remote_address = $ENV{'REMOTE_ADDR'};
The remote address of the host that is connected to this HTTP
server is stored. This information will be used to create unique
cookies.
$cookie_server = "cgi.bu.edu";
$cookie_port = 5000;
$document_root = "/usr/local/bin/httpd_1.4.2/public";
$error = "Cookie Client Error";
&parse_form_data (*FORM);
$start_form = $FORM{'start'};
$next_form = $FORM{'next'};
$cookie = $FORM{'Magic_Cookie'};
Initially, the browser needs to pass a query to this program,
indicating the first form:
https://some.machine/cgi-bin/cookie_client.pl?start=/interests.html
All forms after that must contain a next query in the <FORM>
tag:
<FORM ACTION="/cgi-bin/cookie_client.pl?next=/location.html" METHOD="POST">
The filename passed in the next query can be different for
each form. That is how the forms let the user navigate.
Finally, there must be a hidden field in each form that contains
the cookie:
<INPUT TYPE="hidden" NAME="Magic_Cookie" VALUE="-*Cookie*-">
This script will replace the string "-*Cookie*-" with a unique
cookie, retrieved from the cookie server. This identifier allows
one form to retrieve what another form has stored.
One way to think of this cookie technique is this: The cookie
server stores all the data this program wants to save. To retrieve
the data, each run of the program just needs to know the cookie.
One instance of the program passes this cookie to the next instance
by placing it in the form. The form then sends the cookie to the
new instance of the program.
if ($start_form) {
$cookie = &get_new_cookie ();
&parse_form ($start_form, $cookie);
If the specified form is the first one in the series, the
get_new_cookie subroutine is called to retrieve
a new cookie identifier. And the parse_form
subroutine is responsible for placing the actual cookie in the hidden
field.
} elsif ($next_form) {
&save_current_form ($cookie);
&parse_form ($next_form, $cookie);
Either $start_form or $next_form
will be set, but the browser should not set both. There is only
one start to a session! If the form contains the next query, the
information within it is stored on the cookie server, which is accomplished
by the save_current_form subroutine.
} else {
if ($cookie) {
&last_form ($cookie);
} else {
&return_error (500, $error,
"You have executed this script in an invalid manner.");
}
}
exit (0);
Finally, if the form does not contain any query information,
but does contain a cookie identifier, the last_form
subroutine is called to display all of the stored information.
That is the end of the main program. It simply lays out a
structure. If each form contains the correct start or next query,
the program will display everything when the user wants it.
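The decision the main program makes can be summarized in a few lines. In this sketch, choose_action is a hypothetical subroutine that returns a label instead of calling the real subroutines, so that the logic can be run on its own.

```perl
use strict;
use warnings;

# Decide what the client should do for one request, given the parsed form.
sub choose_action {
    my (%form) = @_;
    return 'start' if $form{'start'};
    return 'next'  if $form{'next'};
    return 'last'  if $form{'Magic_Cookie'};
    return 'error';
}

print choose_action('start' => '/interests.html'), "\n";
print choose_action('next' => '/location.html', 'Magic_Cookie' => '13fGK7KIlZSF2'), "\n";
print choose_action('Magic_Cookie' => '13fGK7KIlZSF2'), "\n";
print choose_action(), "\n";
```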
The open_and_check subroutine simply connects to the cookie
server and reads the first line (removing the trailing newline character)
that is output by the server. It then checks this line to make sure
that the server is functioning properly.
sub open_and_check
{
local ($first_line);
&open_connection (COOKIE, $cookie_server, $cookie_port)
|| &return_error (500, $error, "Could not connect to cookie server.");
chop ($first_line = <COOKIE>);
if ($first_line !~ /^200/) {
&return_error (500, $error, "Cookie server returned an error.");
}
}
The get_new_cookie subroutine issues the new
command to the server and then checks the status to make sure that
a unique cookie identifier was output by the server.
sub get_new_cookie
{
local ($cookie_line, $new_cookie);
&open_and_check ();
print COOKIE "new ", $remote_address, "\n";
chop ($cookie_line = <COOKIE>);
&close_connection (COOKIE);
if ( ($new_cookie) = $cookie_line =~ /^200: (\S+)$/) {
return ($new_cookie);
} else {
&return_error (500, $error, "New cookie was not created.");
}
}
The parse_form subroutine constructs
and displays a dynamic form. It reads the entire contents of the
form from a file, such as location.html. The
only change this subroutine makes is to replace the string "-*Cookie*-"
with the unique cookie returned by the cookie server. The form passes
the cookie as input data to the program, and the program passes
the cookie to the server to set and list data.
sub parse_form
{
local ($form, $magic_cookie) = @_;
local ($path_to_form);
if ($form =~ /\.\./){
&return_error (500, $error, "What are you trying to do?");
}
$path_to_form = join ("/", $document_root, $form);
open (FILE, "<" . $path_to_form)
|| &return_error (500, $error, "Could not open form.");
print "Content-type: text/html", "\n\n";
while (<FILE>) {
if (/-\*Cookie\*-/) {
s//$magic_cookie/g;
}
print;
}
close (FILE);
}
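The substitution at the heart of this subroutine can be tried on a string instead of a file. The insert_cookie subroutine below is a hypothetical helper written for this illustration; the hidden field is the one shown earlier.

```perl
use strict;
use warnings;

# Replace every occurrence of the placeholder in a block of HTML.
sub insert_cookie {
    my ($html, $cookie) = @_;
    $html =~ s/-\*Cookie\*-/$cookie/g;
    return $html;
}

my $template = qq(<INPUT TYPE="hidden" NAME="Magic_Cookie" VALUE="-*Cookie*-">\n);
print insert_cookie($template, '13fGK7KIlZSF2');
```

The asterisks are escaped in the pattern because an unescaped "*" is a quantifier in a regular expression.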
The save_current_form subroutine stores the form information
on the cookie server.
sub save_current_form
{
local ($magic_cookie) = @_;
local ($ignore_fields, $cookie_line, $key);
$ignore_fields = '(start|next|Magic_Cookie)';
&open_and_check ();
print COOKIE "cookie $magic_cookie $remote_address", "\n";
chop ($cookie_line = <COOKIE>);
The cookie command is issued to the server
to set the cookie for subsequent add, delete, and list operations.
if ($cookie_line =~ /^200/) {
foreach $key (keys %FORM) {
next if ($key =~ /^$ignore_fields$/o);
print COOKIE $key, "=", $FORM{$key}, "\n";
chop ($cookie_line = <COOKIE>);
if ($cookie_line !~ /^200/) {
&return_error (500, $error, "Form info. could not be stored.");
}
}
} else {
&return_error (500, $error, "The cookie could not be set.");
}
&close_connection (COOKIE);
}
The foreach loop iterates through the associative array containing
the form information. All fields, with the exception of start,
next, and Magic_Cookie,
are stored on the cookie server. These fields are used internally
by this program, and are not meant to be stored. If the server cannot
store the information, it returns an error.
The last_form subroutine is executed when the last form in
the series is being processed. The list command
is sent to the server. The display_all_items
subroutine reads and displays the server output in response to this
command. Finally, the cookie is deleted.
sub last_form
{
local ($magic_cookie) = @_;
local ($cookie_line, $key_value, $key, $value);
&open_and_check ();
print COOKIE "cookie $magic_cookie $remote_address", "\n";
chop ($cookie_line = <COOKIE>);
if ($cookie_line =~ /^200/) {
print COOKIE "list", "\n";
&display_all_items ();
print COOKIE "delete", "\n";
} else {
&return_error (500, $error, "The cookie could not be set.");
}
&close_connection (COOKIE);
}
The display_all_items subroutine prints a summary of the user's
responses.
sub display_all_items
{
local ($key_value, $key, $value);
print "Content-type: text/html", "\n\n";
print "<HTML>", "\n";
print "<HEAD><TITLE>Summary</TITLE></HEAD>", "\n";
print "<BODY>", "\n";
print "<H1>Summary and Results</H1>", "\n";
print "Here are the items/options that you selected:", "<HR>", "\n";
while (<COOKIE>) {
chop;
last if (/^\.$/);
$key_value = (split (/\s/, $_, 2))[1];
($key, $value) = split (/=/, $key_value, 2);
print "<B>", $key, " = ", $value, "</B>", "<BR>", "\n";
}
The while loop reads the output from the server, and parses
and displays the key-value pair.
foreach $key (keys %FORM) {
next if ($key =~ /^Magic_Cookie$/);
print "<B>", $key, " = ", $FORM{$key}, "</B>", "<BR>", "\n";
}
print "</BODY></HTML>", "\n";
}
The key-value pairs from this last form are also displayed,
since they are not stored on the server.
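The way the client reads the reply to the list command can be sketched apart from the socket and the HTML. Here parse_list_reply is a hypothetical subroutine that takes the reply lines as a list, stops at the terminating ".", and returns the pairs as a hash.

```perl
use strict;
use warnings;

# Read "200: key=value" lines until the terminating "." and return a hash.
sub parse_list_reply {
    my (@lines) = @_;
    my %items;
    foreach my $line (@lines) {
        $line =~ s/\r?\n\z//;
        last if $line eq '.';
        my $pair = (split /\s/, $line, 2)[1];        # drop the "200:" status
        my ($key, $value) = split /=/, $pair, 2;     # keep "=" inside values
        $items{$key} = $value;
    }
    return %items;
}

my %items = parse_list_reply("200: name=Joe Test\n", "200: organization=Test Net\n", ".\n");
print "$_ = $items{$_}\n" foreach sort keys %items;
```

Limiting the second split to two fields means a stored value that happens to contain "=" is returned whole.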
Finally, the familiar parse_form_data
subroutine concatenates the key-value pairs from both the query
string (GET) and from standard input (POST),
and stores them in an associative array.
sub parse_form_data
{
local (*FORM_DATA) = @_;
local ($query_string, @key_value_pairs, $key_value, $key, $value);
read (STDIN, $query_string, $ENV{'CONTENT_LENGTH'});
if ($ENV{'QUERY_STRING'}) {
$query_string = join("&", $query_string, $ENV{'QUERY_STRING'});
}
@key_value_pairs = split (/&/, $query_string);
foreach $key_value (@key_value_pairs) {
($key, $value) = split (/=/, $key_value);
$key =~ tr/+/ /;
$value =~ tr/+/ /;
$key =~ s/%([\dA-Fa-f][\dA-Fa-f])/pack ("C", hex ($1))/eg;
$value =~ s/%([\dA-Fa-f][\dA-Fa-f])/pack ("C", hex ($1))/eg;
if (defined($FORM_DATA{$key})) {
$FORM_DATA{$key} = join ("\0", $FORM_DATA{$key}, $value);
} else {
$FORM_DATA{$key} = $value;
}
}
}
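The decoding steps can be exercised on a plain string, without the CGI environment. The sketch below is a simplified variant written for illustration; url_decode and parse_query are hypothetical names, and the query string is made up.

```perl
use strict;
use warnings;

# Decode one URL-encoded string: "+" becomes a space, %XX becomes a byte.
sub url_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([\dA-Fa-f]{2})/chr(hex($1))/eg;
    return $text;
}

# Split a query string into a hash; repeated keys are joined with "\0".
sub parse_query {
    my ($query) = @_;
    my %form;
    foreach my $pair (grep { length } split /&/, $query) {
        my ($key, $value) = map { url_decode($_) } split(/=/, $pair, 2);
        $value = '' unless defined $value;
        $form{$key} = exists $form{$key} ? join("\0", $form{$key}, $value) : $value;
    }
    return %form;
}

my %form = parse_query('next=/location.html&name=Joe+Test&org=Test%20Net');
print "$_ = $form{$_}\n" foreach sort keys %form;
```

Empty pairs are skipped, so a leading "&" left over from joining an empty POST body with the query string does not create an empty key.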