Using hidden fields is probably the simplest
way to maintain information across multiple CGI instances. But it
is far from the most efficient.
In this next example of maintaining state, we embed special
codes into HTML documents that resemble Server
Side Includes (see Chapter 5, Server Side Includes, for more
information on Server Side Includes). These codes are actually parsed
by a CGI program which uses the codes to maintain information across
several documents. This algorithm is best illustrated by example.
Let's
create a survey system that spans multiple forms. Here is the first form of
the survey:
<HTML>
<HEAD><TITLE>Television/Movie Survey</TITLE></HEAD>
<BODY>
<H1>Welcome to the CGI Network!</H1>
<HR>
In order to better serve you, we would like to know what type of
movies and variety shows you like to watch on TV. Over the last couple
of years, you, the viewers, were directly responsible for the lasting
success of many of our shows. Your comments are extremely valuable to
us, so please take a few moments to fill out a survey.
<P>
The current time is: <!--#insert var="DATE_TIME"--><BR>
At first glance, the construct in the last line displayed
above looks like a Server Side Include. However, it is not! This
document first gets parsed by a CGI program that looks for statements
like these and replaces them with appropriate information. Let's
refer to these statements as CGI Side Includes (CSIs), or "pseudo" Server Side Includes. In
this case, the program will insert the current date and time.
You may ask, what is the advantage of such a process? It allows
you to insert dynamic information in otherwise static documents.
The alternative would be to embed the contents of the document
in the program itself, such as:
print <<End_of_Form;
<HTML>
<HEAD><TITLE>Sample Form</TITLE></HEAD>
<BODY>
<H1>This is a test of a sample form</H1>
The current time is: $date_time
<HR>
.
.
.
</BODY></HTML>
End_of_Form
As you can see, this can be quite cumbersome, especially if
the document is large. Now, let's proceed with the rest of the form.
<HR>
<FORM ACTION="/cgi-bin/survey.pl?
cgi_cookie=<!--#insert var="COOKIE"-->&
cgi_form_num=<!--#insert var="NUMBER"-->" METHOD="POST">
As in other examples in this book, a query is passed to the
program as part of the ACTION attribute. Notice
the two CSI statements in the <FORM> tag.
The first one inserts a random number--also referred to as a
magic cookie--for
identification purposes, and the second one inserts the form number.
A cookie is needed to store the information from the various forms
in a unique data file. This cookie is passed to each and every form,
so that the form data is appended to the same data file. A form
number is needed to keep track of the various forms. We will discuss
these statements in detail later in this chapter.
<PRE>
Full Name: <INPUT TYPE="text" NAME="01 Full Name" SIZE=40>
E-Mail: <INPUT TYPE="text" NAME="02 EMail Address" SIZE=40>
The field names are prefixed with numbers, so that they can
be sorted. This makes it possible to store the form data in the
order in which it is displayed in the form. Remember, you do not
need to encode the field names, as the browser will do so before
it submits the information to the server.
</PRE>
<P>
Which survey would you like to fill out: <BR>
<INPUT TYPE="radio" NAME="cgi_survey" VALUE="Television" CHECKED>Television<BR>
<INPUT TYPE="radio" NAME="cgi_survey" VALUE="Movie">Movies<BR>
<P>
<INPUT TYPE="submit" VALUE="Submit the survey">
<INPUT TYPE="reset" VALUE="Clear all fields">
</FORM>
<HR>
</BODY></HTML>
The document is passed to the CGI program as extra path information.
For example, if you want the program to parse the CSI statements
and display the form, the following URL should be used:
https://your.machine/cgi-bin/survey.pl/start_survey.html
where the file "/start_survey.html" contains the first form
of the survey. In the context of this example, if the user opts
to fill out the "Television" survey, the following two forms are
displayed, one after the other:
<HTML>
<HEAD><TITLE>Television/Movie Survey</TITLE></HEAD>
<BODY>
<H1>Television Survey</H1>
<HR>
Welcome! We are glad that you have decided to fill out our
television survey. Please read all questions carefully. When you are finished,
press the Submit button for Part 2 of the survey.
<P>
The current time is: <!--#insert var="DATE_TIME"--><BR>
The date and time are inserted into the form using CGI side
includes.
<HR>
<FORM ACTION="/cgi-bin/survey.pl?cgi_cookie=<!--#insert var="COOKIE"-->&cgi_survey=<!--#insert var="SURVEY"-->&cgi_form_num=<!--#insert var="NUMBER"-->" METHOD="POST">
The variable "SURVEY" inserts the user-selected
survey type, either "Television" or "Movie." The survey type is
retrieved from the information submitted by the user in the first
form. This ensures that the correct series of forms are displayed.
What is your favorite comedy show?
<BR>
<INPUT TYPE="radio" NAME="03 Comedy Show" VALUE="Single Web Dude">Single Web Dude<BR>
<INPUT TYPE="radio" NAME="03 Comedy Show" VALUE="Gateway Friends">Gateway Friends<BR>
<INPUT TYPE="radio" NAME="03 Comedy Show" VALUE="Mad About CGI" CHECKED>Mad About CGI<BR>
<INPUT TYPE="radio" NAME="03 Comedy Show" VALUE="Web Time">Web Time<BR>
<P>
Who is your favorite actor in a comedy show?
<BR>
<INPUT TYPE="radio" NAME="04 TV Comedian" VALUE="John Riser" CHECKED>John Riser<BR>
<INPUT TYPE="radio" NAME="04 TV Comedian" VALUE="Jake LeBlanc">Jake LeBlanc<BR>
<INPUT TYPE="radio" NAME="04 TV Comedian" VALUE="Mike Cosby">Mike Cosby<BR>
<INPUT TYPE="radio" NAME="04 TV Comedian" VALUE="Marc Allen">Marc Allen<BR>
<P>
<INPUT TYPE="submit" VALUE="Submit the survey">
<INPUT TYPE="reset" VALUE="Clear all fields">
</FORM>
<HR>
</BODY></HTML>
The field names are prefixed with numerical values. Notice
the long, descriptive field names and values. This
allows us to simply retrieve the names and values, decode them,
and print them out.
Now, here is the second, and final, form in the "Television"
survey:
<HTML>
<HEAD><TITLE>Television/Movie Survey</TITLE></HEAD>
<BODY>
<H1>Television Survey</H1>
<HR>
Thanks for filling out Part 1 of our TV survey. Here is
Part 2... Again, please read all questions carefully. When you are finished,
press the Submit button to wrap up the survey.
<P>
The current time is: <!--#insert var="DATE_TIME"--><BR>
<HR>
<FORM ACTION="/cgi-bin/survey.pl?cgi_cookie=<!--#insert var="COOKIE"-->&cgi_survey=<!--#insert var="SURVEY"-->&cgi_form_num=<!--#insert var="NUMBER"-->" METHOD="POST">
What is your favorite action/drama show?
<BR>
<INPUT TYPE="radio" NAME="05 TV Drama" VALUE="Masquerade on the Web">Masquerade on the Web<BR>
<INPUT TYPE="radio" NAME="05 TV Drama" VALUE="Gateway Voyager">Gateway Voyager<BR>
<INPUT TYPE="radio" NAME="05 TV Drama" VALUE="EH" CHECKED>EH - Emergency HTTP Server<BR>
<INPUT TYPE="radio" NAME="05 TV Drama" VALUE="W3C Hope">W3C Hope<BR>
<P>
Who is your favorite actor in an action/drama show?
<BR>
<INPUT TYPE="radio" NAME="06 TV Drama Actor" VALUE="Bill Wyle" CHECKED>Bill Wyle<BR>
<INPUT TYPE="radio" NAME="06 TV Drama Actor" VALUE="John Clooney">John Clooney<BR>
<INPUT TYPE="radio" NAME="06 TV Drama Actor" VALUE="Mike Strauss">Mike Strauss<BR>
<INPUT TYPE="radio" NAME="06 TV Drama Actor" VALUE="Eric Wagner">Eric Wagner<BR>
<P>
<INPUT TYPE="submit" VALUE="Submit the survey">
<INPUT TYPE="reset" VALUE="Clear all fields">
</FORM>
<HR>
</BODY></HTML>
The two forms for the "Movie" survey are set up in the same
manner as the ones illustrated above. Let's look at the program:
#!/usr/local/bin/perl
$exclusive_lock = 2;
$unlock = 8;
$request_method = $ENV{'REQUEST_METHOD'};
$webmaster = "shishir\@bu\.edu";
$document_root = "/home/shishir/httpd_1.4.2/public";
$survey_dir = "/tmp/";
The variable survey_dir contains the
directory where the data files are stored. Whenever you are creating
temporary files, you should store them in /tmp
or /var/tmp, as these directories are cleaned
out every few days.
@Television_files = ( "/tv_1.html", "/tv_2.html" );
@Movie_files = ( "/movie_1.html", "/movie_2.html" );
These two arrays store the HTML survey
files that must be parsed for CSI statements. The most important
thing to note here is the way the variables are labeled. The first
part of the variable name--before the "_" character--corresponds to
the value of the cgi_survey field in the initial
form. The program determines the survey type chosen by the user--either
"Television" or "Movie"--and concatenates that string with "_files"
and evaluates the total string at run-time to determine the next
survey file.
if ($request_method eq "GET") {
$form_num = 0;
$type = "start";
$form_file = $ENV{'PATH_INFO'};
Using the GET method indicates that the
user requested the starting form, which will be stored in PATH_INFO.
The form_num variable indicates the current
form number. In this case, zero indicates the starting form.
The type variable is set to "start".
However, this value is never used because there is no corresponding
CSI in the initial form. It is just defined for clarity. Remember,
the manner in which the starting form must be accessed is a GET
request:
https://your.machine/cgi-bin/survey.pl/start_survey.html
After the first form is submitted, the server will execute
this program with a POST request and an additional
query. The process is repeated for all the forms in the survey.
if ($form_file) {
$cookie = join ("_", $ENV{'REMOTE_HOST'}, time);
$cookie = &escape($cookie);
&pseudo_ssi ($form_file, $cookie, $type, $form_num);
} else {
&return_error (500, "CGI Network Survey Error",
"An initial survey form must be specified.");
}
Since the starting form was accessed, a new
cookie
has to be created. This cookie is simply the client's host address
concatenated with the current time. Perl's time function returns the current time
as the number of seconds since 1970. Two users receive the same cookie only
if they connect from the same host within the same second.
The escape subroutine encodes the cookie
string for insertion into the form. Finally, the pseudo_ssi
subroutine reads and parses the file specified by the variable form_file
for CSI statements. The three parameters that are passed to the
subroutine are the new cookie, the dummy form type, and the form
number. If corresponding CSI statements are found, the values stored
in these variables will be inserted appropriately.
} elsif ($request_method eq "POST") {
&parse_form_data(*STATE);
$form_num = $STATE{'cgi_form_num'};
$type = $STATE{'cgi_survey'};
$cookie = $STATE{'cgi_cookie'};
The form information is retrieved and stored in the STATE
associative array. The parse_form_data subroutine
is slightly different than the one used in the previous examples;
it decodes the form field name, as well as the value.
Once the initial form is submitted, the form_num
variable equals zero, type contains either
"Television" or "Movie," and cookie holds a
string that uniquely identifies a user. After the initial form,
all the other forms will have the same cookie and type information.
However, the form_num variable will be incremented.
if ( ($type eq "Television") || ($type eq "Movie") ) {
This conditional is executed if the user chose to fill out
either a television or movie survey. Since one of the values is
checked by default on the form, this variable will have to contain
either "Television" or "Movie." However, if someone accesses this
program by bypassing the starting form, and specifies something
other than these two values, an error message is displayed.
$limit = eval ("scalar (\@${type}_files)");
This run-time evaluation is very important. It uses Perl's
scalar
function to determine the number of elements in the array that corresponds
to the value stored in the variable type. Here
is a simple example of scalar:
@test = (1, 2, 3);
$number = scalar (@test);
The variable number contains 3, indicating
that the array holds three elements.
if ( ($form_num >= 0) && ($form_num <= $limit) ) {
&write_data_to_file();
If the form number is within the limits, the write_data_to_file
subroutine is called to write the form information to a data file.
Remember, the same data file is used throughout the whole process.
On the other hand, if a user bypasses the forms, and tries to pass
a form number that is not within the limits, an error message is
displayed.
if ($form_num == $limit) {
&survey_over();
If the form is the last one in the survey, the survey_over
subroutine is called to display the information stored in the data
file. It also deletes the data file.
} else {
$form_file = eval("\$${type}_files[$form_num]");
$form_num++;
$cookie = &escape($cookie);
&pseudo_ssi ($form_file, $cookie, $type,
$form_num);
}
Again, a run-time evaluation is performed to retrieve the
name of the next file in the survey. If these two run-time evals
were not used, two separate blocks of code would have to be written:
one to handle the television survey, and the other to handle the
movie survey. It is much more efficient to do it this way.
The form number is incremented, and the cookie value is encoded.
The subroutine pseudo_ssi is called to parse
the form file.
} else {
&return_error (500, "CGI Network Survey Error",
"You have somehow selected an invalid form!");
}
} else {
&return_error (500, "CGI Network Survey Error",
"You have selected an invalid survey type!");
}
} else {
&return_error (500, "Server Error",
"Server uses unsupported method");
}
exit(0);
If the user somehow passed invalid information to the program,
error messages are returned.
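The two run-time evaluations in the main program can be tried in isolation. The sketch below repeats the two arrays from the listing and wraps the evals in two helper subroutines, survey_limit and survey_file, which are names invented here for illustration. It also shows one possible alternative, a hash of array references, which is not what the survey program uses. As in the program, the survey type should be validated before it reaches eval, since the string is executed as Perl code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same arrays as in the survey program; "our" keeps them visible to eval.
our @Television_files = ("/tv_1.html", "/tv_2.html");
our @Movie_files      = ("/movie_1.html", "/movie_2.html");

# Hypothetical helper: number of forms in the chosen survey.
sub survey_limit {
    my ($type) = @_;
    # Builds and runs the string "scalar (@Television_files)" or similar.
    return eval ("scalar (\@${type}_files)");
}

# Hypothetical helper: file name of form number $form_num (zero-based).
sub survey_file {
    my ($type, $form_num) = @_;
    # Builds and runs the string "$Movie_files[1]" or similar.
    return eval ("\$${type}_files[$form_num]");
}

# Alternative (an assumption, not the book's method): look the array up
# in a hash, which needs no eval and rejects unknown survey types.
my %files_for = (
    Television => \@Television_files,
    Movie      => \@Movie_files,
);

sub survey_file_by_hash {
    my ($type, $form_num) = @_;
    return undef unless exists $files_for{$type};
    return $files_for{$type}[$form_num];
}

print survey_limit("Television"), "\n";
print survey_file("Movie", 1), "\n";
print survey_file_by_hash("Television", 0), "\n";
```

The eval version keeps the naming convention described earlier, where the survey type selects the array by name. The hash version makes the mapping explicit at the cost of one more data structure.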
Now for the subroutines. The pseudo_ssi
subroutine parses the CSI statements.
sub pseudo_ssi
{
local ($file, $id, $kind, $number) = @_;
local ($command, $argument, $parameter, $line);
$file = $document_root . $file;
open (FILE, "<" . $file) ||
&return_error (500, "CGI Network Survey Error",
"Cannot open: form [$number], file [$file].");
flock (FILE, $exclusive_lock);
The subroutine tries to open the specified file. An error
message is returned if the operation fails.
print "Content-type: text/html", "\n\n";
while (<FILE>) {
while ( ($command, $argument, $parameter) =
(/<!--\s*#\s*(\w+)\s+(\w+)\s*=\s*"?(\w+)"?\s*-->/io) ) {
The initial loop iterates through each line in the file, and
stores it in the default variable $_. The second
loop uses a regular expression to check for a CSI statement within
the file. Here is the format for the CSI statement:
<!--#command argument="parameter"-->
Whitespace is ignored, and the quotation marks around the
parameter are optional. This is in great contrast to SSI statements,
where a strict format is enforced.
if ($command eq "insert") {
if ($argument eq "var") {
if ($parameter eq "COOKIE") {
s//$id/;
} elsif ($parameter eq "DATE_TIME") {
local ($time) = &get_date_time();
s//$time/;
} elsif ($parameter eq "NUMBER") {
s//$number/;
} elsif ($parameter eq "SURVEY") {
s//$kind/;
} else {
s///;
}
} else {
s///;
}
} else {
s///;
}
}
print;
}
This block might look very confusing, but it is quite simple.
This program only supports the insert
command and the var
argument. However, four parameters are allowed: COOKIE,
DATE_TIME, NUMBER, and SURVEY.
Notice the strange substitute command. The initial string
to substitute is not specified. Usually, the format of the substitute
command looks like this:
s/pattern/replacement/;
Perl will work on the default variable $_.
However, if no initial string is specified, Perl automatically uses
the last matched regular expression. This just so happens to be
the CSI statement that matched earlier. This is a good trick in
Perl, because it is very efficient.
The subroutine simply checks to see the parameter of the CSI,
and replaces the information appropriately. The get_date_time
subroutine is the same as the one used previously. If the command,
argument, or parameter specified in the file does not match the
ones listed, the substitute command is used to remove the CSI statement.
Note the following format:
s///;
Perl replaces the last matched regular expression with a null
string. It is very important to remove these unrecognized CSI statements,
or else the enclosing while loop will run forever,
because it would keep matching the same statement.
Finally, the modified line is output. A print command without any parameters
outputs the default variable $_.
flock (FILE, $unlock);
close (FILE);
}
Before we quit the subroutine, the file is unlocked and closed.
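The empty-pattern substitution is easier to see on a single string. The following is a minimal sketch, not the program's own subroutine: expand_csi is a hypothetical helper that takes one line of text and a hash of parameter values, and applies the same regular expression and the same s//.../ trick as pseudo_ssi.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: expand the CSI statements found in one line.
# %value_for maps parameter names (COOKIE, NUMBER, ...) to their values.
sub expand_csi {
    my ($line, %value_for) = @_;
    my ($command, $argument, $parameter);

    while ( ($command, $argument, $parameter) =
            ($line =~ /<!--\s*#\s*(\w+)\s+(\w+)\s*=\s*"?(\w+)"?\s*-->/i) ) {
        if ($command eq "insert" && $argument eq "var"
                && exists $value_for{$parameter}) {
            my $replacement = $value_for{$parameter};
            # Empty pattern: reuse the regular expression that just matched.
            $line =~ s//$replacement/;
        } else {
            # Unrecognized statement: remove it so the loop can terminate.
            $line =~ s///;
        }
    }
    return $line;
}

print expand_csi('Form <!--#insert var="NUMBER"--> of <!-- # bogus x=y -->2',
                 NUMBER => 1), "\n";
```

Each pass through the loop either replaces or removes the leftmost CSI statement, so the loop ends once no statement remains in the line.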
The write_data_to_file subroutine opens
the data file and incorporates the survey results into it.
sub write_data_to_file
{
local ($key, $temp_key);
open (FILE, ">>" . $survey_dir . $cookie) ||
&return_error (500, "CGI Network Survey Error",
"Cannot write to a data file to store your info.");
if ($form_num == 0) {
print FILE $STATE{'cgi_survey'}, " Survey Filled Out", "\n";
}
The data file is opened in
append mode. There is no need to lock the file,
because every user has a unique filename. If the form number indicates
that it is the initial form, a header is output.
foreach $key (sort (keys %STATE)) {
Let's look at this construct from the innermost parentheses.
The keys command returns an array consisting
of all the keys of the associative array. The sort
function then sorts that array. And foreach
iterates through this array, storing each element in key.
Information in an associative array is not stored in any order,
because it is based on a string index. As a result, the
keys command returns
the information in a random order. Prefixing numerical values to
the form field names allows us to sort the
information returned by the keys command.
if ($key !~ /^cgi_/) {
If the key name begins with "cgi_", it is omitted. Internally
used variables are prefixed with "cgi_" to keep them separate from
real form data.
($temp_key = $key) =~ s/^\d+\s//;
This regular expression is used to remove the numerical value
from the key. The modified key is stored in temp_key.
The field names in the form were in a format such as "01 Full Name".
We use the regular expression to search for a string that
starts with a numeric value followed by a space.
print FILE $temp_key, ": ", $STATE{$key}, "\n";
}
}
close (FILE);
}
The new key, along with the form value, is written to the data file. If the
form contained a scrolling list that allowed the user to make multiple
selections, then all of the values are stored in one string, separated
by the null character, "\0". This subroutine does not perform any
formatting on such a string. However, the next ordering system example
shows how to split and display these values separately.
Note that the associative array is still indexed by the "old"
key. The new key was defined just for output purposes. Finally,
the file is closed.
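The sorting and prefix-stripping steps can be sketched without any file I/O. In the example below, format_state is a hypothetical helper and the sample field names and values are made up for illustration. It also splits multiple selections on the null character, which write_data_to_file itself does not do.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn form data into "Label: value" lines.
sub format_state {
    my (%state) = @_;
    my @lines;

    foreach my $key (sort keys %state) {
        next if $key =~ /^cgi_/;               # skip internal variables
        (my $label = $key) =~ s/^\d+\s//;      # drop the numeric prefix
        # Multiple selections arrive joined by "\0"; show them as a list.
        my $value = join(", ", split(/\0/, $state{$key}));
        push @lines, "$label: $value";
    }
    return @lines;
}

my @lines = format_state(
    "02 EMail Address" => 'someone@example.com',
    "01 Full Name"     => "A. Viewer",
    "cgi_survey"       => "Television",
    "03 Genres"        => "Comedy\0Drama",
);
print "$_\n" foreach @lines;
```

Because sort compares strings by default, the prefixes need the same number of digits (01, 02, ... rather than 1, 2, ...) for the order to hold beyond nine fields.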
The survey_over subroutine thanks the
user and prints his or her responses.
sub survey_over
{
local ($file) = $survey_dir . $cookie;
open (FILE, "<" . $file) ||
&return_error (500, "CGI Network Survey Error",
"Cannot read the survey data file [$file].");
print <<Thanks;
Content-type: text/html
<HTML>
<HEAD><TITLE>Thank You!</TITLE></HEAD>
<BODY>
<H1>Thank You!</H1>
Thank you again for filling out our survey. Here is the information
that you selected:
<HR>
<P>
Thanks
while (<FILE>) {
print $_, "<BR>";
}
print "<HR>";
print "</BODY></HTML>", "\n";
close (FILE);
unlink ($file);
}
The file is opened in read mode, and the information contained
in it is displayed to standard output. Finally, the
unlink command deletes
the file.
The escape subroutine encodes the data.
The code is very similar to the program presented at the beginning
of this book.
sub escape
{
local ($string) = @_;
$string =~ s/(\W)/sprintf("%%%02x", ord($1))/eg;
return($string);
}
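A quick way to check the encoding is to decode the result again. The sketch below pairs a variant of escape, which pads each code to two hex digits, with a hypothetical unescape helper that uses the same decoding expression as parse_form_data. The host name and timestamp are made-up sample values.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Encode every non-word character as %XX (two hex digits).
sub escape {
    my ($string) = @_;
    $string =~ s/(\W)/sprintf("%%%02x", ord($1))/eg;
    return $string;
}

# Hypothetical counterpart: the decoding steps used by parse_form_data.
sub unescape {
    my ($string) = @_;
    $string =~ tr/+/ /;
    $string =~ s/%([\dA-Fa-f][\dA-Fa-f])/pack ("C", hex ($1))/eg;
    return $string;
}

# Sample cookie: host name and time joined by "_" (values are made up).
my $cookie  = join("_", "host.example.com", 820454400);
my $encoded = escape($cookie);

print $encoded, "\n";            # host%2eexample%2ecom_820454400
print unescape($encoded), "\n";  # host.example.com_820454400
```

Padding matters for characters with codes below 16, such as a tab: without it the code would be a single hex digit, which the two-digit decoding pattern would not recognize.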
Finally, the parse_form_data subroutine
parses the form field name as well as the form data. That is the
only difference between this version of the subroutine and the one
presented in the earlier examples.
sub parse_form_data
{
local (*FORM_DATA) = @_;
local ($query_string, @key_value_pairs, $key_value, $key, $value);
read (STDIN, $query_string, $ENV{'CONTENT_LENGTH'});
if ($ENV{'QUERY_STRING'}) {
$query_string = join("&", $query_string, $ENV{'QUERY_STRING'});
}
@key_value_pairs = split (/&/, $query_string);
foreach $key_value (@key_value_pairs) {
($key, $value) = split (/=/, $key_value);
$key =~ tr/+/ /;
$value =~ tr/+/ /;
$key =~ s/%([\dA-Fa-f][\dA-Fa-f])/pack ("C", hex ($1))/eg;
$value =~ s/%([\dA-Fa-f][\dA-Fa-f])/pack ("C", hex ($1))/eg;
if (defined($FORM_DATA{$key})) {
$FORM_DATA{$key} = join ("\0", $FORM_DATA{$key}, $value);
} else {
$FORM_DATA{$key} = $value;
}
}
}
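To see what the decoding produces, the same steps can be applied to a string instead of standard input. This is a minimal sketch: parse_pairs is a hypothetical helper that returns a hash rather than filling one passed by reference, and the sample query string is made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: decode "key=value&key=value" into a hash.
sub parse_pairs {
    my ($query_string) = @_;
    my %form_data;

    foreach my $pair (split (/&/, $query_string)) {
        next unless length $pair;
        my ($key, $value) = split (/=/, $pair, 2);
        $value = "" unless defined $value;

        # Decode the field name as well as the value.
        foreach ($key, $value) {
            tr/+/ /;
            s/%([\dA-Fa-f][\dA-Fa-f])/pack ("C", hex ($1))/eg;
        }

        if (defined $form_data{$key}) {
            # Repeated fields are joined with the null character.
            $form_data{$key} = join ("\0", $form_data{$key}, $value);
        } else {
            $form_data{$key} = $value;
        }
    }
    return %form_data;
}

my %form = parse_pairs("03+Comedy+Show=Mad+About+CGI&cgi_form_num=1");
print "$_ => $form{$_}\n" foreach sort keys %form;
```

Decoding the field name is what lets the program use keys such as "03 Comedy Show" directly as labels in the data file.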
There are other ways to accomplish an ordering or "shopping
cart" system like the one illustrated above. However, this is one
of the best ways. The only drawback to this approach involves the
temporary files that are created.
If a user decides to exit midway through the survey, the temporary
file will not be deleted, because there is no way to determine when
the user leaves. The only solution to this problem is to manually
delete files based on modification times. See Chapter 9, Gateways, Databases, and Search/Index Utilities,
for an ordering system that works by communicating with another
network server, specially designed to store and distribute information.
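Deleting abandoned data files by modification time can also be scripted. The following is a sketch under stated assumptions, not part of the survey program: remove_stale_files is a hypothetical helper, and it assumes that data files can be recognized by a name ending in an underscore followed by digits, which is how the cookies above are built. Such a script could be run periodically by hand or from a scheduler.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: delete survey data files older than a given age.
# Returns the names of the files that were removed.
sub remove_stale_files {
    my ($dir, $max_age_days) = @_;
    my @removed;

    opendir (my $dh, $dir) || die "Cannot open directory $dir: $!";
    foreach my $name (sort readdir ($dh)) {
        my $path = "$dir/$name";
        next unless -f $path;            # plain files only
        next unless $name =~ /_\d+$/;    # assumption: names end in _<time>

        # -M gives the file's age in days, measured from script start.
        if ((-M $path) > $max_age_days) {
            push (@removed, $name) if unlink ($path);
        }
    }
    closedir ($dh);
    return @removed;
}

# Example call (commented out so that nothing is deleted by accident):
# my @gone = remove_stale_files ("/tmp", 2);
```

The age limit should be comfortably longer than the time a user might reasonably take to finish the survey, or an active data file could be removed.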
The hidden field technique
we described earlier allows us to modify the ordering system presented
above in two ways. The first is to replace the query information
in the ACTION attribute of the <FORM>
tag with hidden fields. Let's look at the starting form again:
<HTML>
<HEAD><TITLE>Television/Movie Survey</TITLE></HEAD>
<BODY>
<H1>Welcome to the CGI Network!</H1>
<HR>
In order to better serve you, we would like to know what type of
movies and variety shows you like to watch on TV. Over the last couple
of years, you, the viewers, were directly responsible for the lasting
success of many of our shows. Your comments are extremely valuable to
us, so please take a few moments to fill out a survey.
<P>
The current time is: <!--#insert var="DATE_TIME"--><BR>
If we want the current time to be displayed in the form, we
need to keep this statement.
<HR>
<FORM ACTION="/cgi-bin/survey.pl?cgi_cookie=<!--#insert var="COOKIE"-->&cgi_form_num=<!--#insert var="NUMBER"-->" METHOD="POST">
This can be modified to:
<FORM ACTION="/cgi-bin/survey.pl" METHOD="POST">
<INPUT TYPE="hidden" NAME="cgi_cookie" VALUE="<!--#insert var="COOKIE"-->">
<INPUT TYPE="hidden" NAME="cgi_form_num" VALUE="<!--#insert var="NUMBER"-->">
The program described above will replace the CSI statements
with appropriate information.
<PRE>
Full Name: <INPUT TYPE="text" NAME="01 Full Name" SIZE=40>
E-Mail: <INPUT TYPE="text" NAME="02 EMail Address" SIZE=40>
</PRE>
<P>
Which survey would you like to fill out: <BR>
<INPUT TYPE="radio" NAME="cgi_survey" VALUE="Television" CHECKED>Television<BR>
<INPUT TYPE="radio" NAME="cgi_survey" VALUE="Movie">Movies<BR>
<P>
<INPUT TYPE="submit" VALUE="Submit the survey">
<INPUT TYPE="reset" VALUE="Clear all fields">
</FORM>
<HR>
</BODY></HTML>
There is really no advantage to using this technique over
the original one, as the two are nearly identical. If you use this
method, you can remove the following lines from the parse_form_data
subroutine:
if ($ENV{'QUERY_STRING'}) {
$query_string = join("&", $query_string, $ENV{'QUERY_STRING'});
}
There is no need to store any query
information.