Chapter 2
Input to the Common Gateway Interface
Finally, let's get to form input. We mentioned forms briefly in Chapter 1,
The Common Gateway Interface, and we'll cover
them in more detail in Chapter 4, Forms and CGI.
Here, we just want to introduce you to the basic concepts behind
forms.
As we described in Chapter 1, forms provide a way to get input
from users and supply it to a CGI program, as shown in Figure 2.1.
The Web browser allows the user to select or type in information,
and then sends it to the server when the Submit button is pressed.
In this chapter, we'll talk a little about how the CGI program accesses
the form input.
One
way to send form data to a CGI program is by appending the form
information to the URL, after a question mark. You may have seen
URLs like the following:
https://some.machine/cgi-bin/name.pl?fortune
Up to the
question mark (?), the URL should look
familiar. It is merely a CGI script being called, by the name name.pl.
What's new here is the part after the "?". The information
after the "?" character is known as a query string.
When the server is passed a URL with a query string, it calls the
CGI program identified in the first part of the URL (before the
"?") and then stores the part after the "?" in the environment
variable QUERY_STRING. The
following is a CGI program called name.pl that uses query information
to execute one of three possible UNIX commands.
#!/usr/local/bin/perl
print "Content-type: text/plain", "\n\n";
$query_string = $ENV{'QUERY_STRING'};
if ($query_string eq "fortune") {
    print `/usr/local/bin/fortune`;
} elsif ($query_string eq "finger") {
    print `/usr/ucb/finger`;
} else {
    print `/usr/local/bin/date`;
}
exit (0);
You can execute this script as either:
https://some.machine/cgi-bin/name.pl?fortune
https://some.machine/cgi-bin/name.pl?finger
or
https://some.machine/cgi-bin/name.pl
and you will get different output. The CGI program executes
the appropriate system command (using backticks) and the results
are sent to standard output. In Perl, you can use backticks to capture
the output from a system command.
You
should always be very careful when executing any type of system
commands in CGI applications, because of possible security problems.
You should never do something like this:
print `$query_string`;
The danger is that a diabolical user can enter a dangerous
system command, such as:
rm -fr /
which can delete everything on your system.
Nor should you expose any system data, such as a list of system
processes, to the outside world.
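One way to keep user input away from the shell is to treat the query string purely as a lookup key into a fixed table of programs. The following is a minimal sketch of that idea, not a complete defense. The %allowed table and the command_for subroutine are hypothetical names introduced here for illustration, and the paths are simply the ones used in name.pl.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Hypothetical allow-list: the only programs this script will ever run.
my %allowed = (
    fortune => '/usr/local/bin/fortune',
    finger  => '/usr/ucb/finger',
    date    => '/usr/local/bin/date',
);

# Map the user's text to a path from the table, falling back to date.
# The user's text is only ever used as a hash key, never executed.
sub command_for {
    my ($name) = @_;
    return $allowed{date} unless defined $name && exists $allowed{$name};
    return $allowed{$name};
}

# GATEWAY_INTERFACE is set by the server for CGI requests, so the command
# runs only when this file is invoked as a CGI program.
if (defined $ENV{'GATEWAY_INTERFACE'}) {
    print "Content-type: text/plain", "\n\n";
    my $path = command_for($ENV{'QUERY_STRING'});
    print `$path`;    # $path comes from %allowed, not from the user
}
```

With this arrangement, a query string such as "rm -fr /" matches no key in the table, so the script falls back to running date.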
Although the previous example will work, the following example
is a more realistic illustration of how forms work with CGI. Instead
of supplying the information directly as part of the URL, we'll
use a form to solicit it from the user.
(Don't worry about the HTML tags needed
to create the form; they are covered in detail in Chapter 4, Forms and CGI.)
<HTML>
<HEAD><TITLE>Simple Form!</TITLE></HEAD>
<BODY>
<H1>Simple Form!</H1>
<HR>
<FORM ACTION="/cgi-bin/unix.pl" METHOD="GET">
Command: <INPUT TYPE="text" NAME="command" SIZE=40>
<P>
<INPUT TYPE="submit" VALUE="Submit Form!">
<INPUT TYPE="reset" VALUE="Clear Form">
</FORM>
<HR>
</BODY>
</HTML>
Since this is HTML, the appearance of the
form depends on what browser is being used. Figure 2.2 shows what
the form looks like in Netscape.
This form consists of one text field titled "Command:" and
two buttons. The Submit Form! button is used to send the information
in the form to the CGI program specified by the ACTION
attribute. The Clear Form button clears the information in the field.
The METHOD=GET attribute
to the <FORM> tag in part determines how the data is passed
to the server. We'll talk more about different methods soon, but
for now, we'll use the default method, GET. Now,
assuming that the user enters "fortune" into the text field, when
the Submit Form! button is pressed the browser sends the following
request to the server:
GET /cgi-bin/unix.pl?command=fortune HTTP/1.0
.
. (header information)
.
The server executes the script called unix.pl
in the cgi-bin directory, and places the string "command=fortune"
into the QUERY_STRING environment variable. Think
of this as assigning the variable "command" (specified by the NAME
attribute to the <INPUT> tag) with the string supplied by the
user, "fortune".
Let's go through the simple unix.pl CGI program that handles
this form:
#!/usr/local/bin/perl
print "Content-type: text/plain", "\n\n";
$query_string = $ENV{'QUERY_STRING'};
($field_name, $command) = split (/=/, $query_string);
After printing the content type (text/plain
in this case, since the UNIX programs are unlikely
to produce HTML output) and getting the query
string from the %ENV associative array, we use the split
function to separate the query string on the "=" character into
two parts, with the first part before the equal sign in $field_name,
and the second part in $command. In this case,
$field_name will contain "command" and $command
will contain "fortune". Now, we're ready to execute the UNIX
command:
if ($command eq "fortune") {
    print `/usr/local/bin/fortune`;
} elsif ($command eq "finger") {
    print `/usr/ucb/finger`;
} else {
    print `/usr/local/bin/date`;
}
exit (0);
Since we used the GET method, all the form
data is included in the URL. So we can directly access this program
without the form, by using the following URL:
https://some.machine/cgi-bin/unix.pl?command=fortune
It will work exactly as if you had filled out the form and
submitted it.
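The split on "=" used in unix.pl works for this form, but a value that itself contains an equal sign would be cut short. The sketch below shows one possible refinement, assuming a single name=value pair as in this example. The parse_field subroutine is a hypothetical helper, not part of the original program.

```perl
use strict;
use warnings;

# Hypothetical helper: split "name=value" into its two parts.
# The limit of 2 keeps any further "=" characters inside the value.
sub parse_field {
    my ($query_string) = @_;
    $query_string = '' unless defined $query_string;
    my ($field_name, $value) = split(/=/, $query_string, 2);
    # A missing name or value is returned as an empty string, not undef.
    $field_name = '' unless defined $field_name;
    $value      = '' unless defined $value;
    return ($field_name, $value);
}

my ($field_name, $command) = parse_field('command=fortune');
print "$field_name -> $command\n";    # prints "command -> fortune"
```

Returning empty strings instead of undefined values means the later eq comparisons do not trigger warnings when the script is called with no query string at all.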
In the previous example, we used the GET
method to process the form. However, there is another method we
can use, called POST. Using the POST method, the server sends the data
as an input stream to the program. That is, if in the previous example
the <FORM> tag had read:
<FORM ACTION="/cgi-bin/unix.pl" METHOD="POST">
the following request would be sent to the server:
POST /cgi-bin/unix.pl HTTP/1.0
.
. (header information)
.
Content-length: 15
command=fortune
The version of unix.pl that handles the form with POST
data follows. First, since the server passes information to this
program as an input stream, it sets the environment variable
CONTENT_LENGTH to the size
of the data in number of bytes (or characters). We can use this
to read exactly that much data from standard input.
#!/usr/local/bin/perl
$size_of_form_information = $ENV{'CONTENT_LENGTH'};
Second, we read the number of bytes, specified by $size_of_form_information,
from standard input into the variable $form_info.
read (STDIN, $form_info, $size_of_form_information);
Now we can split the $form_info variable
into a $field_name and $command,
as we did in the GET version of this example.
As with the GET version, $field_name
will contain "command," and $command will contain
"fortune" (or whatever the user typed in the text field). The rest
of the example remains unchanged:
($field_name, $command) = split (/=/, $form_info);
print "Content-type: text/plain", "\n\n";
if ($command eq "fortune") {
    print `/usr/local/bin/fortune`;
} elsif ($command eq "finger") {
    print `/usr/ucb/finger`;
} else {
    print `/usr/local/bin/date`;
}
exit (0);
Since it's the form that determines whether the GET
or POST method is used, the CGI programmer can't
control which method the program will be called by. So scripts are
often written to support both methods. The following example will
work with both methods:
#!/usr/local/bin/perl
$request_method = $ENV{'REQUEST_METHOD'};
if ($request_method eq "GET") {
    $form_info = $ENV{'QUERY_STRING'};
} else {
    $size_of_form_information = $ENV{'CONTENT_LENGTH'};
    read (STDIN, $form_info, $size_of_form_information);
}
($field_name, $command) = split (/=/, $form_info);
print "Content-type: text/plain", "\n\n";
if ($command eq "fortune") {
    print `/usr/local/bin/fortune`;
} elsif ($command eq "finger") {
    print `/usr/ucb/finger`;
} else {
    print `/usr/local/bin/date`;
}
exit (0);
The environment variable
REQUEST_METHOD
contains the request method used by the form. In this example, the
only new thing we did was check the request method and then assign
the $form_info variable as needed.
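The method check can also be packaged as a subroutine, which makes it easier to try out without a Web server. The following is a sketch under a few assumptions: raw_form_data is a hypothetical name, the environment is passed in as a hash reference shaped like %ENV, and the input is passed in as a filehandle. It also loops on read, since a single call may return fewer bytes than requested.

```perl
use strict;
use warnings;

# Hypothetical helper: return the raw form data for a GET or POST request.
# $env is a reference to a hash shaped like %ENV; $in is the input filehandle.
sub raw_form_data {
    my ($env, $in) = @_;
    my $method = $env->{'REQUEST_METHOD'} || '';

    if ($method eq 'GET') {
        return defined $env->{'QUERY_STRING'} ? $env->{'QUERY_STRING'} : '';
    }
    if ($method eq 'POST') {
        # Accept only a plain non-negative integer as the length.
        my $length = ($env->{'CONTENT_LENGTH'} || '') =~ /^(\d+)$/ ? $1 : 0;
        my $data   = '';
        # Keep reading until $length bytes have arrived or input runs out.
        while (length($data) < $length) {
            my $got = read($in, $data, $length - length($data), length($data));
            last unless $got;    # 0 at end of input, undef on error
        }
        return $data;
    }
    return '';    # any other method: no form data
}

# Demonstration with an in-memory filehandle standing in for STDIN.
my $body = 'command=fortune';
open(my $fake_stdin, '<', \$body) or die "Cannot open in-memory file: $!";
my %fake_env = (REQUEST_METHOD => 'POST', CONTENT_LENGTH => length($body));
print raw_form_data(\%fake_env, $fake_stdin), "\n";    # prints "command=fortune"
```

In a real CGI program the call would be raw_form_data(\%ENV, \*STDIN), and the result would be split on "=" exactly as before.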
So far, we've
shown an example for retrieving very simple form information. However,
form information can get complicated. Since under the GET
method the form information is sent as part of the URL, there can't
be any spaces or other special characters that are not allowed in
URLs. Therefore, some special encoding is used. We'll talk more
about this in Chapter 4, Forms and CGI, but for now
we'll show a very simple example. First the HTML
needed to create a form:
<HTML>
<HEAD><TITLE>When's your birthday?</TITLE></HEAD>
<BODY>
<H1>When's your birthday?</H1>
<HR>
<FORM ACTION="/cgi-bin/birthday.pl" METHOD="POST">
Birthday (in the form of mm/dd/yy): <INPUT TYPE="text" NAME="birthday" SIZE=40>
<P>
<INPUT TYPE="submit" VALUE="Submit Form!">
<INPUT TYPE="reset" VALUE="Clear Form">
</FORM>
<HR>
</BODY>
</HTML>
When the user submits the form, the client issues the following
request to the server (assuming the user entered 11/05/73):
POST /cgi-bin/birthday.pl HTTP/1.0
.
. (header information)
.
Content-length: 21
birthday=11%2F05%2F73
In the encoded form, certain characters are replaced by a percent sign
followed by their two-digit hexadecimal ASCII code; the slash, for example,
becomes "%2F". (A space is a special case and is sent as a plus sign.)
In this example, our program needs to "decode" this data, by converting
the "%2F" to "/".
Here is the CGI program, birthday.pl, that
handles this form:
#!/usr/local/bin/perl
$size_of_form_information = $ENV{'CONTENT_LENGTH'};
read (STDIN, $form_info, $size_of_form_information);
The following complicated-looking regular expression is used
to "decode" the data (see Chapter 4, Forms and CGI for
a comprehensive explanation of how this works).
$form_info =~ s/%([\dA-Fa-f][\dA-Fa-f])/pack ("C", hex ($1))/eg;
In the case of this example, it will turn "%2F" into "/".
The rest of the program should be easy to follow:
($field_name, $birthday) = split (/=/, $form_info);
print "Content-type: text/plain", "\n\n";
print "Hey, your birthday is on: $birthday. That's what you told me, right?", "\n";
exit (0);
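The decoding step in birthday.pl can be isolated in a small subroutine so that it is easy to reuse and test. This is a sketch rather than a full form parser: url_decode is a hypothetical name, and the translation of "+" to a space is an addition to the original program, based on the standard form encoding in which spaces are sent as plus signs.

```perl
use strict;
use warnings;

# Hypothetical helper: decode one URL-encoded value.
sub url_decode {
    my ($text) = @_;
    # Translate "+" first, so that an encoded plus sign ("%2B") survives.
    $text =~ tr/+/ /;
    # Replace each "%XX" with the character whose code is hexadecimal XX.
    $text =~ s/%([\dA-Fa-f][\dA-Fa-f])/pack("C", hex($1))/eg;
    return $text;
}

print url_decode('11%2F05%2F73'), "\n";    # prints "11/05/73"
```

Decoding should happen after the data has been split on "=", so that an encoded equal sign inside a value is not mistaken for the separator.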