Chapter 2
Input to the Common Gateway Interface
You now know the basics of how to handle and manipulate CGI input
in Perl. If you haven't guessed by now, this book concentrates
primarily on examples in Perl, since Perl is relatively easy to
follow, runs on all three major platforms, and also happens to be
the most popular language for CGI. However, CGI programs can be
written in many other languages, so before we continue, let's see
how to accomplish similar things in a few of them: C/C++, the
C Shell, and Tcl.
Here is a CGI program written in C (but that will also compile under
C++) that parses the HTTP_USER_AGENT environment variable and outputs
a message, depending on the type of browser:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main (void)
{
    char *http_user_agent;

    /* The header and the blank line after it must come before any other output. */
    printf ("Content-type: text/plain\n\n");

    /* getenv returns NULL if the variable is not set. */
    http_user_agent = getenv ("HTTP_USER_AGENT");

    if (http_user_agent == NULL) {
        printf ("Oops! Your browser failed to set the HTTP_USER_AGENT ");
        printf ("environment variable!\n");
    } else if (!strncmp (http_user_agent, "Mosaic", 6)) {
        printf ("I guess you are sticking with the original, huh?\n");
    } else if (!strncmp (http_user_agent, "Mozilla", 7)) {
        printf ("Well, you are not alone. A majority of the people are ");
        printf ("using Netscape Navigator!\n");
    } else if (!strncmp (http_user_agent, "Lynx", 4)) {
        printf ("Lynx is great, but go get yourself a graphic browser!\n");
    } else {
        printf ("I see you are using the %s browser.\n", http_user_agent);
        printf ("I don't think it's as famous as Netscape, Mosaic or Lynx!\n");
    }

    exit (0);
}
The getenv function returns the value of the environment variable,
which we store in the http_user_agent variable (it's actually a
pointer to a string, but don't worry about this terminology). If the
variable is not set, getenv returns NULL, which is why we check for
that case first. Then, we compare the value in this variable to some
of the common browser names with the strncmp function. This function
does not search the string. It compares the two strings character by
character from the beginning, and stops after the number of
characters given in its third argument.
You might wonder why we're comparing only part of the string. The
reason is that the value of the HTTP_USER_AGENT environment variable
generally begins with the browser's name and continues with a version
number and other details. For Lynx, for example, the value starts
with "Lynx/". In this case, we need to compare only the first four
characters with the string "Lynx" in order to determine that the
browser being used is Lynx. If they match, the strncmp function
returns a value of zero, and we display the appropriate message.
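The prefix comparison described above can be pulled out into a small
helper. What follows is only a sketch: the function names starts_with
and browser_label are hypothetical and do not appear in the program
above, and the labels returned are chosen purely for illustration.

```c
#include <assert.h>
#include <string.h>

/* Return 1 if text begins with prefix, 0 otherwise.
   strncmp compares at most strlen(prefix) characters, so anything
   that follows the prefix in text is ignored. */
int starts_with(const char *text, const char *prefix)
{
    assert(text != NULL && prefix != NULL);
    return strncmp(text, prefix, strlen(prefix)) == 0;
}

/* Hypothetical helper: map a User-Agent value to a short label.
   A NULL argument stands for an unset HTTP_USER_AGENT variable. */
const char *browser_label(const char *user_agent)
{
    if (user_agent == NULL)
        return "unknown";
    if (starts_with(user_agent, "Mosaic"))
        return "Mosaic";
    if (starts_with(user_agent, "Mozilla"))
        return "Netscape Navigator";
    if (starts_with(user_agent, "Lynx"))
        return "Lynx";
    return "other";
}
```

A CGI program could pass the result of getenv ("HTTP_USER_AGENT")
straight to browser_label. Using strlen on the prefix, rather than
writing 6, 7, or 4 by hand, keeps the length from drifting out of
step with the browser name.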
The C Shell has some serious limitations and therefore is not
recommended for any type of CGI application. In fact, UNIX guru
Tom Christiansen has written a FAQ titled "Csh Programming
Considered Harmful" detailing the C Shell's problems. Here is a
small excerpt from the document:
The csh is seductive because the conditionals
are more C-like, so the path of least resistance is chosen and a
csh script is written. Sadly, this is a lost cause, and the programmer
seldom even realizes it, even when they find that many simple things
they wish to do range from cumbersome to impossible in the csh.
However, for the sake of completeness, here is a simple shell script
that is equivalent to the first unix.pl Perl program discussed
earlier:
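#!/bin/csh
# Send the header, then the blank line that ends it.
echo "Content-type: text/plain"
echo ""
if ($?QUERY_STRING) then
    # Use awk to pull out the text on the right side of the "=".
    set command = `echo "$QUERY_STRING" | awk 'BEGIN {FS = "="} { print $2 }'`
    # Quote $command so that an empty value does not break the comparison.
    if ("$command" == "fortune") then
        /usr/local/bin/fortune
    else if ("$command" == "finger") then
        /usr/ucb/finger
    else
        /usr/local/bin/date
    endif
else
    /usr/local/bin/date
endif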
The C Shell has no built-in functions or operators for splitting
a string on a delimiter. So we have no choice but to use another
UNIX utility, such as awk, to split the query string and return the
data on the right side of the equal sign. Depending on the input
from the user, one of several UNIX utilities is called to output
some information.
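For comparison, the step that the script hands off to awk can be
done with the standard C string functions. This is a minimal sketch,
not a drop-in replacement for the script: query_value and utility_for
are hypothetical names, the 32-byte buffer is an arbitrary choice,
and the utility paths are simply copied from the script above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the text after the first '=' in query, up to the next '=' or
   the end of the string, into out. This mirrors awk's $2 with FS set
   to "=". Returns 1 on success, or 0 if query is NULL, contains no
   '=', or the value does not fit in out. */
int query_value(const char *query, char *out, size_t out_size)
{
    const char *eq;
    size_t len;

    assert(out != NULL && out_size > 0);
    if (query == NULL)
        return 0;
    eq = strchr(query, '=');
    if (eq == NULL)
        return 0;
    len = strcspn(eq + 1, "=");
    if (len >= out_size)
        return 0;
    memcpy(out, eq + 1, len);
    out[len] = '\0';
    return 1;
}

/* Choose which utility to run. Only the two known commands are
   accepted, and everything else falls back to date. */
const char *utility_for(const char *query)
{
    char command[32];

    if (query_value(query, command, sizeof command)) {
        if (strcmp(command, "fortune") == 0)
            return "/usr/local/bin/fortune";
        if (strcmp(command, "finger") == 0)
            return "/usr/ucb/finger";
    }
    return "/usr/local/bin/date";
}
```

A program built on this sketch would pass getenv ("QUERY_STRING") to
utility_for and then run the path it returns. Because the user's
input is only ever compared against fixed strings, it never becomes
part of a command line.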
You may notice that the variable QUERY_STRING is exposed to the
shell. Generally, this is very dangerous because users can embed
shell metacharacters. However, in this case, the C Shell substitutes
the variable only after the backquoted (``) command line has been
parsed into separate commands, so metacharacters in the value are
not treated as command separators. If things happened in the reverse
order, we could potentially have a major headache!
The following Tcl program uses an environment variable that we
haven't yet discussed. The HTTP_ACCEPT variable contains a list of
all of the MIME content types that a browser can accept and handle.
A typical value of this variable might look like this:
application/postscript, image/gif, image/jpeg, text/plain, text/html
You can use this information to return different types of
data from your CGI document to the client. The program below parses
this accept list and outputs each MIME type on a different line:
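#!/usr/local/bin/tclsh
# The trailing \n plus the newline that puts adds ends the header with a blank line.
puts "Content-type: text/plain\n"
set http_accept $env(HTTP_ACCEPT)
set browser $env(HTTP_USER_AGENT)
puts "Here is a list of the MIME types that the client, which"
puts "happens to be $browser, can accept:\n"
# Split the accept list on commas.
set mime_types [split $http_accept ,]
foreach type $mime_types {
    # Trim the space that follows each comma in the list.
    puts "- [string trim $type]"
}
exit 0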
As in Perl, the split command splits a string on a specified
delimiter. In Tcl, it returns the resulting substrings as a list,
so in this case mime_types holds each MIME type from the accept
list. Once that's done, the foreach loop iterates through the list,
displaying each element.
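C has no direct counterpart to split, but the same result can be
approximated with strcspn. The sketch below is one possible approach
under a few assumptions: split_accept and MAX_TYPE_LEN are
hypothetical names, the 64-byte limit per entry is arbitrary, and
entries that are empty or too long are silently skipped.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_TYPE_LEN 64   /* hypothetical limit on one MIME type's length */

/* Split a comma-separated accept list into entries with surrounding
   whitespace removed. Stores at most max_types entries in types and
   returns the number stored. */
size_t split_accept(const char *accept, char types[][MAX_TYPE_LEN],
                    size_t max_types)
{
    size_t count = 0;
    const char *p = accept;

    assert(accept != NULL);
    while (count < max_types) {
        size_t len = strcspn(p, ",");   /* up to the next comma or the end */
        const char *start = p;
        const char *end = p + len;

        /* Trim whitespace on both sides of the entry. */
        while (start < end && isspace((unsigned char) *start))
            start++;
        while (end > start && isspace((unsigned char) end[-1]))
            end--;

        if (end > start && (size_t) (end - start) < MAX_TYPE_LEN) {
            memcpy(types[count], start, (size_t) (end - start));
            types[count][end - start] = '\0';
            count++;
        }
        if (p[len] == '\0')
            break;                      /* no more commas */
        p += len + 1;                   /* step past the comma */
    }
    return count;
}
```

A CGI program could call split_accept on the value of
getenv ("HTTP_ACCEPT"), after checking that it is not NULL, and then
print each entry in a loop, much as the Tcl program does with
foreach.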