The hardest aspect of developing CGI applications
on the Web is the testing/debugging phase. The main reason for the
difficulty is that applications are being run across a network,
with client and server interaction. When there are errors in CGI
programs, it is difficult to figure out where they lie.
In this chapter, we will discuss some of the common errors
in CGI script design, and what you can do to correct them. In addition,
we will look at a debugging/lint tool for CGI applications, called
CGI Lint, written exclusively for this book.
Initially, we will discuss some of the simpler errors found
in CGI application design. Most CGI designers encounter these errors
at one time or another. However, they are extremely easy to fix.
Most servers
require that CGI scripts reside in a special directory (/cgi-bin),
or have certain file extensions. If you try to execute a script
that does not follow the rules for a particular server, the server
will simply retrieve and display the document, instead of executing
it. For example, if you have the following two lines in your NCSA
server resource map configuration file (srm.conf):
ScriptAlias /my-cgi-apps/ /usr/local/bin/httpd_1.4.2/cgi-bin/
AddType application/x-httpd-cgi .cgi .pl
the server will execute only scripts whose URL path either
begins with "/my-cgi-apps/" or ends with a file extension of .pl
or .cgi. Take a look at the following URLs
and figure out which ones the server will try to execute:
https://some.machine.com/cgi-bin/clock.tcl
https://my.machine.edu/my-cgi-apps/clock.pl
https://your.machine.org/index.cgi
https://their.machine.net/cgi-bin/animation.pl
If you picked the last three, then you are correct! Let's
look at why this is so. The first one will not get executed because
the script is neither in a recognized directory (my-cgi-apps), nor
does it have a valid extension (.cgi or .pl).
The second one refers to the correct CGI directory, while the last
two have valid extensions.
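To make the rule concrete, here is a minimal Perl sketch that applies the two tests to the URLs listed above. It only approximates what the server does. The subroutine name would_execute and the variables $script_alias and @cgi_extensions are our own inventions for illustration, not part of the NCSA server, and the directory test assumes that the URL path must begin with the ScriptAlias prefix.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical values mirroring the two srm.conf lines shown earlier.
my $script_alias   = '/my-cgi-apps/';
my @cgi_extensions = ('cgi', 'pl');

# Return 1 if the URL would be treated as a CGI program, 0 otherwise.
sub would_execute {
    my ($url) = @_;

    # Strip the scheme and host name, leaving only the path.
    (my $path = $url) =~ s{^\w+://[^/]+}{};

    # Test 1: the path lies under the ScriptAlias directory.
    return 1 if index($path, $script_alias) == 0;

    # Test 2: the file name ends in one of the CGI extensions.
    foreach my $ext (@cgi_extensions) {
        return 1 if $path =~ /\.\Q$ext\E$/;
    }
    return 0;
}

foreach my $url (qw(
    https://some.machine.com/cgi-bin/clock.tcl
    https://my.machine.edu/my-cgi-apps/clock.pl
    https://your.machine.org/index.cgi
    https://their.machine.net/cgi-bin/animation.pl
)) {
    printf "%-50s %s\n", $url,
        would_execute($url) ? 'executed' : 'not executed';
}
```

Only the first URL fails both tests, which matches the answer given above.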
If your CGI
application is a script of some sort (C shell, Perl, etc.), its
first line must begin with #! (a "sharp-bang," or "shebang"), followed
by the path to the interpreter, or else the server will not know what
interpreter to call to execute the script. You don't have to worry
about this if your CGI program is written in C/C++, or any other
language that creates a binary.
This leads us to another closely related problem, as we will soon
see.
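If you want to check this before installing a script, a few lines of Perl will do. The following is only a sketch: the subroutine name interpreter_of is hypothetical, and it examines a line of text that you pass in (normally the first line of the script) rather than opening the file itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Given the first line of a script, return the interpreter path named
# after "#!", or undef if the line is not a #! line.
sub interpreter_of {
    my ($first_line) = @_;
    return $first_line =~ m{^#!\s*(\S+)} ? $1 : undef;
}

# Two sample first lines: one with a #! line, one without.
foreach my $line ("#!/usr/local/bin/perl\n", "print \"Hello\";\n") {
    my $interp = interpreter_of($line);
    print defined $interp ? "interpreter: $interp\n"
                          : "no #! line found\n";
}
```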
The CGI script
must be executable by the server. Most servers are set up to run
with the user identification (UID) of "nobody," which means that
your scripts have to be world executable. The reason for this is
that "nobody"
has minimal privileges. You can check the permissions of your script
on UNIX systems by using the ls
command:
% ls -ls /usr/local/bin/httpd_1.4.2/cgi-bin/clock.pl
4 -rwx------ 1 shishir 3624 Aug 17 17:59 clock.pl*
The second field lists the permissions for the file. This
field is divided into three parts: the privileges for the owner,
the group, and the world (from left to right), with the first letter
indicating the type of the file: either a regular file, or a directory.
In this example, the owner has sole permission to read, write, and
execute the script.
If you want the server (running as "nobody") to be able to
execute this script, you have to issue the following command:
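% chmod 755 clock.pl
% ls -ls clock.pl
4 -rwxr-xr-x 1 shishir 3624 Aug 17 17:59 clock.pl*
The chmod
command modifies the permissions for the file. The octal code of
755 indicates read (octal 4), write (octal 2), and execute (octal
1) permissions for the owner, and read and execute permissions
(4 + 1 = 5) for group members and all other users. A script needs
read permission as well as execute permission, because the
interpreter has to be able to read the file.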
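The octal arithmetic is easy to verify for yourself. The sketch below decodes a few permission values into the rwx notation that ls prints, and reports whether users other than the owner and group can read and execute the file. The subroutine names mode_string and world_can_run are our own, and the three modes in the loop are sample values rather than output taken from a real file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Convert the low nine permission bits into a string such as "rwxr-xr-x".
sub mode_string {
    my ($mode) = @_;
    my $out = '';
    foreach my $shift (6, 3, 0) {          # owner, group, world
        my $bits = ($mode >> $shift) & 7;
        $out .= ($bits & 4 ? 'r' : '-')
              . ($bits & 2 ? 'w' : '-')
              . ($bits & 1 ? 'x' : '-');
    }
    return $out;
}

# Return 1 if "other" users have both read and execute permission.
sub world_can_run {
    my ($mode) = @_;
    return (($mode & 0005) == 0005) ? 1 : 0;
}

foreach my $mode (0700, 0711, 0755) {
    printf "%04o  %s  world read+execute: %s\n",
        $mode, mode_string($mode), world_can_run($mode) ? 'yes' : 'no';
}
```

For a real file, you can obtain the mode with (stat($file))[2] & 07777 and pass the result to either subroutine.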
All CGI applications must output a valid HTTP
header, followed by a blank line, before any other data. In other
words, two newline characters have to be output after the header.
Here is how the output should look:
Content-type: text/html

<HTML>
<HEAD><TITLE>Output from CGI Script</TITLE></HEAD>
.
.
.
The headers must be output before any other data, or the server
will generate a server error with a status of 500. So make it a
habit to output this data as early in the script as possible. To
make it easier for yourself, you can use a subroutine like the following
to output the correct information:
sub output_MIME_header
{
    # Print the Content-type header, followed by the required blank line.
    my ($type) = @_;
    print "Content-type: ", $type, "\n\n";
}
Just remember to call it at the beginning of your program
(before you output anything else). Another problem related to this
topic has to do with how the script executes. If the CGI program
has errors, then the interpreter, or compiler, will produce an error
message when trying to execute the program. These error messages
will inevitably be output before the HTTP header,
and the server will complain.
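One way to catch this kind of problem is to capture what the script prints and inspect the beginning of it. The following sketch assumes that you already have the output in a string; the subroutine name has_valid_header is hypothetical, and it looks only for a Content-type header, although CGI programs may begin with other headers as well.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return 1 if $output starts with a Content-type header and the header
# block is terminated by a blank line; return 0 otherwise.
sub has_valid_header {
    my ($output) = @_;

    # Everything up to the first blank line is the header block.
    return 0 unless $output =~ /\A(.*?)\r?\n\r?\n/s;
    my $headers = $1;

    # The very first line must be the Content-type header.
    return ($headers =~ m{^Content-type:\s*\S+/\S+}i) ? 1 : 0;
}

my %samples = (
    'header first'        => "Content-type: text/html\n\n<HTML>\n",
    'missing blank line'  => "Content-type: text/html\n<HTML>\n",
    'error before header' => "syntax error in file clock.pl\n"
                           . "Content-type: text/html\n\n<HTML>\n",
);

foreach my $name (sort keys %samples) {
    printf "%-20s %s\n", $name,
        has_valid_header($samples{$name}) ? 'ok' : 'invalid';
}
```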
What is the moral of this? Make sure you check your script
from the command line before you try to execute it on the Web. If
you are using Perl, you can use the -wc switch
to check for syntax errors:
% perl -wc clock.pl
syntax error in file clock.pl at line 9, at EOF
clock.pl had compilation errors.
If there are no errors (but there are warnings), the Perl
interpreter will display the following:
% perl -wc clock.pl
Possible typo: "opt_g" at clock.pl line 9.
Possible typo: "opt_u" at clock.pl line 9.
Possible typo: "opt_f" at clock.pl line 9.
clock.pl syntax OK
Warnings indicate such things as possible typing errors or
use of uninitialized variables. Most of the time, these warnings
are benign, but you should still take the time to look into them.
Finally, if there are no warnings or errors to be displayed, Perl
will output the following:
% perl -wc clock.pl
clock.pl syntax OK
So it is extremely important to make sure the script compiles
and runs without any errors on the command line before trying it out
on the Web.