|
|
Chapter 9
Gateways, Databases, and Search/Index Utilities |
|
So far in this chapter, we
have created pretty simple gateways by piping input to other programs.
As long as we create the proper stream of data, it takes very little
coding on our part to make these work. But the most interesting
uses for gateways involve large, complex sets of data organized
into structured databases. Piping a stream does not work for these;
we need to use a language that the database understands, such as
SQL. This is where we turn in this section.
By combining the power of relational database management systems
(RDBMS) with the World Wide Web, one can produce
impressive results. Put yourself in the shoes of a doctor who wants
to establish an "interactive, virtual infirmary" on the Web, where
users (patients) can simply enter their symptoms, and the CGI would
return with a diagnosis. The doctor has a large database that contains
extensive data, including three-dimensional graphics and multimedia,
on the various diseases and ailments that affect humans. All that
needs to be done is to write a CGI program that decodes the input,
assembles a query of some sort, sends it to the database, processes
the return data from the database, and creates a hypertext document
(with embedded multimedia) for the user. These types of applications
are possible by combining databases with the Web.
Before we go any further, let's look at SQL, the query language
used to communicate with various RDBMS systems.
SQL--pronounced "S Q L" and not
"Sequel"--is a standardized sub-language to access and manipulate
data within a relational database system. The original SQL prototype
defined a "structured" language, thus the term Structured Query
Language, but this is no longer true of the current SQL-92 standard.
SQL was designed specifically to be used in conjunction with a primary
high-level programming language. In fact, most of the basic constructs
that you would find in a high-level language, such as loops and
conditionals, do not exist in SQL.
Most of the commercial relational database systems in use
today, such as Oracle and Sybase, support SQL. As a result, the
code to access and manipulate a database can be ported easily and
quickly to any platform. Now, let's look at SQL.
We will start out
by discussing how a database is created. Suppose you have the following
information:
The SQL code to create this database is:
create table Player_Info
(
Player character varying (30) not null,
Years integer,
Points integer,
Rebounds integer,
Assists integer,
Championships integer
);
The create table command creates a database,
or a table. The Player field is stored as a
non-null varying character string. In other words, if the data in
the field is less than 30 characters, the database will not pad
it with spaces, as it would for a regular character data type. Also,
the database forces the user to enter a value for the Player
field; it cannot be empty.
The rest of the fields are defined to be integers. Some of
the other valid data types include date, time,
smallint, numeric, and
decimal. The numeric and
decimal data types allow you to specify floating-point
values. For example, if you want a five-digit floating-point number
with a precision to the hundredth place, you can specify decimal
(5, 2).
Let's say you want a list of the entire database. You can
use the following code:
select *
from Player_Info;
The select command retrieves specific
information from the database. In this case, all columns are selected
from the Player_Info database. The "*" should
be used with great caution, especially on large databases, as you
might inadvertently extract a lot of information. Notice that we
are dealing only with columns, and not rows. For example, if you
wanted to list all the players in the database, you could do this:
select Player
from Player_Info;
Now, what if you want to list all the players who scored more
than 25 points? Here is the code needed to accomplish the task:
select *
from Player_Info
where (Points > 25);
This would list all the columns for the players who scored
more than 25 points:
But, say you wanted to list just the Player
and Points columns:
select Player, Points
from Player_Info
where (Points > 25);
Here is an example that returns all the players who scored
more than 25 points and won a championship:
select Player, Points, Championships
from Player_Info
where (Points > 25) and
(Championships > 0);
The output of this SQL statement would be:
You could also use wildcards in a select
command. For example, the following will return all the players
that have a last name of "Johnson":
select *
from Player_Info
where Player LIKE '% Johnson';
This will match a string ending with "Johnson".
Let's suppose that
Shaquille O'Neal won a championship. We need to update our database
to reflect this. This is how it can be done:
update Player_Info
set Championships = 1
where Player = 'Shaquille O'Neal';
SQL also has methods to modify entire columns. After every
basketball season, we need to increment the Years
column by one:
update Player_Info
set Years = (Years + 1);
Now, let's discuss insertion into a table. Say we need to
add another player to the database. We could do it this way:
insert into Player_Info
values
('Hakeem Olajuwon', 10, 27, 11, 4, 2);
As you can see, it is very simple to insert an element into
the table. However, if you have a database with a large number of
columns, and you want to insert a row into the table, you can manually
specify the columns:
insert into Player_Info
(Player, Years, Points, Rebounds, Assists, Championships)
values
('Hakeem Olajuwon', 10, 27, 11, 4, 2);
When used in this context, the order of the fields does not
necessarily have to match the order in the database, as long as
the fields and the values specified match each other.
If you wanted
to delete "John Stockton" from the database, you could do this:
delete from Player_Info
where Player = 'John Stockton';
SQL also allows you remove entire columns. You should be very
careful when attempting such a move. Instead, it is much safer to
create another database, and copy only the columns you want to the
new database. Here is how you would remove a column:
alter table Player_Info
drop column Championships;
If you want to delete all the records in the table, the following
statement is used:
And finally, the drop table command deletes
the entire database:
(For more information on SQL, see the reference guide on SQL-92
at https://sunsite.doc.ic.ac.uk/packages/perl/db/refinfo/sql2/sql1992.txt).
Never heard of
Sprite?
That is because I developed it for this book. It is a Perl 5 module
that allows you to manipulate text-delimited databases (all data
and delimiters are text) using a small but important subset of SQL-92.
I offer Sprite so you can create your own databases and access them
in CGI scripts, even if you do not have a database product like
Sybase or Oracle. See Appendix E, Applications, Modules, Utilities, and Documentation for information on where
you can get Sprite.
If you do have a commercial product, you can use techniques
like those shown here to issue SQL commands. We will use some Perl
interfaces to Oracle and Sybase later in the chapter. Let's look
at an example.
Let's assume that you have a text file that contains a list
of your company's employees, as well as some information about them:
Last,First,Job_Title,Department,EMail,Phone
Supra,John,System Operator,Systems,jsupra,(617) 555-1578
Painton,Todd,Network Engineer,Systems,tpainton,(617) 555-6530
Martin,Robert,Sales Representative,Sales,martinr,(617) 555-7406
Levine,Julia,Administrative Assistant,Administration,julia,(617) 555-3056
Keenan,Jeff,Manager,Software,jeffk,(617) 555-7769
Nets,Laurie,Group Leader,Development,lnets,(617) 555-9962
The first line of the file contains the field names (delimited
by commas). This is all you need to use the database. Unlike other
databases that store the data in a unique (and strange) format,
Sprite operates on plain text.
Here is the form that will act as the front end to the database:
<HTML>
<HEAD><TITLE>CGI Corporation</TITLE></HEAD>
<BODY>
<H1>Employee Database</H1>
Welcome to the CGI Corporations's Employee Search Form. You can use
this to find information about one of our employee.
Enter as much information as possible to narrow down the search.
<HR>
<FORM ACTION="/cgi-bin/db_phone.pl" METHOD="POST">
<PRE>
Last Name: <INPUT TYPE="text" NAME="Last" SIZE=40>
First Name: <INPUT TYPE="text" NAME="First" SIZE=40>
Job Title: <INPUT TYPE="text" NAME="Job_Title" SIZE=40>
Department: <INPUT TYPE="text" NAME="Department" SIZE=40>
EMail Address: <INPUT TYPE="text" NAME="EMail" SIZE=40>
Phone Number: <INPUT TYPE="text" NAME="Phone" SIZE=40>
</PRE>
<INPUT TYPE="submit" VALUE="Submit the search">
<INPUT TYPE="reset" VALUE="Clear all fields">
</FORM>
<HR>
</BODY></HTML>
The form is shown in Figure 9.5.
Now, let's build the CGI application that will decode the
form information, process the user's query, and create a document
displaying the results, as seen in Figure 9.6.
The program begins:
#!/usr/local/bin/perl5
use Sprite;
$webmaster = "shishir\@bu\.edu";
$query = undef;
The use
command instructs Perl to load the module (or extension). You can
load more than one module at a time. For example, if we wanted to
create dynamic GIF images from the data contained in a database,
we would have to load both the GD and the Sprite
modules:
To continue with the program:
&parse_form_data(*FORM);
$fields = '(Last|First|Job_Title|Department|EMail|Phone)';
The form data is decoded. The parse_form_data
subroutine used in this program is the one we've been using throughout
this book. The fields variable contains a list
of all the fields in the form. You might wonder why we would need
to have such a list when then the parse_form_data
subroutine decodes all the fields in the form. The reason for this
is to make sure that only valid fields are processed, as the search
query is dynamically created from the user-specified information.
Remember, forms are very insecure; a cracker can download a form,
edit it, add an extra field, and submit the form to the program.
If the program is not carefully designed, we could have a major
problem!
foreach $key (keys %FORM) {
if ( ($key !~ /\b$fields\b/o) || ($FORM{$key} =~ /[^\w\-\(\) ]/) ) {
&return_error (500, "CGI Corporation Employee Database Error",
"Invalid Information in Form.");
The foreach construct iterates through
all of the fields stored in the FORM associative array, and checks
for two things, represented by the two expressions separated by
the || operator. First, the field is checked against the list stored
in the fields variable for validity. Second,
it makes sure the information entered by the user is constrained
to the following characters: A-Z, a-z, 0-9, (, ), and the space
character. This ensures that no shell metacharacters are passed.
} else {
$FORM{$key} =~ s/(\W)/\\$1/g;
if ($FORM{$key}) {
$query = join (" and ", $query, "($key =~ /$FORM{$key}/i)");
}
}
}
The conditional is executed if the field is valid. It checks
to see if any information was entered in the field. If there is
information, a query is built by joining each field and value with
"and". You would normally have to escape the "/" character if you
are using the regular expression search in Sprite. In this case,
you don't need to because the user cannot enter "/" in any search
field.
Once the loop terminates, a query might look something like
the following:
and (Last =~ /Martin/i) and (First =~ /Robert/i) and (Department =~ /Sales/i)
The reason the query has an "and" at the beginning has to
do with the way in which the query was created. If you look back
at the join command, you can see that the information
stored in the query variable is concatenated
to a combination of a key and a value with "and", and is finally
stored in query. Remember, $query
will be undefined the first time through the loop, and thus will
end up with an "and" at the beginning. Let's remove the unwanted
initial string.
if ($query) {
$query =~ s/^ and //;
} else {
&return_error (500, "CGI Corporation Employee Database Error",
"No query was entered.");
}
If the user failed to enter any information, an error message
is displayed. Otherwise, the "and" at the beginning of the query
is removed to create a normal query:
(Last =~ /Martin/i) and (First =~ /Robert/i) and (Department =~ /Sales/i)
Note that Sprite allows you to use regular expression operators
to search for data. If the user entered "M" in the last name field,
this program instructs the database to return all records that contain
the letter "M" (or "m", as the "i" flag indicates case insensitivity).
There are cases when this is not desirable. In such cases, you would
need to modify the way the query is joined:
$FORM{$key} = s/(['"])/\\$1/g;
$query = join (" and ", $query, "($key = '$FORM{$key}')");
This will return only exact matches. Since the value in the
field is a string, you need to enclose $FORM{$key}
in single quotes and escape all other quotes (or Sprite will return
an error).
$rdb = new Sprite ();
$rdb->set_delimiter ("Read", ",");
This is some object-oriented Perl syntax that you saw in Chapter 6, Hypermedia Documents. A new database object is
created, and the reference to it is stored in the variable rdb.
The set_delimiter function sets the delimiter
for the data stored in the database. The set_delimiter
function takes two arguments. In the first, we specify that we are
reading from the database. In the second, we specify the comma as
the field delimiter (so we have to know what the data file looks
like).
@data = $rdb->sql (<<End_of_Query);
select * from phone.db
where $query
End_of_Query
The query is passed to Sprite with the sql function. In this case,
a here document is used to pass the query (so it looks readable
to humans). You could just as easily do this:
@data = $rdb->sql ("select * from phone.db where $query");
Sprite returns the matched records as an array, with all the
fields in each record joined by the null character "\0". However,
the first element of the array is not a record, but a flag indicating
success or failure. For instance, if you passed the following query:
select * from phone.db where (Department =~ /Systems/i)
the array would look like this:
$data[0] = 1
$data[1] = Supra\0John\0System Operator\0Systems\0jsupra\0(617) 555-1578
$data[2] = Painton\0Todd\0Network Engineer\0Systems\0tpainton\0(617) 555-6530
A value of 1 indicates success, while a 0 indicates failure.
$status = shift (@data);
$no_elements = scalar (@data);
The shift statement removes the first
element of the array and stores it in the variable status.
Then scalar is used to determine the number
of elements in the array. You can also evaluate the array in a scalar
context, without using the scalar command:
This is the same as using the scalar
command, but different from:
This returns the index of the last element of the array (so
in most cases, it would have a value one less than the number of
elements, as arrays are zero-based).
if (!$status) {
&return_error (500, "CGI Corporation Employee Database Error",
"Sprite Database Error!");
} elsif (!$no_elements) {
&return_error (500, "CGI Corporation Employee Database Error",
"The record you specified does not exist.");
Two things are checked: the error status and the number of
records returned by Sprite. If either the status is 0 or no records
were returned, an error is displayed.
} else {
print <<End_of_HTML;
Content-type: text/html
<HTML>
<HEAD><TITLE>CGI Corporation Employee Directory</TITLE></HEAD>
<BODY>
<H1>CGI Corporation Employee Directory</H1>
<HR><PRE>
End_of_HTML
This code is executed if valid records were returned by Sprite.
We are now formatting the output for display. One of Perl's original
attractions was the report-generating features it offered; Larry
Wall even said that the "rl" in Perl stood for "Reporting Language."
We will use some of those powerful features here. What we have to
do is create a format and assign it to the $~
variable. Then, whenever we issue a write statement,
Perl will print the data according to the format.
The "HEADING" format is selected to display header information.
$~ = "EACH_ENTRY";
foreach (@data) {
s/([^\w\s\0])/sprintf ("&#%d;", ord ($1))/ge;
($last, $first, $job, $department, $email, $phone) =
split (/\0/, $_, 6);
write;
}
print "</PRE>", "\n";
print "<HR>";
print "</BODY></HTML>", "\n";
}
The "EACH_ENTRY" format is selected to display each record
from the phone database. The foreach loop iterates
through each record, splits it into the different fields, and issues
a write to display the data. Note that no variable
was supplied as part of the foreach loop. Normally,
we would have something like this:
foreach $record (@data) {
.
.
.
}
Since we did not supply a variable, Perl automatically places
it in its default variable: $_.
$rdb->close ();
exit (0);
Finally, the database is closed, and the script terminates.
Now, let's look at the two format statements:
format HEADING =
Last First Job Title Department EMail Phone
---- ----- --------- ---------- ----- -----
.
This is a simple one! It is used as a header to display all
of the fields. The period on a line by itself terminates the format.
format EACH_ENTRY =
@<<<<<<<< @<<<<<<<< @<<<<<<<<<<<< @<<<<<<<<<< @<<<<<<<<< @<<<<<<<<<<<<<
$last, $first, $job, $department, $email, $phone
.
This one is a little more complex. The "@<<<<<<<<"
indicates an eight-character, left-justified field holder. The value
stored in the variable, which is listed below a field holder, is
displayed each time a write is called. This
will allow for a neat and clean display, as shown in Figure 9.7.
A CGI program is not limited to just reading information from
a database; it can also manipulate the information. Here is a CGI
program that can read, modify, and delete a database consisting
of student information. Before we go any further, let's look at
the supporting HTML documents:
<HTML>
<HEAD><TITLE>Welcome to CGI Educational Center</TITLE></HEAD>
<BODY>
<H1>Student Database</H1>
You can use a combination of forms and CGI to access and modify information in
the student database. Please choose one of the following options:
<HR>
<A HREF="/add.html">Add New Student</A><BR>
<A HREF="/modify.html">Modify Student Information</A><BR>
<A HREF="/view.html">View Student Information</A><BR>
<A HREF="/delete.html">Delete Student</A><BR>
<HR>
</BODY>
</HTML>
This is the initial document containing links to the various
forms that allow the user to view, add, modify, and delete information
from the student database.
<HTML>
<HEAD><TITLE>Welcome to CGI Educational Center</TITLE></HEAD>
<BODY>
<H1>Add New Student</H1>
<HR>
<FORM ACTION="/cgi-bin/student.pl?add" METHOD="POST">
<PRE>
Student Name: <INPUT TYPE="text" NAME="Student" SIZE=40>
Year of Graduation: <INPUT TYPE="text" NAME="YOG" SIZE=4 MAXLENGTH=4>
Address (Mailing Information):
<TEXTAREA NAME="Address" ROWS=4 COLS=40></TEXTAREA>
</PRE>
<INPUT TYPE="submit" VALUE="Add New Student">
<INPUT TYPE="reset" VALUE="Clear the Information">
</FORM>
<HR>
</BODY></HTML>
This is the form used to add information into the database.
When the user submits this form, a query of "add" is sent to the
CGI program.
<HTML>
<HEAD><TITLE>Welcome to CGI Educational Center</TITLE></HEAD>
<BODY>
<H1>Modify Student Information</H1>
<HR>
<FORM ACTION="/cgi-bin/student.pl?modify_form" METHOD="POST">
Student Name: <INPUT TYPE="text" NAME="Student" SIZE=40>
<P>
<INPUT TYPE="submit" VALUE="Modify Student Information">
<INPUT TYPE="reset" VALUE="Clear the Information">
</FORM>
<HR>
</BODY>
</HTML>
This form allows the user to modify information for a particular
student. When this form is submitted, the program builds and displays
another form dynamically. Here is the form used to view the results
of a specified query.
<HTML>
<HEAD><TITLE>Welcome to CGI Educational Center</TITLE></HEAD>
<BODY>
<H1>View Student Information</H1>
<HR>
<FORM ACTION="/cgi-bin/student.pl?view" METHOD="POST">
Student Name: <INPUT TYPE="text" NAME="Student" SIZE=40>
<P>
Year of Graduation:
<INPUT TYPE="radio" NAME="Sign" VALUE="greater"> Greater Than
<INPUT TYPE="radio" NAME="Sign" VALUE="equal" CHECKED> Equal To
<INPUT TYPE="radio" NAME="Sign" VALUE="less"> Less Than
<INPUT TYPE="text" NAME="YOG" SIZE=4 MAXLENGTH=4>
<P>
Address Information: <INPUT TYPE="text" NAME="Address" SIZE=40>
<P>
<INPUT TYPE="submit" VALUE="View Student Information">
<INPUT TYPE="reset" VALUE="Clear the Information">
</FORM>
<HR>
</BODY>
</HTML>
This form is used to view records that match certain criteria.
The user can select records based on a conditional year of graduation
(either greater than, less than, or equal to a certain year). We
could have just as easily allowed mathematical operators (>, <,
and =) to be entered, but this can be a potential security hole,
as some of them have a special meaning to the shell (i.e., shell
metacharacters). It is far better and safer to use strings like
"equal", "greater", and "less", and let the CGI program convert
them to the appropriate operators when creating a query.
<HTML>
<HEAD><TITLE>Welcome to CGI Educational Center</TITLE></HEAD>
<BODY>
<H1>Delete Student</H1>
<HR>
<FORM ACTION="/cgi-bin/student.pl?delete" METHOD="POST">
<PRE>
Student Name: <INPUT TYPE="text" NAME="Student" SIZE=40>
Year of Graduation: <INPUT TYPE="text" NAME="YOG" SIZE=4 MAXLENGTH=4>
</PRE>
<INPUT TYPE="submit" VALUE="Delete Student">
<INPUT TYPE="reset" VALUE="Clear the Information">
</FORM>
<HR>
</BODY>
</HTML>
A user can use this form to delete information from the database.
In this case, only the student name and year of graduation fields
are presented. Records for an entire class can be deleted by specifying
the year of graduation, and leaving the name
field empty. You should not normally allow such a dangerous option!
However, it is shown here to illustrate the power of databases and
the Web.
Now, let's look at the CGI program that works with these
forms.
#!/usr/local/bin/perl5
use Sprite;
$query_string = $ENV{'QUERY_STRING'};
$script = $ENV{'SCRIPT_NAME'};
$request_method = $ENV{'REQUEST_METHOD'};
$webmaster = "shishir\@bu\.edu";
$database = "/home/shishir/student.db";
$main_form = "/student.html";
$commands = '(add|modify_form|modify|view|delete)';
The Sprite module
is loaded. The full path to the student database and the relative
path to the main HTML document (the one that
contains links to the other forms) are stored in the database,
and main_form variables, respectively. Finally,
commands contains a list of the valid queries
that forms can pass to this program. If you look carefully at the
list, you will notice that none of the forms listed above passes
the "modify" query. The form that passes this is dynamically created
by this program, as you will later see.
$delimiter = "::";
$error = "CGI Student Database Error";
Fields in the student database are delimited by the "::" characters.
if ($query_string =~ /^\b$commands\b$/) {
&parse_form_data (*DB);
If the query is valid, the POST form data is decoded and placed
in the DB associative array. (As always, the
parse_form_data subroutine used in this program
is the one we've been using throughout all our examples.)
&check_all_fields ();
&check_database ();
The check_all_fields subroutine iterates
through the DB associative array to ensure
that there are no shell metacharacters. The check_database
subroutine checks to see if the student database exists. If not,
a new one is created.
$rdb = new Sprite ();
$rdb->set_delimiter ("Read", $delimiter);
$rdb->set_delimiter ("Write", $delimiter);
A new database object is created. The set_delimiter
function sets the delimiter to be used when reading from and writing
to a database.
$command_status = &$query_string ();
This is a construct that you may not have seen before. The
subroutine corresponding to the value stored in query_string
is called. It is equivalent to saying:
if ($query_string eq "add") {
$command_status = &add ();
} elsif ($query_string eq "modify_form") {
$command_status = &modify_form ();
} elsif ($query_string eq "modify") {
$command_status = &modify ();
} elsif ($query_string eq "view") {
$command_status = &view ();
} elsif ($query_string eq "delete") {
$command_status = &delete ();
}
How convenient! Now, let's continue on with the program.
if ($command_status) {
$rdb->close ($database);
print "Location: ", $main_form, "\n\n";
} else {
$rdb->close ();
}
Depending on the status returned from one of the subroutines
above, a server redirect is done with the Location:
header. There is a subtle difference between the two $rdb->close
subroutines. If you specify a database as part of the close
subroutine, the modifications performed on that database are saved.
Otherwise, the changes are discarded.
} else {
&return_error (500, $error,
"Invalid command passed through QUERY_STRING.");
}
exit (0);
If an invalid query was passed to this program, an error is
returned.
The following subroutine checks to see if the database exists.
If it does not, a new database is created, and a header line containing
the field names, delimited by "::", is output.
sub check_database
{
local ($exclusive_lock, $unlock, $header);
$exclusive_lock = 2;
$unlock = 8;
if (! (-e $database) ) {
if ( open (DATABASE, ">" . $database) ) {
flock (DATABASE, $exclusive_lock);
$header = join ($delimiter, "Student", "YOG", "Address");
print DATABASE $header, "\n";
flock (DATABASE, $unlock);
close (DATABASE);
} else {
&return_error (500, $error, "Cannot create new student database.");
}
}
}
The check_all_fields subroutine makes
sure the form elements do not contain shell meta-characters:
sub check_all_fields
{
local ($key);
foreach $key (keys %DB) {
if ($DB{$key} =~ /[`\!;\\\*\\$[amp ][lt ][gt ]]/) {\n [amp ]\|return_error (500, $error,
"Invalid characters in the [$key] field.");
}
}
}
The subroutine iterates through the DB
associative array checking to make sure that none of the elements
contains any dangerous shell metacharacters. If any are found, an
error message is displayed.
sub build_check_condition
{
local ($columns) = @_;
local ($all_fields, $loop, $key, $sign, $sql_condition);
This is a very useful subroutine that dynamically builds a
query. It expects a string in the following format:
From this, the following query is constructed (assuming that
the user entered "Ed Surge" in the student field, and "Elm Street"
in the address field):
(Student = 'Ed Surge') and (Address =~ 'Elm Street')
(You might have noticed that the regular expression is not
the usual format ($string =~ /abc/). You are correct! However, Perl
accepts this format as well.)
@all_fields = split (/,/, $columns);
The all_fields array consists of successive
elements of the field name, followed by the operator that should
be used to search that field. In this example, the array would look
like this:
$all_fields[0] = "Student";
$all_fields[1] = "=";
$all_fields[2] = "Address";
$all_fields[3] = "=~";
Now, let's look at the loop that iterates through this array
to build the query.
for ($loop=0; $loop <= $#all_fields; $loop = $loop + 2) {
$key = $all_fields[$loop];
$sign = $all_fields[$loop + 1];
The key and the sign
variables consist of the field name and the operator, respectively.
if ($DB{$key}) {
$DB{$key} =~ s/([\W])/\\$1/g;
$sql_condition = join (" and ", $sql_condition,
"( $key $sign '$DB{$key}' )", );
}
}
The query is built in nearly the same manner as in the preceding
example, except that the operator can be different for each field.
if ($sql_condition) {
$sql_condition =~ s/^ and //;
return ($sql_condition);
} else {
&return_error (500, $error, "No query was entered.");
}
}
If the user did not enter any information into the fields,
an error message is displayed. Otherwise, the dynamically created
query is returned (to the subroutine that called).
This is a very simple subroutine (if you can call it that)
that returns an error.
sub database_error
{
&return_error (500, $error,
"Sprite database error. Please check the log file.");
}
The only reason this statement was placed in a subroutine
is for convenience. For example, it is much shorter and quicker
to say:
$rdb->update (<<Update_Command) || &database_error ();
than to say:
$rdb->update (<<Update_Command) || &return_error (500, $error,
"Sprite database error. Please check the log file.");
This is especially true if the same error needs to be returned
for various problems.
The check_select_command subroutine is
generally used after an SQL "select" statement. It checks the first
element of the returned data, as well as the number of records returned,
and displays an error if either of these values equal 0. Otherwise,
a status of 1 is returned.
sub check_select_command
{
local ($value, $no_elements) = @_;
if (!$value) {
&database_error ();
} elsif (!$no_elements) {
&return_error (500, $error,
"The record you specified does not exist.");
} else {
return (1);
}
}
The add subroutine inserts a record into
the database.
sub add
{
$DB{'Address'} =~ s/\n/<BR>/g;
$DB{'Address'} =~ s/(['"])/\\$1/g;
$DB{'Student'} =~ s/(['"])/\\$1/g;
$rdb->sql (<<End_of_Insert) || &database_error ();
insert into $database
(Student, YOG, Address)
values
('$DB{'Student'}', '$DB{'YOG'}', '$DB{'Address'}')
End_of_Insert
return (1);
}
All newline characters are converted to "<BR>" and all
single and double quotes are escaped. Remember, all records in a
text-delimited database are delimited by newline characters! This
ensures that the data will be correctly displayed by the browser
when the user decides to view it.
The format for the "insert" SQL statement is the same as described
in the SQL primer earlier. If the record could not be inserted into
the database, an error is returned. Otherwise, a status of 1 is
returned. This instructs the script to save the database and perform
a server redirect to display the main HTML document.
Now for the most complicated action--modifying a row.
sub modify_form
{
local (@info, $modify_status, $no_elements, $status);
$DB{'Student'} =~ s/(['"])/\\$1/g;
@info = $rdb->sql (<<End_of_Select);
select * from $database
where (Student = '$DB{'Student'}')
End_of_Select
$status = shift (@info);
$no_elements = scalar (@info);
$modify_status = &check_select_command ($status, $no_elements);
if ($modify_status) {
&display_modify_form ($info[0]);
}
return (0);
}
This subroutine performs two actions. First, it uses the student's
name, as specified in the modify form (shown with the other forms
at the beginning of this section), to retrieve the record for that
student. The check_select_command subroutine
ensures that data was returned by the database. Second, display_modify_form
is called (with the first record in the array as an argument) to
display a new form that contains all of the information about the
student. The user can then modify the data in the form and submit
it.
A status of 0 is returned by this subroutine. As a result,
the database is not saved (which is what we want, since it was not
modified), and no server redirection is performed.
The display_modify_form subroutine returns
a form for changing a student's record.
sub display_modify_form
{
local ($fields) = @_;
local ($student, $yog, $address);
($student, $yog, $address) = split (/\0/, $fields);
$address =~ s/<BR>/\n/g;
$student = &escape_html ($student);
$yog = &escape_html ($yog);
The record that is passed to this subroutine by modify_form is split on the
"\0" delimiter, and the "<BR>" characters are converted back
to newlines. In addition, we call the escape_html
subroutine to "escape" characters that have a special significance
to the browser, such as the double quote, "<", ">", and "&".
We perform these steps so that the information is displayed properly.
print <<End_of_Modify_Form;
Content-type: text/html
<HTML>
<HEAD><TITLE>CGI Educational Center</TITLE></HEAD>
<BODY>
<H1>Modify Student Information</H1>
<HR>
<B>Student Name: $student</B>
<P>
<FORM ACTION="$script?modify" METHOD="POST">
<INPUT TYPE="hidden" NAME="Student" VALUE="$student">
Year of Graduation:
<INPUT TYPE="text" NAME="YOG" SIZE=4 MAXLENGTH=4 VALUE="$yog">
<P>
Address (Mailing Information):
<TEXTAREA NAME="Address" ROWS=4 COLS=40>
$address
</TEXTAREA>
<P>
<INPUT TYPE="submit" VALUE="Modify Record For: $student">
<INPUT TYPE="reset" VALUE="Clear the Information">
</FORM>
<HR>
</BODY>
</HTML>
End_of_Modify_Form
}
The form containing the information for the specified student
is output. The user can now modify this form. We use $student
twice: once to remind the user which student was chosen, and once
to pass the name back to this CGI program so it modifies the right
row. The form is shown in Figure 9.8.
The escape_html subroutine escapes certain
characters so that they are displayed correctly by the browser.
sub escape_html
{
local ($string) = @_;
local (%html_chars, $html_string);
%html_chars = ('&', '&',
'>', '>',
'<', '<',
'"', '"');
$html_string = join ("", keys %html_chars);
$string =~ s/([$html_string])/$html_chars{$1}/go;
return ($string);
}
The view subroutine shows a student's
current record.
sub view
{
local ( $fields, $query, @students, $view_status, $status, $no_elements);
$fields = 'Student,=,Address,=~';
if ($DB{'YOG'}) {
if ($DB{'Sign'} eq 'greater') {
$DB{'Sign'} = '>';
} elsif ($DB{'Sign'} eq 'less') {
$DB{'Sign'} = '<';
} else {
$DB{'Sign'} = '=';
}
$fields = join (",", $fields, 'YOG', $DB{'Sign'});
}
$query = &build_check_condition ($fields);
If the user entered information into the year of graduation
field, the search operator is determined. This is then appended
to the value stored in the fields
variable. The build_check_condition subroutine
is called to dynamically construct the search query.
@students = $rdb->sql (<<End_of_Display);
select * from $database
where $query
End_of_Display
$status = shift (@students);
$no_elements = scalar (@students);
$view_status = &check_select_command ($status, $no_elements);
The query is passed to the select command.
The information returned by the database is checked for possible
errors. If there are no errors, view_status
contains the value of 1.
if ($view_status) {
&display_results ("View Students", *students);
}
return (0);
}
If the data returned by the database is valid, the display_results
subroutine is called to display the search results. The two arguments
passed to the subroutine are the header for the HTML
document and the reference to the array that contains the results.
sub display_results
{
local ($title, *data) = @_;
local ($student, $yog, $address);
print "Content-type: text/html", "\n";
print "Pragma: no-cache", "\n\n";
print "<HTML>", "\n";
print "<HEAD><TITLE>CGI Educational Center</TITLE></HEAD>";
print "<BODY>", "\n";
print "<H1>", $title, "</H1>";
print "<HR>";
The Content-type: and Pragma:
MIME headers are output. We do not want the browser
to cache the page containing the results. As a result, the displayed
data reflects the true status of the database.
foreach (@data) {
s/([^\w\s\0])/sprintf ("&#%d;", ord ($1))/ge;
($student, $yog, $address) = split ("\0", $_, 3);
$student = "NULL" if (!$student);
$yog = "Unknown graduation date" if (!$yog);
$address = "No address specified" if (!$address);
If any of the fields for a record are null, certain default
values are used, so as not to display empty fields.
$address =~ s/<BR>/<BR>/g;
print "<BR>", "\n";
print "<B>", $student, "</B> ", "($yog)", "<BR>", "\n";
print $address, "<BR>", "\n";
}
print "<HR>", "\n";
print "</BODY></HTML>", "\n";
}
The foreach loop iterates through the
matched records, and displays them.
The delete subroutine removes records
from the database.
sub delete
{
local ($fields, $query);
$fields = 'Student,=,YOG,=';
$query = &build_check_condition ($fields);
$rdb->sql (<<End_of_Delete) || &database_error ();
delete from $database
where $query
End_of_Delete
return (1);
}
Multiple records can be deleted by leaving the student field
empty, but entering a valid year for the YOG
field. If the specified records cannot be deleted, an error message
is displayed.
Existing records are modified with the modify subroutine.
sub modify
{
local (@fields, $key);
@fields = ('YOG', 'Address');
$DB{'Address'} =~ s/\n/<BR>/g;
$DB{'YOG'} =~ s/(['"])/\\$1/g;
$DB{'Student'} =~ s/(['"])/\\$1/g;
$DB{'Address'} =~ s/(['"])/\\$1/g;
foreach $key (@fields) {
$rdb->sql (<<Update_Database) || &database_error ();
update $database
set $key = ('$DB{$key}')
where (Student = '$DB{'Student'}');
Update_Database
}
return (1);
}
The current version of Sprite does not support multiple fields
in a update statement. As a result, a loop
is used to update the record multiple times. If the user entered
"1991" in the year of graduation field, and "Elm Street, 02215"
in the address field, the two update statements
are generated:
update /home/shishir/student.db
set YOG = ('1991')
where (Student = 'Ed Surge')
update /home/shishir/student.db
set Address = ('Elm Street, 02215')
where (Student = 'Ed Surge')
That concludes the section on Sprite.
Now, let's
look at CGI gateways to the two most popular commercial databases:
Oracle and Sybase. Each of these is supported by Perl and Tcl extensions
that make our job much easier by letting us submit SQL queries that
the database recognizes. I will use Oracle and Sybase to illustrate
two different ways to display the results of a query. In this section,
I will query an Oracle database and plot the data using gnuplot
(available from ftp://prep.ai.mit.edu/pub/gnu/gnuplot-3.5.tar.gz).
In the next section, I will use Sybase and display the results in
a table using HTML.
Suppose you have a database consisting of stock trading
history for various companies over an eleven-year span (from 1980
through 1990). A sample table is shown below:
You would like to present this valuable source of information
as crisp graphs or plots to the general public. How would you go
about doing it? The first step is to create a form where the user
can enter a company's identification:
<HTML>
<HEAD><TITLE>Welcome to CGI Stock Service</TITLE></HEAD>
<BODY>
<H1>Stock Quotes</H1>
<HR>
<FORM ACTION="/cgi-bin/stocks.pl" METHOD="GET">
<EM>Please enter the name of the stock that you would like to
get a quote for:</EM>
<P>
<INPUT TYPE="text" NAME="Company_ID" SIZE=10 MAXLENGTH=10>
<P>
<INPUT TYPE="submit" VALUE="Look Up This Stock">
<INPUT TYPE="reset" VALUE="Clear the Information">
</FORM>
<HR>
</BODY>
</HTML>
The second step is to write a CGI program that sends the query
to the database, retrieves the results, and utilizes gnuplot
to graph the information. Here is the CGI program that interacts
with the Oracle database using oraperl:
#!/usr/local/bin/oraperl
require "oraperl.ph";
oraperl
is a derivative of Perl that contains functionality to access and
interact with Oracle databases. As of Perl 5, the DBperl
extensions (a.k.a. DBI) supersede most of the Perl 4.0 database
derivatives (such as oraperl and sybperl).
For information on where to get oraperl, syperl,
and DBperl, see Appendix E, Applications, Modules, Utilities, and Documentation.
$| = 1;
$webmaster = "shishir\@bu\.edu";
$gnuplot = "/usr/local/bin/gnuplot";
$ppmtogif = "/usr/local/bin/pbmplus/ppmtogif";
Buffering is turned off, and the full path to the gnuplot
and ppmtogif commands is defined. (See Chapter 6, Hypermedia Documents for other examples of how these commands are used.)
&parse_form_data (*DB);
($company_id = $DB{'Company_ID'}) =~ s/^\s*(.*)\b\s*$/$1/;
The form information is decoded. In this case, we are dealing
with only one field (Company_ID). The information
stored in this field is equated to the company_id
variable, and the leading and trailing spaces are removed.
if ($company_id =~ /^\w+$/) {
If the field value is an alphanumeric character (A-Z, a-z,
0-9, _), the program continues. Otherwise, an error message is returned.
We want to make sure that only the characters that we need are allowed!
In this case, shell metacharacters are not allowed to pass through.
$process_id = $$;
$output_ppm = join ("", "/tmp/", $process_id, ".ppm");
$data_file = join ("", "/tmp/", $process_id, ".txt");
We need two temporary files in this program. To make sure
that each running instance of the program uses unique temporary
files, we borrow a trick from UNIX shell scripting and put our
process
identification number (PID) into the names. Each time the program
runs, it has to have a unique PID, so we know we will not clobber
our own temporary file. The output_ppm and
data_file variables contain the full file specification
for the temporary files that will be created by this program. The
current process id number ensures unique filenames.
The color number of 1 indicates Red. This is the color of
the plot line.
$system_id = "Miscellaneous";
$username = "shishir";
$password = "fnjop673e2nB";
The Oracle
system identification (SID), the username,
and the password are set. You might wonder if it is safe to hard-code
the database password into this program. The answer to that depends
on how the database is set up. In cases like this, you should create
a generic user, such as "guest," with minimal access rights (read-only),
so that there is no danger to the database.
$lda = &ora_login ($system_id, $username, $password);
The ora_login subroutine is used to log
in to the database. The value returned is the login identifier,
also referred to as the Oracle Login Data Area. This identifier
will be used to execute an SQL command.
$csr = &ora_open ($lda, " select * from Stocks where ID = '$company_id' ");
The ora_open subroutine executes a specified
SQL command. It requires a login identifier, and returns a statement
identifier or an
Oracle Cursor.
This statement identifier is needed to retrieve the actual data
(resulting from the SQL command).
You are not limited to specifying the SQL command on one line;
you can use the block notation:
$csr = &ora_open ($lda, <<End_of_Select);
select * from Stocks
where ID = '$company_id'
End_of_Select
Let's continue with the rest of the program.
if ( open (DATA, ">" . $data_file) ) {
($company_id, $company, @stock_prices) = &ora_fetch ($csr);
The ora_fetch subroutine retrieves the
information returned by the SQL select command.
The first two fields (or columns) are stored in company_id
and company, respectively. The rest of the
columns, however, are stored in the stock_prices
array. This consists of the 11 columns representing 11 years, as
shown in the previous table.
&ora_close ($csr);
&ora_logoff ($lda);
The statement identifier is released with the ora_close
subroutine, and the database is closed.
This block of code is executed only if a database record matched
the user's selection. Otherwise, an error message is returned.
$stocks_start = 1980;
$stocks_end = 1990;
$stocks_duration = $stocks_end - $stocks_start;
for ($loop=0; $loop <= $stocks_duration; $loop++) {
$price = $stock_prices[$loop];
$year = $stocks_start + $loop;
print DATA $year, " ", $price, "\n";
}
close (DATA);
The loop iterates 11 times to create a data file with all
of the year/stock price pairs. For example, here is how the data
file would look like if the user selected "Fah":
1980 37.3
1981 40.4
1982 38.2
.
.
.
When we build our plot, the first column provides data for
the x axis, while the second column provides data for the y axis.
&graph_data ("Stock History for $company", $data_file,
"Year", "Price", $color_number, $output_ppm);
The graph_data subroutine is called to
create a PBM file (which is later converted to GIF). The arguments
to this subroutine are the title of the graph, the data file to
use, the label for the X axis, the label for the Y axis, the line
color, and the output file.
&create_gif ($output_ppm);
The final GIF image is created by the create_gif
subroutine, which expects one argument: the name of the PBM file
created by gnuplot.
} else {
&return_error (500, "Oracle Gateway CGI Error",
"The specified company could not be found.");
}
An error message is displayed if the user selected a non-existent
company name.
} else {
&return_error (500, "Oracle Gateway CGI Error",
"Could not create output file.");
}
If the data file could not be created, an error is returned.
} else {
&return_error (500, "Oracle Gateway CGI Error",
"Invalid characters in company field.");
}
exit (0);
Finally, if the information in the form field contains any
non-alphanumeric characters, an error message is sent.
The graph_data subroutine opens a pipe
to the gnuplot numerical analysis program,
and sends a group of format commands through it. The end result
of this is a pbm graphics file, which is later converted to GIF.
sub graph_data
{
local ($title, $file, $x_label, $y_label, $color, $output) = @_;
open (GNUPLOT, "| $gnuplot");
print GNUPLOT <<gnuplot_Commands_Done;
set term pbm color small
set output "$output"
set title "$title"
set xlabel "$x_label"
set ylabel "$y_label"
set noxzeroaxis
set noyzeroaxis
set border
set nokey
plot "$file" w lines $color
gnuplot_Commands_Done
close (GNUPLOT);
}
The create_gif subroutine uses the ppmtogif
utility to convert the pbm file to GIF, for display on the Web (see
Figure 9.9).
sub create_gif
{
local ($output) = @_;
print "Content-type: image/gif", "\n\n";
system ("$ppmtogif $output 2> /dev/null");
unlink $output_ppm, $data_file;
}
Finally, the temporary files are "unlinked," or
deleted.
In this example,
the form input (from the user) is used to access a Sybase database
to look up information on books. Our interface to Sybase is the
sybperl library, which provides Perl subroutines
for giving Sybase queries in the form it can recognize. The data
returned by Sybase is converted to an HTML 3.0
table format. In other words, the output, when displayed on a browser
that recognizes HTML 3.0, resembles a nice table
with solid three-dimensional lines separating the different fields.
<HTML>
<HEAD><TITLE>Welcome to CGI Publishing Company</TITLE></HEAD>
<BODY>
<H1>Book Search</H1>
<HR>
<FORM ACTION="/cgi-bin/books.pl" METHOD="GET">
<EM>Please enter the name of the book that you would like to look up:</EM>
<P>
<INPUT TYPE="text" NAME="Book" SIZE=40>
<P>
<INPUT TYPE="submit" VALUE="Look Up This Book">
<INPUT TYPE="reset" VALUE="Clear the Information">
</FORM>
<HR>
</BODY>
</HTML>
Above is the form that is used to retrieve the input from
the user.
Let's look at the program:
#!/usr/local/bin/sybperl
require "sybperl.pl";
$user = "shishir";
$password = "mkhBhd9v2sK";
$server = $ENV{'DSQUERY'} || "Books";
The user, password, and server name are set. If the environment
variable DSQUERY is defined, the server is set
to the value of that variable. If not, the server is set to "Books".
The following statement:
$server = $ENV{'DSQUERY'} || "Books";
is a simpler of way of doing the following:
if ($ENV{'DSQUERY'}) {
$server = $ENV{'DSQUERY'};
} else {
$server = "Books";
}
Next, the dblogin subroutine is used
to log in to the Sybase server.
$dbproc = &dblogin ($user, $password, $server);
dblogin returns the identification for
the newly created database process into the dbproc
variable.
@fields = ('Author', 'Book', 'Publisher', 'Year', 'Pages');
$title = "CGI Publishing Company Book Database";
The fields array holds a list of all
the fields in a record. The title variable
contains the title of the HTML 3.0 table.
&parse_form_data (*DB);
($book_name = $DB{'Book'}) =~ s/^\s*(.*)\b\s*$/$1/;
Leading and trailing spaces are removed from the Book
field.
if ($book_name =~ /^[\w\s]+$/) {
Since we are dealing with book names, the user is allowed
to enter only the following characters: A-Z, a-z, 0-9, _, and whitespace.
If any other characters are entered, an error message is returned.
To retrieve data from a Sybase database, you attach to the
database, execute a query, and then loop through the returned data
one row at a time. These standard steps are performed in this CGI
application.
&dbcmd ($dbproc, " select * from Catalog where Book = '$book_name' ");
&dbsqlexec ($dbproc);
$status = &dbresults ($dbproc);
The dbcmd subroutine associates the SQL
command with the current database process (dbproc). The dbsqlexec subroutine executes
the SQL command, while the dbresults make the
data available to the program. The dbresults
subroutine returns either "$SUCCEED" or "$FAIL" (these are variables
that are special to sybperl).
if ($status == $SUCCEED) {
while ( (@books = &dbnextrow ($dbproc)) ) {
$book_string = join ("\0", @books);
push (@all_books, $book_string);
}
If the user-specified records are found, the dbresults
subroutine returns "$SUCCEED". The while
loop iterates through all of the data by calling the dbnextrow
subroutine each time through the loop, in case there is more than
one book that matches the criteria. The books
array consists of information in the following format (for a sample
book);
$books[0] = "Andy Oram and Steve Talbott"
$books[1] = "Managing Projects with make"
$books[2] = "O'Reilly & Associates, Inc."
$books[3] = 1991
$books[4] = 152
We need to create this intermediate array because that is
the structure of the data returned by dbnextrow.
But what we really want is a single string, because then we could
store all the information on a single book in one element of the
@all_books array. So we use the join
statement to form the following string:
$book_string = "Andy Oram and Steve Talbott\0Managing Projects with make\0O'Reilly & Associates, Inc.\01991\0152"
This string is then pushed into the all_books
array. This process is repeated for all matches.
&dbexit ($dbproc);
&display_table ($title, *fields, *all_books, "\0");
The database is closed by calling the dbexit
subroutine. Finally, the table is displayed by calling a generic
subroutine, display_table. The subroutine expects
the following arguments: the title of the table, the array consisting
of the header (or field) names, the array consisting of the strings,
and the delimiter by which these strings are concatenated.
} else {
&return_error (500, "Sybase Database CGI Error",
"The book title(s) you specified does not exist.");
}
} else {
&return_error (500, "Sybase Database CGI Error",
"Invalid characters in book name.");
}
exit(0);
Error messages are returned if either the specified book name
does not exist, or the input contains invalid characters.
The display_table subroutine prints out
the table.
sub display_table
{
local ($title, *columns, *selected_entries, $delimiter) = @_;
local ($name, $entry);
print "Content-type: text/html", "\n\n";
print "<HTML>", "\n";
print "<HEAD><TITLE>", $title, "</TITLE></HEAD>", "\n";
print "<BODY>", "\n";
print "<TABLE BORDER=2>", "\n";
print "<CAPTION>", $title, "</CAPTION>", "\n";
print "<TR>", "\n";
A MIME type of text/html
is output, along with some HTML 3.0 tags to create
a table.
foreach $name (@columns) {
print "<TH>", $name, "\n";
}
This loop iterates through and displays all of the field headers.
foreach $entry (@selected_entries) {
$entry =~ s/$delimiter/<TD>/go;
print "<TR>", "<TD>", $entry, "\n";
}
print "</TABLE>", "\n";
print "</BODY></HTML>", "\n";
}
The foreach loop iterates through the
matching records, substitutes the delimiter with the
<TD> tag,
and prints out the HTML needed to create a new
row. There is no delimiter before the first item in $entry,
so the print statement supplies the first <TD> tag. Finally,
the table is closed. Figure 9.10 shows what the table looks
like.
|
|