CGI Programming Guide - [Chapter 2] 2.5 Other Languages Under UNIX

2.5 Other Languages Under UNIX

You now know the basics of how to handle and manipulate the CGI input in Perl. If you haven't guessed by now, this book concentrates primarily on examples in Perl, since Perl is relatively easy to follow, runs on all three major platforms, and also happens to be the most popular language for CGI. However, CGI programs can be written in many other languages, so before we continue, let's see how we can accomplish similar things in some other languages, such as C/C++, the C Shell, and Tcl.

C/C++

Here is a CGI program written in C (but that will also compile under C++) that parses the HTTP_USER_AGENT environment variable and outputs a message, depending on the type of browser:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main (void)
{
    char *http_user_agent;
    printf ("Content-type: text/plain\n\n");
    http_user_agent = getenv ("HTTP_USER_AGENT");
    if (http_user_agent == NULL) {
        printf ("Oops! Your browser failed to set the HTTP_USER_AGENT ");
        printf ("environment variable!\n");
    } else if (!strncmp (http_user_agent, "Mosaic", 6)) {
        printf ("I guess you are sticking with the original, huh?\n");
    } else if (!strncmp (http_user_agent, "Mozilla", 7)) {
        printf ("Well, you are not alone. A majority of the people are ");
        printf ("using Netscape Navigator!\n");
    } else if (!strncmp (http_user_agent, "Lynx", 4)) {
        printf ("Lynx is great, but go get yourself a graphic browser!\n");
    } else {
        printf ("I see you are using the %s browser.\n", http_user_agent);
        printf ("I don't think it's as famous as Netscape, Mosaic or Lynx!\n");
    }
    exit (0);
}

The getenv function returns the value of the environment variable, or NULL if the variable is not set, and we store the result in the http_user_agent variable (it's actually a pointer to a string, but don't worry about this terminology). Then, we compare this value against some of the common browser names with the strncmp function. This function compares only the first n characters of the two strings, where n is its third argument, so it tests whether http_user_agent begins with the specified name.

You might wonder why we compare only the beginning of the string. The reason is that the value of the HTTP_USER_AGENT environment variable generally looks something like this:

Lynx/2.4 libwww/2.14

In this case, we need to compare only the first four characters against the string "Lynx" in order to determine that the browser being used is Lynx. If they match, the strncmp function returns a value of zero, and we display the appropriate message.
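Hard-coding the lengths 6, 7, and 4 is easy to get wrong when a browser name is added or changed. As a minimal sketch, assuming a hypothetical helper named starts_with (it is not part of the program above), the length can be derived from the prefix itself:

```c
#include <string.h>

/* Hypothetical helper: returns 1 if str begins with prefix, 0 otherwise.
   A NULL str (for example, an unset environment variable) never matches. */
int starts_with (const char *str, const char *prefix)
{
    if (str == NULL || prefix == NULL) {
        return 0;
    }
    /* strncmp compares at most strlen (prefix) characters and
       returns zero when all of them are equal. */
    return strncmp (str, prefix, strlen (prefix)) == 0;
}
```

With this helper, a test such as starts_with (http_user_agent, "Lynx") could stand in for !strncmp (http_user_agent, "Lynx", 4), and it is true for a value like Lynx/2.4 libwww/2.14.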

C Shell

The C Shell has some serious limitations and therefore is not recommended for any type of CGI application. In fact, UNIX guru Tom Christiansen has written a FAQ titled "Csh Programming Considered Harmful" detailing the C Shell's problems. Here is a small excerpt from the document:

The csh is seductive because the conditionals are more C-like, so the path of least resistance is chosen and a csh script is written. Sadly, this is a lost cause, and the programmer seldom even realizes it, even when they find that many simple things they wish to do range from cumbersome to impossible in the csh.

However, for the sake of completeness, here is a simple shell script that is equivalent to the first unix.pl Perl program discussed earlier:

#!/bin/csh
echo "Content-type: text/plain"
echo ""
if ($?QUERY_STRING) then
    set command = `echo "$QUERY_STRING" | awk 'BEGIN {FS = "="} { print $2 }'`
    if ("$command" == "fortune") then
        /usr/local/bin/fortune
    else if ("$command" == "finger") then
        /usr/ucb/finger
    else 
        /usr/local/bin/date
    endif
else
    /usr/local/bin/date
endif

The C Shell has no built-in functions or operators for general string manipulation, so we have no choice but to use another UNIX utility, such as awk, to split the query string and return the data on the right side of the equal sign. Depending on the input from the user, one of several UNIX utilities is called to output some information.

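You may notice that the QUERY_STRING variable is exposed to the shell. Generally, this is very dangerous because users can embed shell metacharacters in it. In this case, however, the C Shell performs variable substitution only after it has parsed the command inside the backquotes (``) into separate commands, so characters such as ; and | in the value are not treated as command separators. If things happened in the reverse order, we could potentially have a major headache! Even so, it is safer to enclose the variable in double quotes, which keeps the shell from performing further substitutions, such as wildcard expansion, on its value.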
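One way to sidestep the metacharacter problem altogether is to never hand the query string to a shell. The following C fragment is a minimal sketch of that idea, not the approach used in the script above. The helper names query_value and program_for are hypothetical, and the program paths are simply the ones from the C Shell script:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the text between the first '=' and the
   next '=' (or the end of the string) into out, mirroring awk's $2
   with FS = "=". Returns 1 on success, 0 if there is no '=' or the
   value does not fit in out. */
int query_value (const char *query, char *out, size_t out_size)
{
    const char *start;
    size_t length;

    if (query == NULL || out == NULL || out_size == 0) {
        return 0;
    }
    start = strchr (query, '=');
    if (start == NULL) {
        return 0;
    }
    start++;                            /* skip the '=' itself */
    length = strcspn (start, "=");      /* characters before the next '=' */
    if (length >= out_size) {
        return 0;                       /* refuse to truncate */
    }
    memcpy (out, start, length);
    out[length] = '\0';
    return 1;
}

/* Hypothetical helper: maps a value to a fixed program path. Anything
   unrecognized falls back to date, so user input is only ever compared,
   never placed on a command line. */
const char *program_for (const char *value)
{
    if (value != NULL && strcmp (value, "fortune") == 0) {
        return "/usr/local/bin/fortune";
    }
    if (value != NULL && strcmp (value, "finger") == 0) {
        return "/usr/ucb/finger";
    }
    return "/usr/local/bin/date";
}
```

A caller could pass the result of getenv ("QUERY_STRING") to query_value and then run the path chosen by program_for directly, for example with execl, so that no shell is involved at any point.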

Tcl

The following Tcl program uses an environment variable that we haven't discussed yet. The HTTP_ACCEPT variable contains a list of all of the MIME content types that a browser can accept and handle. A typical value of this variable might look like this:

application/postscript, image/gif, image/jpeg, text/plain, text/html

You can use this information to return different types of data from your CGI program to the client. The program below parses this accept list and outputs each MIME type on a different line:

#!/usr/local/bin/tclsh
puts "Content-type: text/plain\n"
set http_accept $env(HTTP_ACCEPT)
set browser $env(HTTP_USER_AGENT)
puts "Here is a list of the MIME types that the client, which"
puts "happens to be $browser, can accept:\n"
set mime_types [split $http_accept ,]
foreach type $mime_types {
    puts "- [string trim $type]"
}
exit 0

As in Perl, the split command splits a string on a specified delimiter and returns the resulting substrings as a list. In this case, the mime_types list holds each MIME type from the accept list. Once that's done, the foreach loop iterates through the list, displaying each element.
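For comparison, C has no built-in equivalent of split, so the same job takes a few more lines. The following is a minimal sketch, assuming a hypothetical helper named split_accept. It splits on commas and also trims the space that follows each comma in a typical accept list:

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits list on commas in place, trims
   surrounding whitespace from each item, and stores pointers to the
   items in types. Returns the number of items stored (at most
   max_types). The list buffer is modified, so pass a writable copy
   rather than the pointer returned by getenv. */
size_t split_accept (char *list, char *types[], size_t max_types)
{
    size_t count = 0;
    char *item = list;

    while (item != NULL && count < max_types) {
        char *comma = strchr (item, ',');
        char *end;

        if (comma != NULL) {
            *comma = '\0';              /* terminate the current item */
        }
        while (isspace ((unsigned char) *item)) {
            item++;                     /* trim leading whitespace */
        }
        end = item + strlen (item);
        while (end > item && isspace ((unsigned char) end[-1])) {
            *--end = '\0';              /* trim trailing whitespace */
        }
        if (*item != '\0') {
            types[count++] = item;      /* skip empty items */
        }
        item = (comma != NULL) ? comma + 1 : NULL;
    }
    return count;
}
```

A caller would copy the value of HTTP_ACCEPT into a local buffer, pass that buffer to split_accept, and then loop over the returned pointers to print each MIME type, much as the foreach loop does in the Tcl version.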