Unix Programming - Language Evaluations - Perl
Perl is shell on steroids. It was specifically designed to
replace awk(1), and was later expanded to replace shell as the
‘glue’ for mixed-language script programming. It was first
released in 1987.
Perl's strongest point is its extremely powerful built-in
facilities for pattern-directed processing of textual, line-oriented
data formats; it is unsurpassed at this. It also includes far stronger
data structures than shell, including dynamic arrays of mixed element
types and a ‘hash’ or ‘dictionary’ type that
supports convenient and fast lookup of name-value pairs.
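As a rough illustration of the pattern-directed style and the hash type described above, here is a minimal sketch. The sample records, the `parse_settings` name, and the name=value format are assumptions made up for this example, not anything taken from a real tool.

```perl
use strict;
use warnings;

# Hypothetical sample input: name=value records with comments and blank lines.
my @lines = (
    "# mail settings",
    "host = mail.example.com",
    "",
    "port=25",
    "user = postmaster   ",
);

# Parse the records into a hash, using one regular expression per line.
sub parse_settings {
    my @input = @_;
    my %settings;
    for my $line (@input) {
        next if $line =~ /^\s*(?:#|$)/;              # skip comments and blank lines
        if ($line =~ /^\s*(\w+)\s*=\s*(.*?)\s*$/) {
            $settings{$1} = $2;                       # capture groups become key and value
        }
    }
    return %settings;
}

my %settings = parse_settings(@lines);

# A dynamic array may mix element types: a string, a number, an array reference.
my @mixed = ("text", 42, [1, 2, 3]);

print "$_ => $settings{$_}\n" for sort keys %settings;
print "mixed array holds ", scalar(@mixed), " elements\n";
```

The regular expression does the tokenizing, whitespace trimming, and field extraction in one step, which is the kind of job that takes noticeably more code in shell.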
Additionally, Perl includes a rather complete and
well-thought-out internal binding of virtually the entire Unix API,
drastically reducing the need for C and making it suitable for jobs like
simple TCP/IP clients
and even servers. Another strong advantage of Perl is that a large and
vigorous open-source community has grown up around it. Its home on
the net is the Comprehensive Perl
Archive Network. Dedicated Perl hackers have written hundreds
of freely reusable Perl modules for many different programming
tasks. These range from structure-walking of directory
trees, through X toolkits for GUI building, to excellent canned
facilities for supporting HTTP robots and CGI programming.
Perl's main drawback is that parts of it are irredeemably ugly and
complicated, and must be used with caution and in stereotyped ways
lest they bite (its argument-passing conventions for functions are a
good example of all three problems). It is harder to get started in
Perl than it is in shell. Though small programs in Perl can
be extremely powerful, careful discipline is required to maintain
modularity and keep a design under control as program size
increases. Because some limiting design decisions early in Perl's
history could not be reversed, many of the more advanced features have
a fragile, klugey feel about them.
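One way the argument-passing conventions can bite is sketched below. All arguments arrive flattened into the single list `@_`, so two arrays passed together lose their boundary; the stereotyped workaround is to pass references. The function names `count_naive` and `count_by_ref` are invented for this illustration.

```perl
use strict;
use warnings;

# Naive version: both arrays are flattened into one list in @_,
# so @first swallows every argument and @second is left empty.
sub count_naive {
    my (@first, @second) = @_;
    return (scalar(@first), scalar(@second));
}

# Conventional version: pass array references and dereference them inside.
sub count_by_ref {
    my ($first, $second) = @_;
    return (scalar(@$first), scalar(@$second));
}

my @a = (1, 2, 3);
my @b = (4, 5);

my @naive = count_naive(@a, @b);       # boundary between @a and @b is lost
my @byref = count_by_ref(\@a, \@b);    # each array keeps its identity

print "naive:  @naive\n";    # prints "naive:  5 0"
print "by ref: @byref\n";    # prints "by ref: 3 2"
```

Nothing in the naive version is a syntax error, and it runs without warnings, which is why the reference-passing habit has to be learned rather than discovered from diagnostics.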
The definitive reference on Perl is Programming
Perl [Wall2000]. This book has nearly
everything you will ever need to know in it, but is notoriously badly
organized; you will have to dig to find what you want. A more
introductory and narrative treatment is available in
Learning Perl [Schwartz-Christiansen].
Perl is universal on Unix systems. Perl scripts at the same
major release level tend to be readily portable between Unixes
(provided they don't use extension modules). Perl implementations are
available (and even well documented) for the Microsoft family of
operating systems and on MacOS as well. Perl/Tk provides
cross-platform GUI capability.
Summing up: Perl's best side is as a power tool for small glue
scripts involving a lot of regular-expression grinding. Its worst
side is that it is ugly, spiky, and nigh-unmaintainable in large
volumes.
The blq script is a tool
for querying block lists (lists of Internet sites that have been
identified as habitual sources of unsolicited bulk email, aka spam).
You can find current sources at the blq project page.
blq is a good example of
a small Perl script, illustrating both the strengths and weaknesses of
the language. It makes intensive use of regular-expression matching.
On the other hand, the Net::DNS Perl extension module it uses has to
be conditionally included, because it is not guaranteed to be present
in any given Perl installation.
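A common way to include a module conditionally is to load it at run time inside `eval`, since a plain `use` statement aborts compilation when the module is missing. The sketch below shows that general technique; it is not taken from blq's source, and the `have_module` helper is a hypothetical name.

```perl
use strict;
use warnings;

# Try to load a module at run time; return 1 on success, 0 on failure.
# require dies if the file cannot be found, and eval traps that failure.
sub have_module {
    my ($name) = @_;
    (my $file = "$name.pm") =~ s{::}{/}g;    # Net::DNS becomes Net/DNS.pm
    return eval { require $file; 1 } ? 1 : 0;
}

if (have_module('Net::DNS')) {
    print "Net::DNS is available; DNS queries can use it\n";
} else {
    print "Net::DNS is missing; falling back to another method\n";
}
```

Either branch may run, depending on whether Net::DNS is installed on the machine in question.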
blq is exceptionally
clean and disciplined as Perl code goes, and I recommend it as an
example of good style (the other Perl tools referenced from the
blq project page are good examples as
well). But parts of the code are unreadable unless you are familiar
with very specific Perl idioms — the very first line of code,
$0 =~ s!.*/!!;, is an example. While all languages
have some of this kind of opacity, Perl has it worse than most.
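To unpack that idiom: `$0` holds the name the script was invoked as, `s!...!...!` is an ordinary substitution using `!` as the delimiter so that the slash needs no escaping, and the greedy pattern `.*/` matches everything up to and including the last slash. The effect is to strip the directory part of the program name. The sketch below applies the same substitution to ordinary strings; `strip_dirs` is a hypothetical helper written for this illustration.

```perl
use strict;
use warnings;

# Same substitution as the idiom, applied to an ordinary string
# so that the result is easy to inspect.
sub strip_dirs {
    my ($path) = @_;
    $path =~ s!.*/!!;    # greedy .*/ consumes everything through the last slash
    return $path;
}

print strip_dirs('/usr/local/bin/blq'), "\n";    # prints "blq"
print strip_dirs('blq'), "\n";                   # prints "blq" (no slash, unchanged)

# The idiom itself: reduce the invocation name to its final component.
$0 =~ s!.*/!!;
print "running as $0\n";
```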
Tcl and Python are both good for small scripts of this type, but
both lack the Perl convenience features for regular-expression
matching that blq uses heavily; an implementation in either would
have been reasonable, but probably less compact and expressive. An
Emacs Lisp implementation would have been even faster to write and
more compact than the Perl one, but probably painfully slow to use.