13. Parsing
See the Perl Reference Guide, section 14, "Search and replace functions."
When you include parentheses ( ) in a pattern, the text matched inside each
pair can subsequently be referenced via the variables $1, $2, $3, ...,
numbered in the order of the left parentheses. The captured matches can also
be assigned as sequential values of an array or a list of variables.
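#!/usr/local/bin/perl -w
$s = 'There is 1 date 10/25/95 in here somewhere.';
print "\$s=$s\n";
# Trick 1: match first, then read the captures from $1, $2, $3.
# $` holds the text before the match and $' the text after it.
$s =~ /(\d{1,2})\/(\d{1,2})\/(\d{2,4})/;
print "Trick 1: \$1=$1, \$2=$2, \$3=$3,\n",
" \$\`=",$`," \$\'=",$',"\n";
# Trick 2: match in list context and assign the captures directly.
($mo, $day, $year) =
( $s =~ /(\d{1,2})\/(\d{1,2})\/(\d{2,4})/ );
print "Trick 2: \$mo=$mo, \$day=$day, \$year=$year.\n";
# Trick 3: an outer pair of parentheses captures the whole date first.
($wholedate,$mo, $day, $year) =
( $s =~ /((\d{1,2})\/(\d{1,2})\/(\d{2,4}))/ );
print "Trick 3: \$wholedate=$wholedate, \$mo=$mo, ",
"\$day=$day, \$year=$year.\n";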
Results of above:
$s=There is 1 date 10/25/95 in here somewhere.
Trick 1: $1=10, $2=25, $3=95,
$`=There is 1 date $'= in here somewhere.
Trick 2: $mo=10, $day=25, $year=95.
Trick 3: $wholedate=10/25/95, $mo=10, $day=25, $year=95.
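Note that in Perl 5 a successful match in a list (array) context, as in
Tricks 2 and 3, still sets $1, $2, ..., and $`, $', and $&. What changes is
the return value: the match returns the list of captured substrings, or an
empty list if the match fails.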
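Tricks 2 and 3 assume the pattern matches. As a minimal sketch of how a failed match could be handled in list context, the example below wraps the date pattern in a helper; the name parse_date and the second test string are hypothetical and not part of the original example.

```perl
#!/usr/local/bin/perl -w
# Hypothetical helper (not from the original tutorial): returns
# (month, day, year) if a date is found, or an empty list if not.
sub parse_date {
    my ($text) = @_;
    # In list context the match returns its captures,
    # or an empty list when the pattern does not match.
    my @parts = ( $text =~ /(\d{1,2})\/(\d{1,2})\/(\d{2,4})/ );
    return @parts;
}

foreach my $s ('There is 1 date 10/25/95 in here somewhere.',
               'No date in here.') {
    # A list assignment evaluated as a condition yields the number of
    # values on the right-hand side, so it is false for an empty list.
    if ( my ($mo, $day, $year) = parse_date($s) ) {
        print "Found: \$mo=$mo, \$day=$day, \$year=$year.\n";
    } else {
        print "No date found in: $s\n";
    }
}
```

Testing the assignment itself avoids printing undefined values, which would produce warnings under -w.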
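Regular expressions are greedy: quantifiers such as * and + match as much
text as they can while still allowing the whole pattern to match. In the
following example we try to match whatever is between "<" and ">":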
#!/usr/local/bin/perl -w
$s = 'Beware of <STRONG>greedy</strong> regular expressions.';
print "\$s=$s\n";
($m) = ( $s =~ /<(.*)>/ );
print "Try 1: \$m=$m\n";
($m) = ( $s =~ /<([^>]*)>/ );
print "Try 2: \$m=$m\n";
This results in:
$s=Beware of <STRONG>greedy</strong> regular expressions.
Try 1: $m=STRONG>greedy</strong
Try 2: $m=STRONG
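Try 2 avoids the problem by excluding ">" from the characters that may be matched. Perl 5 also provides non-greedy quantifiers, written by adding "?" after the quantifier. The sketch below is an addition to the original example ("Try 3" and the @tags array are names introduced here for illustration):

```perl
#!/usr/local/bin/perl -w
my $s = 'Beware of <STRONG>greedy</strong> regular expressions.';

# Try 3: .*? matches as little as possible, so it stops at the first ">".
my ($m) = ( $s =~ /<(.*?)>/ );
print "Try 3: \$m=$m\n";

# With the /g modifier in list context, the match returns the
# captured text from every occurrence of the pattern.
my @tags = ( $s =~ /<(.*?)>/g );
print "All tags: @tags\n";
```

This should print "Try 3: $m=STRONG" followed by "All tags: STRONG /strong".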
Homework: Parsing and Reporting
1. See the preceding "Grade Book" example. Using the same "stufile" input, print
a list of students ordered by family name, with any quoted nickname listed in
place of the given name, and family name last. Produce output like this:
Student-ID  Year  Name
357913      JR    Thomas Jefferson
246802      SO    Abe Lincoln
212121      SO    Teddy Roosevelt
123456      SR    George Washington