my $pattern = '^\d+$'; # likely to be input from an HTML form field
foreach (@list) {
    print if /$pattern/o;
}
This is usually a big win in loops over lists, or when using the
grep( ) or map( ) operators.
In long-lived mod_perl scripts and handlers, however, the variable
may change with each invocation. In that case, this caching of the
compiled pattern can pose a problem. The first request processed by a
fresh mod_perl child process will compile the regex and perform the
search correctly. However, all subsequent requests running the same
code in the same process will use the cached pattern and not the fresh
one supplied by users. The code will appear to be broken.
Imagine that you run a search engine service, and one person enters a
search keyword of her choice and finds what she's
looking for. Then another person who happens to be served by the same
process searches for a different keyword, but unexpectedly receives
the same search results as the previous person.
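The failure can be reproduced outside mod_perl with a minimal sketch. Here a hypothetical find_matches() subroutine (not part of the original example) stands in for a handler that is called once per request within the same process:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a handler that runs once per request.
sub find_matches {
    my ($pattern, @list) = @_;
    # /o interpolates and compiles $pattern only the first time
    # this match operator is executed in the process.
    return grep { /$pattern/o } @list;
}

my @list   = qw(42 foo 7 bar);
my @first  = find_matches('^\d+$',    @list);  # first "request"
my @second = find_matches('^[a-z]+$', @list);  # second "request"

print "first:  @first\n";   # first:  42 7
print "second: @second\n";  # second: 42 7 (stale pattern reused)
```

The second call asks for lowercase words but still gets the numbers, because the match operator kept the pattern compiled during the first call.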
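There are two solutions to this problem.
The first is to use the eval q{} construct to force the code to be
recompiled each time it runs. The eval must cover the entire
processing loop, not just the pattern match itself. The original
code fragment would be rewritten as: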
my $pattern = '^\d+$';
eval q{
    foreach (@list) {
        print if /$pattern/o;
    }
};
If we were to write this:
foreach (@list) {
    eval q{ print if /$pattern/o; };
}
the regex would be compiled for every element in the list, instead of
just once for the entire loop over the list (and the
/o modifier would essentially be useless).
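As a rough sketch of how this might look inside code that runs many times per process, the eval q{} can wrap the whole loop and any compilation error can be reported through $@. The find_matches() name and the @hits accumulator are illustrative assumptions, not part of the original fragment:

```perl
use strict;
use warnings;

# Hypothetical helper: the string eval is recompiled on every call,
# so /o compiles the current $pattern once per call, not once per process.
sub find_matches {
    my ($pattern, @list) = @_;
    my @hits;
    eval q{
        foreach my $item (@list) {
            push @hits, $item if $item =~ /$pattern/o;
        }
        1;
    } or die "eval failed: $@";
    return @hits;
}

my @list   = qw(42 foo 7 bar);
my @first  = find_matches('^\d+$',    @list);
my @second = find_matches('^[a-z]+$', @list);

print "first:  @first\n";   # first:  42 7
print "second: @second\n";  # second: foo bar
```

The trailing 1; makes the eval return a true value on success, so the or die branch is taken only when the evaluated code fails to compile or dies at run time.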
However, watch out for strings coming from an untrusted origin
inside eval: they might contain Perl code dangerous to your system,
so make sure to sanity-check them first.
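What counts as a sufficient sanity check depends on the application. The following is only one possible sketch, under the assumption that user patterns should be short, must compile, and must not contain embedded-code constructs. The pattern_is_acceptable() name and the 100-character limit are hypothetical choices:

```perl
use strict;
use warnings;

# Hypothetical check for a user-supplied pattern; returns 1 or 0.
sub pattern_is_acceptable {
    my ($pattern) = @_;
    return 0 unless defined $pattern && length $pattern;
    return 0 if length($pattern) > 100;     # arbitrary limit (assumption)
    return 0 if $pattern =~ /\(\?\??\{/;    # reject (?{...}) and (??{...})
    # Try to compile the pattern; a syntax error dies inside the block.
    my $compiled = eval { qr/$pattern/ };
    return defined $compiled ? 1 : 0;
}

for my $candidate ('^\d+$', '(', '(?{ print "boo" })') {
    printf "%-22s %s\n", $candidate,
        pattern_is_acceptable($candidate) ? "accepted" : "rejected";
}
```

A check like this only narrows what reaches the regex engine. It is not a substitute for running under taint mode or for keeping user input out of evaluated strings altogether.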
This approach can be used if there is more than one pattern-match
operator in a given section of code. If the section contains only one
regex operator (be it m// or s///), you can rely on the property of
the null pattern, which reuses the last successfully matched pattern.
This leads to the second solution, which also eliminates the use of
eval.
The above code fragment becomes:
my $pattern = '^\d+$';
"0" =~ /$pattern/; # dummy match that must not fail!
foreach (@list) {
    print if //;
}
The only caveat is that the dummy match that boots the regular
expression engine must succeed; otherwise the pattern will not be
cached, and the // will match everything. If you can't count on fixed
text to ensure the match succeeds, you have two options.
If you can guarantee that the pattern variable contains no
metacharacters (such as *, +,
^, $, \d,
etc.), you can use the dummy match of the pattern itself:
$pattern =~ /\Q$pattern\E/; # guaranteed if no metacharacters present
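The \Q modifier ensures that any special regex characters will be
escaped, so the dummy match always succeeds. The cached pattern then
matches the text of $pattern literally, which is why this option is
appropriate only when the pattern contains no metacharacters.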
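A minimal sketch of this option, assuming a plain keyword search rather than a user-supplied regex, might look as follows. The find_keyword() subroutine and the sample data are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical keyword search built on the null pattern.
sub find_keyword {
    my ($keyword, @list) = @_;
    # An empty keyword would make the dummy match itself a null pattern.
    return unless defined $keyword && length $keyword;
    # Dummy match: the quoted keyword always matches itself, so it
    # becomes the last successfully matched pattern.
    $keyword =~ /\Q$keyword\E/ or die "dummy match unexpectedly failed";
    my @hits;
    foreach my $item (@list) {
        push @hits, $item if $item =~ //;   # reuses the quoted keyword
    }
    return @hits;
}

my @docs   = ('perl and apache', 'c++ tips', 'apache modules');
my @first  = find_keyword('apache', @docs);
my @second = find_keyword('c++',    @docs);

print join(' | ', @first),  "\n";   # perl and apache | apache modules
print join(' | ', @second), "\n";   # c++ tips
```

Because of \Q, the + characters in the second keyword are matched literally instead of being treated as quantifiers.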
If there is a possibility that the pattern contains metacharacters,
the dummy match should try the pattern itself and fall back to the
nonsearchable \377 character, so that it succeeds either way:
"\377" =~ /$pattern|^\377$/; # guaranteed if metacharacters present