Appendix B
One of the most powerful features of Perl is its regular expression
handling. Regular expressions are especially useful for CGI
programming, as text manipulation is central to so many CGI
applications. In this appendix, we include a quick reference to
regular expressions in Perl. For more information on Perl, see the
Nutshell Handbooks Learning Perl by Randal
L. Schwartz, Programming Perl by Larry Wall and
Randal L. Schwartz, and Perl 5 Desktop
Reference by Johan Vromans, all published by O'Reilly
& Associates, Inc.
- /abc/
-
Matches abc anywhere within the string
- /^abc/
-
Matches abc at the beginning of the string
- /abc$/
-
Matches abc at the end of the string
- /a|b/
-
Matches either a or b.
Can also be used with words (e.g., /perl|tcl/).
- /ab{2,4}c/
-
Matches an a followed by two to four b's, followed by c.
If the second number is omitted, as in /ab{2,}c/, the expression
matches two or more b's.
- /ab*c/
-
Matches an a followed by zero or more b's, followed by c.
The match is greedy: it takes as many b's as possible. Same as
/ab{0,}c/.
- /ab+c/
-
Matches an a followed by one or more b's, followed by c.
Same as /ab{1,}c/.
- /ab?c/
-
Matches an a followed by an optional b, followed by c.
Same as /ab{0,1}c/.
In Perl 5, a ? placed directly after another quantifier has a different
meaning: it makes that quantifier non-greedy. For example, /ab*?c/
matches an a followed by as few b's as possible, followed by c.
- /./
-
Matches any single character except a newline (\n).
/p..l/ matches a p followed by any two characters,
followed by l, so it will match such strings as perl, pall, pdgl,
p3gl, etc.
- /[abc]/
-
A character class--matches any one of the three characters listed.
A pattern of /[abc]+/ matches strings such as
abcab, acbc, abbac, aaa, abcacbac, ccc, etc.
- /\d/
-
Matches a digit. Same as /[0-9]/.
Multipliers can be used (/\d+/ matches one or more digits).
- /\w/
-
Matches a word character (a letter, digit, or underscore).
Same as /[a-zA-Z0-9_]/
- /\s/
-
Matches a character classified as whitespace.
Same as /[ \r\t\n\f]/
- /\b/
-
Matches a word boundary. /test\b/ matches test, but not testing.
Inside a character class (that is, [\b]), \b matches a backspace
character instead.
- /[^abc]/
-
Matches a character that is not in the class.
/[^abc]+/ will match such strings as hello, test, perl, etc.
- /\D/
-
Matches a character that is not a digit.
Same as /[^0-9]/
- /\W/
-
Matches a character that is not a word character.
Same as /[^a-zA-Z0-9_]/
- /\S/
-
Matches a character that is not whitespace.
Same as /[^ \r\t\n\f]/
- /\B/
-
Requires that there is no word boundary at that position.
/hello\B/ matches the hello in hellothere, but not the hello in
hello there (nor hello at the very end of a string).
- /\*/
-
Matches the * character. Use the \ character
to escape characters that have significance in a regular expression.
- /(abc)/
-
Matches abc anywhere within the string, but the parentheses act as memory, storing abc in the variable $1.
- Example 1:
-
/name=(.*)/ will store zero or
more characters after name= in variable $1.
- Example 2:
-
/name=(.*)&user=\1/ will
store zero or more characters after name= in
$1. The \1 is a backreference: the pattern matches
only if the text after &user= is the same text
that was stored in $1.
- Example 3:
-
/name=([^&]*)/ will store
zero or more characters after name= but before
the & character in variable $1.
- Example 4:
-
/name=([^&]+)&age=(.*)$/
will store one or more characters after name=
but before & in $1. It
then matches the & character. All characters
after age= but before the end of the line are
stored in $2.
- /abc/i
-
Ignores case. Matches abc, Abc, ABC, aBc, aBC, etc.
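
The capturing examples above, and the difference between greedy and non-greedy matching, can be tried with a short script. This is only a sketch: the query string and the variable names are invented for illustration, and it relies on the Perl 5 behavior that a match in list context returns the captured substrings.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical query string, made up for this illustration.
my $query = 'name=Joe+Smith&age=30';

# Example 1: (.*) is greedy, so it runs to the end of the string
# and swallows "&age=30" as well.
my ($greedy) = $query =~ /name=(.*)/;

# Example 3: the negated class [^&] stops at the first "&".
my ($name) = $query =~ /name=([^&]*)/;

# Example 4: two sets of parentheses capture two values
# (the same text that would be left in $1 and $2).
my ($n, $age) = $query =~ /name=([^&]+)&age=(.*)$/;

# Perl 5 only: (.*?) is non-greedy, so it takes as few characters
# as possible while still letting the following "&" match.
my ($lazy) = $query =~ /name=(.*?)&/;

print "greedy: $greedy\n";   # greedy: Joe+Smith&age=30
print "name:   $name\n";     # name:   Joe+Smith
print "pair:   $n, $age\n";  # pair:   Joe+Smith, 30
print "lazy:   $lazy\n";     # lazy:   Joe+Smith
```

For splitting query-string fields, the negated class in Example 3 is usually the safer choice, since it cannot run past the & separator regardless of what follows.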