|
-
regular characters
- All characters except ., |, (, ), [, \, ^, {, +, $, *, and ? match
themselves. To match one of these characters, precede it with a
backslash.
-
^
- Matches the beginning of a line.
-
$
- Matches the end of a line.
-
\A
- Matches the beginning of the string.
-
\z
- Matches the end of the string.
-
\Z
- Matches the end of the string unless the string
ends with a "\n", in
which case it matches just before the "\n".
-
\b, \B
- Match word boundaries and nonword boundaries respectively.
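
The difference between the line anchors and the string anchors is easiest to see on a string that contains more than one line. The following is a minimal sketch; the sample strings and variable names are made up for illustration.

```ruby
# Illustrative two-line string that ends with a newline.
text = "first line\nsecond line\n"

line_starts  = text.scan(/^\w+/)    # ^ matches at the start of every line  → ["first", "second"]
line_ends    = text.scan(/\w+$/)    # $ matches at the end of every line    → ["line", "line"]
string_start = text.scan(/\A\w+/)   # \A matches only at the very start     → ["first"]

strict_end  = text =~ /line\z/      # \z fails because the string ends in "\n" → nil
lenient_end = text =~ /line\Z/      # \Z matches just before the final "\n"    → 18

# \b requires a word boundary, so the "cat" inside "concat" is skipped.
whole_word = "concat cat" =~ /\bcat\b/   # → 7
# \B requires a non-boundary, so only the embedded "cat" qualifies.
embedded   = "concat cat" =~ /\Bcat/     # → 3
```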
-
[characters]
- A character class matches any single character between the
brackets. The characters |, (, ), [, ^, $, *, and ?,
which have special meanings elsewhere in patterns, lose their
special significance between brackets. The sequences
\nnn, \xnn, \cx, \C-x, \M-x, and \M-\C-x
have the meanings shown in Table 18.2 on page 203. The
sequences \d, \D, \s, \S, \w, and \W are abbreviations for groups of characters, as
shown in Table 5.1 on page 59. The sequence c1-c2
represents all the characters between c1 and c2, inclusive.
Literal ] or - characters must appear immediately after
the opening bracket. A caret (^) immediately following the
opening bracket negates the sense of the match: the pattern matches
any single character that isn't in the character class.
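
A few character classes in action may help. This is only a sketch with made-up sample strings; it covers plain lists, ranges, negation, and the loss of special meaning inside brackets.

```ruby
# Any single character listed between the brackets.
vowels = "regular expression".scan(/[aeiou]/)
# → ["e", "u", "a", "e", "e", "i", "o"]

# Ranges written c1-c2 are inclusive at both ends.
hex_digits = "0x1F".scan(/[0-9A-F]/)      # → ["0", "1", "F"]

# A ^ immediately after the opening bracket negates the class.
non_digits = "a1b2".scan(/[^0-9]/)        # → ["a", "b"]

# |, * and ? are ordinary characters inside brackets.
specials = "a|b*c?".scan(/[|*?]/)         # → ["|", "*", "?"]

# Abbreviations such as \d can be combined with other characters.
id_chars = "x_1-y".scan(/[\d_]/)          # → ["_", "1"]
```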
-
\d, \s, \w
- Are abbreviations for character classes that match digits, whitespace,
and word characters, respectively. \D, \S, and \W match
characters that are not digits, whitespace, or word
characters, respectively. These abbreviations are summarized in Table
5.1 on page 59.
-
. (period)
- Appearing outside brackets, matches any character except a newline.
(With the /m option, it matches newline, too.)
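
The abbreviations and the period can be tried out as follows. This is a small sketch; the input strings are invented for the example.

```ruby
digits   = "room 42, floor 7".scan(/\d+/)   # runs of digits          → ["42", "7"]
words    = "hello, world".scan(/\w+/)       # runs of word characters → ["hello", "world"]
no_space = "a b\tc".scan(/\S/)              # anything but whitespace → ["a", "b", "c"]

# By default the period does not match a newline...
dot_default   = "a\nb" =~ /a.b/             # → nil
# ...but with the /m option it does.
dot_multiline = "a\nb" =~ /a.b/m            # → 0
```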
-
re*
- Matches zero or more occurrences of re.
-
re+
- Matches one or more occurrences of re.
-
re{m,n}
- Matches at least m and at most n occurrences of re.
-
re?
- Matches zero or one occurrence of re.
The *, +, and {m,n} modifiers are greedy by
default. Append a question mark to make them minimal.
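
The repetition modifiers, and the difference between greedy and minimal matching, can be sketched like this. The sample strings are illustrative only.

```ruby
zero_or_more = "ac" =~ /ab*c/       # b* accepts zero b's          → 0
one_or_more  = "ac" =~ /ab+c/       # b+ needs at least one b      → nil
bounded      = "aaaaa"[/a{2,3}/]    # at least 2, at most 3 a's    → "aaa"

# u? makes the "u" optional, so both spellings match.
optional = %w[color colour].all? { |w| w =~ /\Acolou?r\z/ }   # → true

html    = "<b>bold</b>"
greedy  = html[/<.+>/]     # .+ grabs as much as it can    → "<b>bold</b>"
minimal = html[/<.+?>/]    # .+? grabs as little as it can → "<b>"
```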
-
re1|re2
- Matches either re1 or re2.
The | operator has a low precedence.
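
Because | binds so loosely, everything to its left and everything to its right forms a complete alternative unless parentheses say otherwise. A brief sketch, using an invented sample string:

```ruby
sky = "the red angry sky"

# Without grouping, the alternatives are "red ball" and "angry sky".
low_precedence = sky[/red ball|angry sky/]     # → "angry sky"

# With grouping, only "ball" and "angry" alternate.
grouped = sky[/red (ball|angry) sky/]          # → "red angry sky"
```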
-
(...)
- Parentheses are used to group regular expressions. For example, the
pattern /abc+/ matches a string containing an "a", a "b",
and one or more "c"s. /(abc)+/ matches one or more sequences
of "abc". Parentheses are also used to collect the results of
pattern matching. For each opening parenthesis, Ruby stores the
result of the partial match between it and the corresponding closing
parenthesis as successive groups. Within the same pattern,
\1 refers to the match of the first group, \2 the
second group, and so on. Outside the pattern, the special variables
$1, $2, and so on, serve the same purpose.
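
Grouping, backreferences inside a pattern, and the $1, $2 variables outside it can be illustrated as follows. This is a minimal sketch; the strings and the variable names hour, minute, and meridiem are chosen purely for the example.

```ruby
repeated_c   = "abccc"[/abc+/]          # + applies to "c" only          → "abccc"
repeated_abc = "abcabcabc"[/(abc)+/]    # + applies to the whole group   → "abcabcabc"

# \1 inside the pattern refers back to whatever group 1 matched.
doubled = "hello"[/(\w)\1/]             # a character followed by itself → "ll"

# After a successful match, $1, $2, ... hold the groups.
hour = minute = meridiem = nil
if "12:50am" =~ /(\d\d):(\d\d)(..)/
  hour, minute, meridiem = $1, $2, $3   # → "12", "50", "am"
end
```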
|
|