Thinking in Java

Quantifiers

A quantifier describes the way that a pattern absorbs input text:

  • Greedy: Quantifiers are greedy unless otherwise altered. A greedy expression matches as much of the input as it can while still allowing the overall pattern to match. A typical cause of problems is to assume that your pattern will match only the first possible group of characters, when it’s actually greedy and will keep going.
  • Reluctant: Specified with a question mark after the quantifier, this form matches the minimum number of characters necessary to satisfy the pattern. Also called lazy, minimal matching, non-greedy, or ungreedy.
  • Possessive: Specified with a plus sign after the quantifier. When this was written it was available only in Java; some other engines (PCRE and Perl 5.10 and later, for example) have since added it. It is more advanced, so you probably won’t use it right away. As a regular expression is applied to a string, it generates many states so that it can backtrack if the match fails. Possessive quantifiers do not keep those intermediate states, and thus prevent backtracking. They can be used to prevent a regular expression from running away and also to make it execute more efficiently.
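The three behaviors are easiest to see side by side. The following is a minimal sketch, not taken from the book: the class name QuantifierModes, the helper firstMatch(), and the sample input are all chosen here for illustration. It applies the same pattern in its greedy, reluctant, and possessive forms to one input string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch: QuantifierModes and firstMatch() are hypothetical names.
class QuantifierModes {
  // Returns the first substring of input matched by regex, or null if nothing matches.
  static String firstMatch(String regex, String input) {
    Matcher m = Pattern.compile(regex).matcher(input);
    return m.find() ? m.group() : null;
  }

  public static void main(String[] args) {
    String input = "<a><b>";
    // Greedy: .+ grabs everything, then backs off one character so '>' can match.
    System.out.println(firstMatch("<.+>", input));   // <a><b>
    // Reluctant: .+? stops at the first place where '>' can match.
    System.out.println(firstMatch("<.+?>", input));  // <a>
    // Possessive: .++ grabs everything and never gives any of it back,
    // so the trailing '>' can never match.
    System.out.println(firstMatch("<.++>", input));  // null
  }
}
```

The possessive result shows why this form needs care: because it refuses to backtrack, a pattern that would succeed in greedy form can fail outright.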

Greedy    Reluctant   Possessive   Matches
X?        X??         X?+          X, one or none
X*        X*?         X*+          X, zero or more
X+        X+?         X++          X, one or more
X{n}      X{n}?       X{n}+        X, exactly n times
X{n,}     X{n,}?      X{n,}+       X, at least n times
X{n,m}    X{n,m}?     X{n,m}+      X, at least n but not more than m times
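As a rough illustration of the counted forms in the table, the sketch below collects every match of a few patterns against short runs of the letter ‘a’. The class name BoundedQuantifiers, the helper allMatches(), and the inputs are assumptions made for this example only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch: BoundedQuantifiers and allMatches() are hypothetical names.
class BoundedQuantifiers {
  // Collects every successive match of regex in input, in order.
  static List<String> allMatches(String regex, String input) {
    List<String> found = new ArrayList<>();
    Matcher m = Pattern.compile(regex).matcher(input);
    while (m.find()) {
      found.add(m.group());
    }
    return found;
  }

  public static void main(String[] args) {
    System.out.println(allMatches("a{2}", "aaaaa"));    // [aa, aa]
    System.out.println(allMatches("a{2,}", "aaaaa"));   // [aaaaa]
    System.out.println(allMatches("a{2,3}", "aaaaa"));  // [aaa, aa]  greedy takes up to 3
    System.out.println(allMatches("a{2,3}?", "aaaaa")); // [aa, aa]   reluctant takes only 2
    // Greedy gives one 'a' back so the trailing 'a' can match; possessive does not.
    System.out.println(allMatches("a{2,3}a", "aaa"));   // [aaa]
    System.out.println(allMatches("a{2,3}+a", "aaa"));  // []
  }
}
```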

Be aware that the expression ‘X’ will often need to be surrounded by parentheses for it to work the way you intend. For example:

abc+


might seem as though it would match the sequence ‘abc’ one or more times, and if you apply it to the input string ‘abcabcabc’, you will in fact get three matches (three separate occurrences of ‘abc’, not one match covering the whole string). However, the expression actually says “match ‘ab’ followed by one or more occurrences of ‘c’.” To match the entire string ‘abc’ one or more times, you must say:

(abc)+
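The difference between the two patterns can be checked directly. This is a small sketch under stated assumptions: the class name GroupingDemo, the helper allMatches(), and the second input ‘abccc’ are introduced here and are not part of the original text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch: GroupingDemo and allMatches() are hypothetical names.
class GroupingDemo {
  // Collects every successive match of regex in input, in order.
  static List<String> allMatches(String regex, String input) {
    List<String> found = new ArrayList<>();
    Matcher m = Pattern.compile(regex).matcher(input);
    while (m.find()) {
      found.add(m.group());
    }
    return found;
  }

  public static void main(String[] args) {
    // Without parentheses, + applies only to 'c': three separate matches.
    System.out.println(allMatches("abc+", "abcabcabc"));   // [abc, abc, abc]
    // With parentheses, + applies to the whole group: one long match.
    System.out.println(allMatches("(abc)+", "abcabcabc")); // [abcabcabc]
    // An input with repeated 'c' makes the difference more obvious.
    System.out.println(allMatches("abc+", "abccc"));       // [abccc]
    System.out.println(allMatches("(abc)+", "abccc"));     // [abc]
  }
}
```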


You can easily be fooled when using regular expressions; they are a new language, layered on top of Java.

CharSequence

JDK 1.4 introduced an interface called CharSequence, which defines a character sequence abstracted from the String and StringBuffer classes:

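interface CharSequence {
  char charAt(int i);                            // character at index i
  int length();                                  // number of characters in the sequence
  CharSequence subSequence(int start, int end);  // from start (inclusive) to end (exclusive)
  String toString();                             // the sequence as a String
}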


The String, StringBuffer, and CharBuffer classes were modified to implement the CharSequence interface (StringBuilder, added in Java 5, implements it as well). Many regular expression operations take CharSequence arguments.
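Because Pattern.matcher() accepts a CharSequence, the same pattern can be applied to any of these classes without conversion. A minimal sketch follows; the class name CharSequenceDemo, the field REPEATED_ABC, and the helper matchesWhole() are hypothetical and exist only for this example.

```java
import java.nio.CharBuffer;
import java.util.regex.Pattern;

// Illustrative sketch: CharSequenceDemo and matchesWhole() are hypothetical names.
class CharSequenceDemo {
  static final Pattern REPEATED_ABC = Pattern.compile("(abc)+");

  // Accepts any CharSequence; returns true if the entire sequence matches.
  static boolean matchesWhole(CharSequence text) {
    return REPEATED_ABC.matcher(text).matches();
  }

  public static void main(String[] args) {
    System.out.println(matchesWhole("abcabc"));                   // String: true
    System.out.println(matchesWhole(new StringBuffer("abcabc"))); // StringBuffer: true
    System.out.println(matchesWhole(CharBuffer.wrap("abcabc")));  // CharBuffer: true
    System.out.println(matchesWhole("abcab"));                    // false
  }
}
```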
Reproduced courtesy of Bruce Eckel, MindView, Inc.