Thinking in C++ Vol 2 - Practical Programming
Comparing strings is inherently different from comparing
numbers. Numbers have constant, universally meaningful values. To evaluate the
relationship between the magnitudes of two strings, you must make a lexical
comparison. Lexical comparison means that when you test a character to see
if it is greater than or less than another character, you are actually
comparing the numeric representations of those characters as specified in the
collating sequence of the character set being used. Most often this will be the
ASCII collating sequence, which assigns the printable characters for the
English language numbers in the range 32 through 126 decimal. In the ASCII
collating sequence, the first printable character is the space, followed by
several common punctuation marks, the digits, and then the uppercase and
lowercase letters. Within each case, the letters nearer the front of the
alphabet have lower ASCII values than those nearer the end, and every uppercase
letter has a lower value than any lowercase letter. With these details in mind,
it becomes easier to remember that when a lexical comparison reports that s1
is greater than s2, it simply means that when the two were compared,
the first differing character in s1 came later in the collating sequence than
the character in that same position in s2.
C++ provides several ways to compare strings, and each has
advantages. The simplest to use are the nonmember, overloaded operator
functions: operator ==, operator !=, operator >, operator
<, operator >=, and operator <=.
//: C03:CompStr.h
#ifndef COMPSTR_H
#define COMPSTR_H
#include <string>
#include "../TestSuite/Test.h"
using std::string;
class CompStrTest : public TestSuite::Test {
public:
  void run() {
    // Strings to compare
    string s1("This");
    string s2("That");
    test_(s1 == s1);
    test_(s1 != s2);
    test_(s1 > s2);
    test_(s1 >= s2);
    test_(s1 >= s1);
    test_(s2 < s1);
    test_(s2 <= s1);
    test_(s1 <= s1);
  }
};
#endif // COMPSTR_H ///:~
//: C03:CompStr.cpp
//{L} ../TestSuite/Test
#include "CompStr.h"
int main() {
  CompStrTest t;
  t.run();
  return t.report();
} ///:~
The overloaded comparison operators are useful for comparing
both full strings and individual string character elements.
Notice in the following example the flexibility of argument
types on both the left and right side of the comparison operators. For
efficiency, the string class provides overloaded operators for the
direct comparison of string objects, quoted literals, and pointers to C-style
strings without having to create temporary string objects.
//: C03:Equivalence.cpp
#include <iostream>
#include <string>
using namespace std;
int main() {
  string s2("That"), s1("This");
  // The left operand is a quoted literal
  // and the right operand is a string:
  if("That" == s2)
    cout << "A match" << endl;
  // The left operand is a string and the right is
  // a pointer to a C-style null terminated string:
  if(s1 != s2.c_str())
    cout << "No match" << endl;
} ///:~
The c_str( ) function returns a const char*
that points to a C-style, null-terminated string equivalent to the contents of
the string object. This comes in handy when you want to pass a string to
a standard C function, such as atoi( ) or any of the functions
defined in the <cstring> header. It is an error to use the value
returned by c_str( ) as a non-const argument to any function,
because the characters it points to must not be modified.
You won't find the logical not (!) or the logical
comparison operators (&& and ||) among operators for a
string. (Neither will you find overloaded versions of the bitwise C operators &,
|, ^, or ~.) The overloaded nonmember comparison operators
for the string class are limited to the subset that has clear, unambiguous
application to single characters or groups of characters.
The compare( ) member function offers you a
great deal more sophisticated and precise comparison than the nonmember
operator set. It provides overloaded versions to compare:
Two complete strings.
Part of either string to a complete string.
Subsets of two strings.
The following example compares complete strings:
//: C03:Compare.cpp
// Demonstrates compare() and swap().
#include <cassert>
#include <string>
using namespace std;
int main() {
  string first("This");
  string second("That");
  assert(first.compare(first) == 0);
  assert(second.compare(second) == 0);
  // Which is lexically greater?
  assert(first.compare(second) > 0);
  assert(second.compare(first) < 0);
  first.swap(second);
  assert(first.compare(second) < 0);
  assert(second.compare(first) > 0);
} ///:~
The swap( ) function in this example does what
its name implies: it exchanges the contents of its object and argument. To
compare a subset of the characters in one or both strings, you add arguments
that define where to start the comparison and how many characters to consider.
For example, we can use the following overloaded version of compare( ):
s1.compare(s1StartPos, s1NumberChars, s2, s2StartPos,
s2NumberChars);
Here's an example:
//: C03:Compare2.cpp
// Illustrate overloaded compare().
#include <cassert>
#include <string>
using namespace std;
int main() {
  string first("This is a day that will live in infamy");
  string second("I don't believe that this is what "
                "I signed up for");
  // Compare "his is " in both strings:
  assert(first.compare(1, 7, second, 22, 7) == 0);
  // Compare "his is a " to "his is wh":
  assert(first.compare(1, 9, second, 22, 9) < 0);
} ///:~
In the examples so far, we have used C-style array indexing
syntax to refer to an individual character in a string. C++ strings provide an
alternative to the s[n] notation: the at( ) member. These two indexing mechanisms produce the same result in C++ if all goes well:
//: C03:StringIndexing.cpp
#include <cassert>
#include <string>
using namespace std;
int main() {
  string s("1234");
  assert(s[1] == '2');
  assert(s.at(1) == '2');
} ///:~
There is one important difference, however, between [ ]
and at( ). When you try to reference an array element that is out
of bounds, at( ) will do you the kindness of throwing an exception,
while ordinary [ ] subscripting syntax will leave you to your own
devices:
//: C03:BadStringIndexing.cpp
#include <exception>
#include <iostream>
#include <string>
using namespace std;
int main() {
  string s("1234");
  // at() saves you by throwing an exception:
  try {
    s.at(5);
  } catch(exception& e) {
    cerr << e.what() << endl;
  }
} ///:~
Responsible programmers will not use errant indexes, but
should you want the benefits of automatic index checking, using at( ) in
place of [ ] will give you a chance to gracefully recover from
references to array elements that don't exist. Executing this program prints
the string returned by e.what( ); the exact wording of that message is
implementation-defined and varies among compilers.
The at( ) member throws an object of class out_of_range,
which derives (ultimately) from std::exception. By catching this object
in an exception handler, you can take appropriate remedial actions such as
recalculating the offending subscript or growing the array. Using string::operator[ ]( )
gives no such protection and is as dangerous as char array processing in
C.