The basic object
Step one is exactly that. C++ functions
can be placed inside structs as
“member functions.”
Here’s what it looks like after converting
the C version of CStash to the C++ Stash:
//: C04:CppLib.h
// C-like library converted to C++
struct Stash {
  int size;      // Size of each space
  int quantity;  // Number of storage spaces
  int next;      // Next empty space
  // Dynamically allocated array of bytes:
  unsigned char* storage;
  // Functions!
  void initialize(int size);
  void cleanup();
  int add(const void* element);
  void* fetch(int index);
  int count();
  void inflate(int increase);
}; ///:~
First, notice there is no
typedef. Instead of
requiring you to create a typedef, the C++ compiler turns the name of the
structure into a new type name for the program (just as int, char,
float and double are type names).
All the data members are exactly the same
as before, but now the functions are inside the body of the struct. In
addition, notice that the first argument from the C version of the library has
been removed. In C++, instead of forcing you to pass the
address of the structure as the first argument to all the functions that operate
on that structure, the compiler secretly does this for you. Now the only
arguments for the functions are concerned with what the function does,
not the mechanism of the function’s operation.
It’s important to realize that the
function code is effectively the same as it was with the C version of the
library. The number of arguments is the same (even though you don’t see
the structure address being passed in, it’s still there), and
there’s only one function body for each function. That is, just because
you say
Stash A, B, C;
doesn’t mean you get a different
add( ) function for each variable.
So the code that’s generated is
almost identical to what you would have written for the C version of the
library. Interestingly enough, this includes the
“name decoration” you probably would have done to produce
Stash_initialize( ), Stash_cleanup( ), and so on. When
the function name is inside the struct, the compiler effectively does the
same thing. Therefore, initialize( ) inside the structure
Stash will not collide with a function named initialize( )
inside any other structure, or even a global function named
initialize( ). Most of the time you don’t have to worry about
the function name decoration – you use the undecorated name. But sometimes
you do need to be able to specify that this initialize( ) belongs to
the struct Stash, and not to any other struct. In
particular, when you’re defining the function you need to fully specify
which one it is. To accomplish this full specification, C++ has an operator
(::) called the
scope
resolution operator (named so because names can now be in different scopes:
at global scope or within the scope of a struct). For example, if you
want to specify initialize( ), which belongs to Stash, you
say Stash::initialize(int size). You can see how the scope resolution
operator is used in the function definitions:
//: C04:CppLib.cpp {O}
// C library converted to C++
// Declare structure and functions:
#include "CppLib.h"
#include <iostream>
#include <cassert>
using namespace std;
// Quantity of elements to add
// when increasing storage:
const int increment = 100;

void Stash::initialize(int sz) {
  size = sz;
  quantity = 0;
  storage = 0;
  next = 0;
}

int Stash::add(const void* element) {
  if(next >= quantity) // Enough space left?
    inflate(increment);
  // Copy element into storage,
  // starting at next empty space:
  int startBytes = next * size;
  unsigned char* e = (unsigned char*)element;
  for(int i = 0; i < size; i++)
    storage[startBytes + i] = e[i];
  next++;
  return(next - 1); // Index number
}

void* Stash::fetch(int index) {
  // Check index boundaries:
  assert(0 <= index);
  if(index >= next)
    return 0; // To indicate the end
  // Produce pointer to desired element:
  return &(storage[index * size]);
}

int Stash::count() {
  return next; // Number of elements in Stash
}

void Stash::inflate(int increase) {
  assert(increase > 0);
  int newQuantity = quantity + increase;
  int newBytes = newQuantity * size;
  int oldBytes = quantity * size;
  unsigned char* b = new unsigned char[newBytes];
  for(int i = 0; i < oldBytes; i++)
    b[i] = storage[i]; // Copy old to new
  delete []storage; // Old storage
  storage = b; // Point to new memory
  quantity = newQuantity;
}

void Stash::cleanup() {
  if(storage != 0) {
    cout << "freeing storage" << endl;
    delete []storage;
  }
} ///:~
There are several other things that are
different between C and C++. First, the declarations in the header
files are required by the
compiler. In C++ you cannot call a function without declaring it first. The
compiler will issue an error message otherwise. This is an important way to
ensure that function calls are consistent between the point where they are
called and the point where they are defined. By forcing you to
declare the function before you
call it, the C++ compiler virtually ensures that you will perform this
declaration by including the header file. If you also include the same header
file in the place where the functions are defined, then the compiler checks to
make sure that the declaration in the header and the function definition match
up. This means that the header file becomes a validated repository for function
declarations and ensures that functions are used consistently throughout all
translation units in the project.
Of course, global functions
can still be declared by hand
every place where they are defined and used. (This is so tedious that it becomes
very unlikely.) However, structures must always be declared before they are
defined or used, and the most convenient place to put a
structure
definition is in a header file, except for those you intentionally hide in a
file.
You can see that all the member functions
look almost the same as when they were C functions, except for the scope
resolution and the fact that the first argument from the C version of the
library is no longer explicit. It’s still there, of course, because the
function has to be able to work on a particular struct variable. But
notice, inside the member function, that the member selection is also gone!
Thus, instead of saying s->size = sz; you say size = sz;
and eliminate the tedious s->, which didn’t really add
anything to the meaning of what you were doing anyway. The C++ compiler is
apparently doing this for you. Indeed, it is taking the “secret”
first argument (the address of the structure that we were previously passing in
by hand) and applying the member selector whenever you refer to one of the data
members of a struct.
This
means that whenever you are inside a member function of a struct,
you can refer to any member (including another member function) by simply giving
its name. The compiler will search through the local structure’s names
before looking for a global version of that name. You’ll find that this
feature means that not only is your code easier to write, it’s a lot
easier to read.
But what if, for some reason, you
want to be able to get your hands on the address of the structure? In the
C version of the library it was easy because each function’s first
argument was a CStash* called s. In C++, things are even more
consistent. There’s a special keyword, called
this, which produces the
address of the struct. It’s the equivalent of the
‘s’ in the C version of the library. So we can revert to the
C style of things by saying
this->size = sz;
The code generated by the compiler is
exactly the same, so you don’t need to use this in such a fashion;
occasionally, you’ll see code where people explicitly use this->
everywhere but it doesn’t add anything to the meaning of the code and
often indicates an inexperienced programmer. Usually, you don’t use
this often, but when you need it, it’s there (some of the examples
later in the book will use this).
There’s one last item to mention.
In C, you could assign a void* to any other pointer like
this:
int i = 10;
void* vp = &i; // OK in both C and C++
int* ip = vp; // Only acceptable in C
and
there was no complaint from the compiler. But in C++, this statement is not
allowed. Why? Because C is not so particular about type information, so it
allows you to assign a pointer with an unspecified type to a pointer with a
specified type. Not so with C++. Type is critical in C++, and the compiler
stamps its foot when there are any violations of type information. This has
always been important, but it is especially important in C++ because you have
member functions in structs. If you could pass pointers to structs
around with impunity in C++, then you could end up calling a member function for
a struct that doesn’t even logically exist for that struct!
A real recipe for disaster. Therefore, while C++ allows the assignment of any
type of pointer to a void* (this was the original
intent of void*, which is required to be large enough to hold a pointer
to any type), it will not allow you to assign a void pointer to
any other type of pointer. A cast is always required to tell the reader and the
compiler that you really do want to treat it as the destination type.
This brings up an interesting issue. One
of the important goals for C++ is to compile as much existing C code as possible
to allow for an easy transition to the new language. However, this doesn’t
mean any code that C allows will automatically be allowed in C++.
There
are a number of things the C compiler lets you get away with that are dangerous
and error-prone. (We’ll look at them as the book progresses.) The C++
compiler generates warnings and errors for these situations. This is often much
more of an advantage than a hindrance. In fact, there are many situations in
which you are trying to run down an error in C and just can’t find it, but
as soon as you recompile the program in C++, the
compiler points out the problem! In C, you’ll often find that you can get
the program to compile, but then you have to get it to work. In C++, when the
program compiles correctly, it often works, too! This is because the language is
a lot stricter about type.
You can see a number of new things in the
way the C++ version of Stash is used in the following test
program:
//: C04:CppLibTest.cpp
//{L} CppLib
// Test of C++ library
#include "CppLib.h"
#include "../require.h"
#include <cstring>
#include <fstream>
#include <iostream>
#include <string>
using namespace std;

int main() {
  Stash intStash;
  intStash.initialize(sizeof(int));
  for(int i = 0; i < 100; i++)
    intStash.add(&i);
  for(int j = 0; j < intStash.count(); j++)
    cout << "intStash.fetch(" << j << ") = "
         << *(int*)intStash.fetch(j)
         << endl;
  // Holds 80-character strings:
  Stash stringStash;
  const int bufsize = 80;
  stringStash.initialize(sizeof(char) * bufsize);
  ifstream in("CppLibTest.cpp");
  assure(in, "CppLibTest.cpp");
  string line;
  char buf[bufsize];
  while(getline(in, line)) {
    // add() copies exactly bufsize bytes, so first copy each
    // line into a zero-padded buffer of that size:
    strncpy(buf, line.c_str(), bufsize - 1);
    buf[bufsize - 1] = 0;
    stringStash.add(buf);
  }
  int k = 0;
  char* cp;
  // Increment k after printing so the index shown
  // is the one that was actually fetched:
  while((cp = (char*)stringStash.fetch(k)) != 0) {
    cout << "stringStash.fetch(" << k << ") = "
         << cp << endl;
    k++;
  }
  intStash.cleanup();
  stringStash.cleanup();
} ///:~
One thing you’ll notice is that the
variables are all defined “on the fly” (as introduced in the
previous chapter). That is, they are defined at any point in the scope, rather
than being restricted – as in C – to the beginning of the
scope.
The code is quite similar to
CLibTest.cpp, but when a member function is called, the call occurs using
the member selection operator
‘.’ preceded
by the name of the variable. This is a convenient syntax because it mimics the
selection of a data member of the structure. The difference is that this is a
function member, so it has an argument list.
Of course, the call that the compiler
actually generates looks much more like the original C library function.
Thus, considering name
decoration
and the passing of this, the C++ function call
intStash.initialize(sizeof(int)) becomes something like
Stash_initialize(&intStash, sizeof(int)). If you ever wonder
what’s going on underneath the covers, remember that the
original
C++ compiler cfront from AT&T produced C code as its output, which
was then compiled by the underlying C compiler. This approach meant that
cfront could be quickly ported to any machine that had a C compiler, and
it helped to rapidly disseminate C++ compiler technology. But because the C++
compiler had to generate C, you know that there must be some way to represent
C++ syntax in C (some compilers still allow you to produce C
code).
There’s one other change from
CLibTest.cpp, which is the introduction of the
require.h header file. This is a header file that
I created for this book to perform more sophisticated error checking than that
provided by assert( ). It contains several functions, including the
one used here called assure( ), which is
used for files. This function checks to see if the file has successfully been
opened, and if not it reports to standard error that the file could not be
opened (thus it needs the name of the file as the second argument) and exits the
program. The require.h functions will be used throughout the book, in
particular to ensure that there are the right number of command-line arguments
and that files are opened properly. The require.h functions replace
repetitive and distracting error-checking code, and yet they still provide
useful error messages. These functions will be fully explained later in the
book.