Interchangeable objects with polymorphism
When dealing with type hierarchies, you
often want to treat an object not as the specific type that it is but instead as
its base type. This allows you to write code that doesn’t depend on
specific types. In the shape example, functions manipulate generic shapes
without respect to whether they’re circles, squares, triangles, and so on.
All shapes can be drawn, erased, and moved, so these functions simply send a
message to a shape object; they don’t worry about how the object copes
with the message.
Such code is unaffected by the addition
of new types, and adding new types is the most common way to extend an
object-oriented program to handle new situations. For example, you can derive a
new subtype of shape called pentagon without modifying the functions that
deal only with generic shapes. This ability to extend a program easily by
deriving new subtypes is important because it greatly improves designs while
reducing the cost of software maintenance.
There’s a problem, however, with
attempting to treat derived-type objects as their generic base types (circles as
shapes, bicycles as vehicles, cormorants as birds, etc.). If a function is going
to tell a generic shape to draw itself, or a generic vehicle to steer, or a
generic bird to move, the compiler cannot know at compile-time precisely what
piece of code will be executed. That’s the whole point – when the
message is sent, the programmer doesn’t want to know what piece of
code will be executed; the draw function can be applied equally to a circle, a
square, or a triangle, and the object will execute the proper code depending on
its specific type. If you don’t have to know what piece of code will be
executed, then when you add a new subtype, the code it executes can be different
without requiring changes to the function call. Therefore, the compiler cannot
know precisely what piece of code is executed, so what does it do? For example,
in the following diagram the BirdController object just works with
generic Bird objects, and does not know what exact type they are. This is
convenient from BirdController’s perspective, because no
special code has to be written to determine the exact type of
Bird it’s working with, or that Bird’s behavior. So
how does it happen that, when move( ) is called while ignoring the
specific type of Bird, the right behavior will occur (a Goose
runs, flies, or swims, and a Penguin runs or swims)?
The answer is the primary twist in
object-oriented programming: The compiler cannot make a function call in the
traditional sense. The function call generated by a non-OOP compiler causes what
is called early binding, a
term you may not have heard before because you’ve never thought about it
any other way. It means the compiler generates a call to a specific function
name, and the linker resolves this call to the absolute address of the code to
be executed. In OOP, the program cannot determine the address of the code until
runtime, so some other scheme is necessary when a message is sent to a generic
object.
To solve the problem, object-oriented
languages use the concept of
late binding. When you send
a message to an object, the code being called isn’t determined until
runtime. The compiler does ensure that the function exists and performs type
checking on the arguments and return value (a language in which this isn’t
true is called weakly
typed), but it doesn’t know the exact code to
execute.
To perform late binding, the C++ compiler
inserts a special bit of code in lieu of the absolute call. This code calculates
the address of the function body, using information stored in the object (this
process is covered in great detail in Chapter 15). Thus, each object can behave
differently according to the contents of that special bit of code. When you send
a message to an object, the object actually does figure out what to do with that
message.
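The idea of “information stored in the object” can be imitated by hand. The following is only a conceptual analogy, not the mechanism the compiler actually generates (that is the subject of Chapter 15): each object carries the address of its own drawing code, and the call site looks that address up at runtime. All names here (Object, DrawFn, sendDraw, and so on) are hypothetical and chosen for illustration.

```cpp
#include <cassert>
#include <string>

// Conceptual analogy only: every name here is hypothetical, and real
// compilers use their own, more elaborate layout.
struct Object;
using DrawFn = std::string (*)(const Object&);

struct Object {
    DrawFn draw;  // "information stored in the object": address of its draw code
};

std::string drawCircle(const Object&) { return "Circle::draw"; }
std::string drawSquare(const Object&) { return "Square::draw"; }

// The call site names no specific function. It reads the address out of
// the object at runtime and calls through it.
std::string sendDraw(const Object& obj) { return obj.draw(obj); }

// Small self-check: the same call site runs different code per object.
void demoLateBinding() {
    Object circle{drawCircle};
    Object square{drawSquare};
    assert(sendDraw(circle) == "Circle::draw");
    assert(sendDraw(square) == "Square::draw");
}
```

Notice that sendDraw( ) was compiled without any knowledge of drawCircle( ) or drawSquare( ); the choice is made only when the program runs.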
You state that you want a function to
have the flexibility of late-binding properties using the
keyword virtual. You
don’t need to understand the mechanics of virtual to use it, but
without it you can’t do object-oriented programming in C++. In C++, you
must remember to add the virtual keyword because, by default, member
functions are not dynamically bound. Virtual functions allow you to
express the differences in behavior of classes in the same family. Those
differences are what cause polymorphic behavior.
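A minimal sketch of the difference the keyword makes, using the Bird example from above. The member functions move( ) and name( ), the describe( ) function, and the returned strings are assumptions made for illustration. Only move( ) is declared virtual, so only move( ) is late-bound when a Goose is handled as a generic Bird.

```cpp
#include <cassert>
#include <string>

class Bird {
public:
    virtual ~Bird() = default;
    // Late-bound: chosen at runtime from the object's actual type.
    virtual std::string move() const { return "Bird moves"; }
    // Early-bound (no virtual): chosen at compile time from the declared type.
    std::string name() const { return "Bird"; }
};

class Goose : public Bird {
public:
    std::string move() const override { return "Goose flies"; }
    // Hides Bird::name() but does not override it, since it is not virtual.
    std::string name() const { return "Goose"; }
};

// Sees only the base type, like BirdController in the text.
std::string describe(const Bird& b) { return b.name() + ": " + b.move(); }

// Small self-check of both kinds of binding.
void demoVirtual() {
    Goose g;
    assert(g.name() == "Goose");                 // called on the exact type
    assert(describe(g) == "Bird: Goose flies");  // called through the base type
}
```

Through the base-type reference, the non-virtual name( ) resolves to the Bird version, while the virtual move( ) still reaches the Goose version.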
Consider the shape example. The family of
classes (all based on the same uniform interface) was diagrammed earlier in the
chapter. To demonstrate polymorphism, we want to write a single piece of code
that ignores the specific details of type and talks only to the base class. That
code is decoupled from type-specific information,
and thus is simpler to write and easier to understand. And, if a new type
– a Hexagon, for example – is added through
inheritance, the code you write will work just as well for the new type of
Shape as it did on the existing types. Thus, the program is
extensible.
Suppose you write a function in C++ (as you
will soon learn how to do):
void doStuff(Shape& s) {
  s.erase();
  // ...
  s.draw();
}
This function speaks to any Shape,
so it is independent of the specific type of object that it’s drawing and
erasing (the ‘&’ means that the object is passed to
doStuff( ) by reference, so the function works on the caller’s object
rather than on a copy, but it’s not
important that you understand the details of that right now). If in some other
part of the program we use the doStuff( ) function:
Circle c;
Triangle t;
Line l;
doStuff(c);
doStuff(t);
doStuff(l);
The calls to doStuff( )
automatically work right, regardless of the exact type of the object.
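For readers who want to try this, here is one way the pieces might fit together. The class bodies are assumptions, since the text has not shown them: instead of real drawing, each member function records its name in a hypothetical global list called messages so that the effect of every call can be inspected.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical log standing in for real drawing code.
std::vector<std::string> messages;

class Shape {
public:
    virtual ~Shape() = default;
    virtual void draw()  { messages.push_back("Shape::draw"); }
    virtual void erase() { messages.push_back("Shape::erase"); }
};

class Circle : public Shape {
public:
    void draw() override  { messages.push_back("Circle::draw"); }
    void erase() override { messages.push_back("Circle::erase"); }
};

class Triangle : public Shape {
public:
    void draw() override  { messages.push_back("Triangle::draw"); }
    void erase() override { messages.push_back("Triangle::erase"); }
};

class Line : public Shape {
public:
    void draw() override  { messages.push_back("Line::draw"); }
    void erase() override { messages.push_back("Line::erase"); }
};

// Exactly the function from the text: it mentions only Shape.
void doStuff(Shape& s) {
    s.erase();
    // ...
    s.draw();
}

// Small self-check that each call reaches the derived-class code.
void demoDoStuff() {
    messages.clear();
    Circle c;
    Triangle t;
    Line l;
    doStuff(c);
    doStuff(t);
    doStuff(l);
    assert(messages.size() == 6);
    assert(messages[0] == "Circle::erase");
    assert(messages[1] == "Circle::draw");
    assert(messages[5] == "Line::draw");
}
```

Nothing named Shape::draw or Shape::erase ever lands in the log, even though doStuff( ) was written purely in terms of Shape.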
This is actually a pretty amazing trick.
Consider the line:
doStuff(c);
What’s happening here is that a
Circle is being passed into a function that’s expecting a
Shape. Since a Circle is a Shape it can be treated
as one by doStuff( ). That is, any message that
doStuff( ) can send to a Shape, a Circle can accept.
So it is a completely safe and logical thing to do.
We call this process of treating a
derived type as though it were its base type
upcasting. The name cast
is used in the sense of casting into a mold and the up comes from the
way the inheritance diagram is
typically arranged, with the base type at the top and the derived classes
fanning out downward. Thus, casting to a base type is moving up the inheritance
diagram: “upcasting.”
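In C++ an upcast needs no special syntax. The sketch below assumes a tiny hypothetical Shape family whose draw( ) returns a string (so the result can be checked) and shows the two usual forms, through a reference and through a pointer.

```cpp
#include <cassert>
#include <string>

class Shape {
public:
    virtual ~Shape() = default;
    virtual std::string draw() const { return "Shape"; }
};

class Circle : public Shape {
public:
    std::string draw() const override { return "Circle"; }
};

class Square : public Shape {
public:
    std::string draw() const override { return "Square"; }
};

// Accepts any Shape; the argument is upcast at the call site.
std::string render(const Shape& s) { return s.draw(); }

// Small self-check of both upcast forms.
void demoUpcast() {
    Circle c;
    Shape& asRef = c;   // upcast through a reference, no cast syntax needed
    Shape* asPtr = &c;  // upcast through a pointer
    assert(asRef.draw() == "Circle");
    assert(asPtr->draw() == "Circle");
    assert(render(c) == "Circle");  // upcast while passing an argument
}
```

In all three cases the object is still a Circle; only the type through which the program looks at it has become Shape.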
An object-oriented program contains some
upcasting somewhere, because that’s how you decouple yourself from knowing
about the exact type you’re working with. Look at the code in
doStuff( ):
s.erase();
// ...
s.draw();
Notice that it doesn’t say
“If you’re a Circle, do this, if you’re a
Square, do that, etc.” If you write that kind of code, which checks
for all the possible types that a Shape can actually be, it’s messy
and you need to change it every time you add a new kind of Shape. Here,
you just say “You’re a shape, I know you can erase( )
and draw( ) yourself, do it, and take care of the details
correctly.”
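To make the contrast concrete, here is a hedged sketch of both styles side by side. The type tag Kind, the struct TaggedShape, and the function names are hypothetical; they stand for the sort of hand-written type checking the paragraph above warns against.

```cpp
#include <cassert>
#include <string>

// Style 1: checking the type by hand through a hypothetical tag.
enum class Kind { Circle, Square };
struct TaggedShape { Kind kind; };

std::string drawByTypeCheck(const TaggedShape& s) {
    switch (s.kind) {  // must be edited every time a new kind is added
        case Kind::Circle: return "Circle::draw";
        case Kind::Square: return "Square::draw";
    }
    return "unknown";
}

// Style 2: let the object take care of the details.
class Shape {
public:
    virtual ~Shape() = default;
    virtual std::string draw() const { return "Shape::draw"; }
};

class Circle : public Shape {
public:
    std::string draw() const override { return "Circle::draw"; }
};

// Added later; drawPolymorphic() below did not have to change.
class Hexagon : public Shape {
public:
    std::string draw() const override { return "Hexagon::draw"; }
};

std::string drawPolymorphic(const Shape& s) { return s.draw(); }

// Small self-check of both styles.
void demoStyles() {
    assert(drawByTypeCheck(TaggedShape{Kind::Circle}) == "Circle::draw");
    Hexagon h;
    assert(drawPolymorphic(h) == "Hexagon::draw");
}
```

Supporting a hexagon in the first style would mean touching both the Kind list and every switch that inspects it; in the second style it only means deriving one more class.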
What’s impressive about the code in
doStuff( ) is that, somehow, the right thing happens. Calling
draw( ) for Circle causes different code to be executed than
when calling draw( ) for a Square or a Line, but when
the draw( ) message is sent to an anonymous Shape, the
correct behavior occurs based on the actual type of the Shape. This is
amazing because, as mentioned earlier, when the C++ compiler is compiling the
code for doStuff( ), it cannot know exactly what types it is dealing
with. So ordinarily, you’d expect it to end up calling the version of
erase( ) and draw( ) for Shape, and not for the
specific Circle, Square, or Line. And yet the right thing
happens because of polymorphism. The compiler and runtime system handle the
details; all you need to know is that it happens and, more importantly, how to
design with it. If a member function is virtual, then when you
send a message to an object, the object will do the right thing, even when
upcasting is involved.