A tiny C-like library
A library usually starts out as a
collection of functions, but if you have used third-party
C libraries you know there’s
usually more to it than that because there’s more to life than
behavior, actions, and functions. There are also
characteristics (blue, pounds, texture, luminance), which
are represented by data. And when you start to deal with a set of
characteristics in C, it is very convenient to clump them together into a
struct, especially if you want to represent more
than one similar thing in your problem space. Then you can make a variable of
this struct for each thing.
Thus, most C libraries have a set of
structs and a set of functions that act on those structs. As an
example of what such a system looks like, consider a programming tool that acts
like an array, but whose size can be established at runtime, when it is created.
I’ll call it a CStash. Although it’s written in C++, it has
the style of what you’d write in C:
//: C04:CLib.h
// Header file for a C-like library
// An array-like entity created at runtime
typedef struct CStashTag {
  int size;      // Size of each space
  int quantity;  // Number of storage spaces
  int next;      // Next empty space
  // Dynamically allocated array of bytes:
  unsigned char* storage;
} CStash;
void initialize(CStash* s, int size);
void cleanup(CStash* s);
int add(CStash* s, const void* element);
void* fetch(CStash* s, int index);
int count(CStash* s);
void inflate(CStash* s, int increase);
///:~
A tag name like CStashTag is
generally used for a struct in case you need to
reference the struct inside itself. For example, when creating a
linked list (each element in your list contains a pointer to the next
element), you need a pointer to the next struct variable, so you need a
way to identify the type of that pointer within the struct body. Also,
you'll almost universally see the typedef as shown
above for every struct in a C library. This is done so you can treat the
struct as if it were a new type and define variables of that
struct like this:
CStash A, B, C;
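To make the linked-list case concrete, here is a minimal sketch. The names NodeTag, Node, and listLength are hypothetical and are not part of CLib. Inside the struct body the typedef name Node has not been declared yet, so the pointer to the next element has to be written using the tag:

```cpp
// Hypothetical example type, not part of CLib.
typedef struct NodeTag {
  int value;
  struct NodeTag* next; // Must use the tag; "Node" isn't declared yet
} Node;

// Count the nodes by following the next pointers
// until a null pointer is reached.
int listLength(const Node* head) {
  int n = 0;
  for(const Node* p = head; p != 0; p = p->next)
    n++;
  return n;
}
```

After the closing brace, the typedef lets you write Node a, b; just as you would write CStash A, B, C;.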
The storage pointer is an
unsigned char*. An unsigned char is the smallest piece of storage
a C compiler supports, although on
some machines it can be the same size as the largest. The number of bits it holds
is implementation dependent (at least eight), but by definition it is exactly one
byte long. You might think that because the
CStash is designed to hold any type of variable, a
void* would be more
appropriate here. However, the purpose is not to treat this storage as a block
of some unknown type, but rather as a block of contiguous
bytes.
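One practical consequence, sketched below under the assumption that you want to step through the storage in byte units: pointer arithmetic on an unsigned char* moves one byte at a time, whereas standard C++ does not allow arithmetic on a void* at all. The helper name elementStart is hypothetical and does not appear in CLib.

```cpp
// Hypothetical helper: where does element number `index` begin
// if every element occupies `size` bytes?
unsigned char* elementStart(unsigned char* storage, int index, int size) {
  return storage + index * size; // Arithmetic is in units of one byte
}

// With a void* the same expression is rejected by a standard compiler:
//   void* v = storage;
//   v + index * size;   // Error: arithmetic on a pointer to void
```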
The source code for the implementation
file (which you may not get if you buy a library commercially – you might
get only a compiled obj or lib or dll, etc.) looks like
this:
//: C04:CLib.cpp {O}
// Implementation of example C-like library
// Declare structure and functions:
#include "CLib.h"
#include <iostream>
#include <cassert>
using namespace std;
// Quantity of elements to add
// when increasing storage:
const int increment = 100;
void initialize(CStash* s, int sz) {
  s->size = sz;
  s->quantity = 0;
  s->storage = 0;
  s->next = 0;
}

int add(CStash* s, const void* element) {
  if(s->next >= s->quantity) //Enough space left?
    inflate(s, increment);
  // Copy element into storage,
  // starting at next empty space:
  int startBytes = s->next * s->size;
  unsigned char* e = (unsigned char*)element;
  for(int i = 0; i < s->size; i++)
    s->storage[startBytes + i] = e[i];
  s->next++;
  return(s->next - 1); // Index number
}

void* fetch(CStash* s, int index) {
  // Check index boundaries:
  assert(0 <= index);
  if(index >= s->next)
    return 0; // To indicate the end
  // Produce pointer to desired element:
  return &(s->storage[index * s->size]);
}

int count(CStash* s) {
  return s->next; // Elements in CStash
}

void inflate(CStash* s, int increase) {
  assert(increase > 0);
  int newQuantity = s->quantity + increase;
  int newBytes = newQuantity * s->size;
  int oldBytes = s->quantity * s->size;
  unsigned char* b = new unsigned char[newBytes];
  for(int i = 0; i < oldBytes; i++)
    b[i] = s->storage[i]; // Copy old to new
  delete [](s->storage); // Old storage
  s->storage = b; // Point to new memory
  s->quantity = newQuantity;
}

void cleanup(CStash* s) {
  if(s->storage != 0) {
    cout << "freeing storage" << endl;
    delete []s->storage;
  }
} ///:~
initialize( ) performs the
necessary setup for struct CStash by setting the internal variables to
appropriate values. Initially, the storage pointer is set to zero –
no initial storage is allocated.
The add( ) function inserts
an element into the CStash at the next available location. First, it
checks to see if there is any available space left. If not, it expands the
storage using the inflate( ) function, described
later.
Because the compiler doesn’t know
the specific type of the variable being stored (all the function gets is a
void*), you can’t just do an assignment, which would certainly be
the convenient thing. Instead, you must copy the variable byte-by-byte. The most
straightforward way to perform the copying is with array indexing. Typically,
there are already data bytes in storage, and this is indicated by the
value of next. To start with the right byte offset, next is
multiplied by the size of each element (in bytes) to produce startBytes.
Then the argument element is cast to an unsigned char* so that it
can be addressed byte-by-byte and copied into the available storage
space. next is incremented so that it indicates the next available piece
of storage, and the function returns the “index number” where the value was
stored, so that the value can be retrieved later by passing this index number to
fetch( ).
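The copying step can be looked at in isolation. The sketch below uses two hypothetical helpers, copyBytes and copyBytesMemcpy, which are not part of CLib: the first mirrors the byte-by-byte loop described above, and the second shows that the Standard C library function memcpy produces the same result. Treat it as an illustration of the technique rather than a replacement for add( ).

```cpp
#include <cstring>

// Hypothetical helper mirroring the loop in add():
// copy `size` bytes from `element` into `dest`, one byte at a time.
void copyBytes(unsigned char* dest, const void* element, int size) {
  const unsigned char* e = static_cast<const unsigned char*>(element);
  for(int i = 0; i < size; i++)
    dest[i] = e[i];
}

// The same effect using memcpy from the Standard C library.
void copyBytesMemcpy(unsigned char* dest, const void* element, int size) {
  std::memcpy(dest, element, size);
}
```

Either version works for any type whose value can be duplicated by copying its raw bytes, which is the assumption CStash makes about everything stored in it.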
fetch( ) asserts that the index is not negative and returns zero if the
index is at or beyond next, which signals the end of the stored elements.
Otherwise it returns the address of the desired
variable, calculated using the index argument. Since index
indicates the number of elements to offset into the CStash, it must
be multiplied by the number of bytes occupied by each piece to produce the
numerical offset in bytes. When this offset is used to index into storage
using array indexing, you don’t get the address, but instead the byte
at the address. To produce the address, you must use the address-of operator
&.
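The difference between the byte and its address is easy to see in a stand-alone sketch. The function elementAt is a hypothetical name used only for illustration; it repeats the address calculation from fetch( ) without the bounds checks.

```cpp
// Hypothetical stand-alone version of the address calculation in fetch().
void* elementAt(unsigned char* storage, int index, int size) {
  // storage[index * size] is a single byte (an unsigned char);
  // applying & produces the address of that byte instead.
  return &(storage[index * size]);
}
```

Because the result is a void*, the caller has to cast it back to the type that was originally stored, for example *(int*)elementAt(raw, 2, sizeof(int)) for storage that holds ints.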
count( ) may look a bit
strange at first to a seasoned C programmer. It seems like a lot of trouble to
go through to do something that would probably be a lot easier to do by hand. If
you have a struct CStash called intStash, for example, it would
seem much more straightforward to find out how many elements it has by saying
intStash.next instead of making a function call (which has overhead),
such as count(&intStash). However, if you wanted to change the
internal representation of CStash and thus the way the count was
calculated, the function call interface allows the necessary flexibility. But
alas, most programmers won’t bother to find out about your
“better” design for the library. They’ll look at the
struct and grab the next value directly, and possibly even change
next without your permission. If only there were some way for the library
designer to have better control over things like this! (Yes, that’s
foreshadowing.)
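The problem can be shown with a cut-down sketch. MiniStash is a hypothetical struct that keeps only the next member, and the count function here is written for it; neither is part of CLib. Nothing in the language stops client code from bypassing count( ) and writing to next directly:

```cpp
// Hypothetical cut-down struct with only the member needed here.
struct MiniStash {
  int next; // Number of elements stored so far
};

// The "official" way to ask how many elements there are.
int count(MiniStash* s) {
  return s->next;
}

// What a client programmer can do anyway: change next directly,
// without the library designer's permission.
void tamper(MiniStash* s) {
  s->next = 99;
}
```

After a call to tamper( ), count( ) reports 99 elements regardless of how many were actually stored.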