Linxutopia - Thinking in C++ - 4: Data Abstraction

Thinking in C++
Prev	Contents / Index	Next

A tiny C-like library

C# Essentials
eBook

$9.99

eBookFrenzy.com

A library usually starts out as a collection of functions, but if you have used third-party C libraries you know there’s usually more to it than that because there’s more to life than behavior, actions, and functions. There are also characteristics (blue, pounds, texture, luminance), which are represented by data. And when you start to deal with a set of characteristics in C, it is very convenient to clump them together into a struct, especially if you want to represent more than one similar thing in your problem space. Then you can make a variable of this struct for each thing.

Thus, most C libraries have a set of structs and a set of functions that act on those structs. As an example of what such a system looks like, consider a programming tool that acts like an array, but whose size can be established at runtime, when it is created. I’ll call it a CStash. Although it’s written in C++, it has the style of what you’d write in C:

//: C04:CLib.h
// Header file for a C-like library
// An array-like entity created at runtime

typedef struct CStashTag {
  int size;      // Size of each space
  int quantity;  // Number of storage spaces
  int next;      // Next empty space
  // Dynamically allocated array of bytes:
  unsigned char* storage;
} CStash;

void initialize(CStash* s, int size);
void cleanup(CStash* s);
int add(CStash* s, const void* element);
void* fetch(CStash* s, int index);
int count(CStash* s);
void inflate(CStash* s, int increase);
///:~

A tag name like CStashTag is generally used for a struct in case you need to reference the struct inside itself. For example, when creating a linked list (each element in your list contains a pointer to the next element), you need a pointer to the next struct variable, so you need a way to identify the type of that pointer within the struct body. Also, you'll almost universally see the typedef as shown above for every struct in a C library. This is done so you can treat the struct as if it were a new type and define variables of that struct like this:

CStash A, B, C;

The storage pointer is an unsigned char*. An unsigned char is the smallest piece of storage a C compiler supports, although on some machines it can be the same size as the largest. It’s implementation dependent, but is often one byte long. You might think that because the CStash is designed to hold any type of variable, a void* would be more appropriate here. However, the purpose is not to treat this storage as a block of some unknown type, but rather as a block of contiguous bytes.

The source code for the implementation file (which you may not get if you buy a library commercially – you might get only a compiled obj or lib or dll, etc.) looks like this:

//: C04:CLib.cpp {O}
// Implementation of example C-like library
// Declare structure and functions:
#include "CLib.h"
#include <iostream>
#include <cassert> 
using namespace std;
// Quantity of elements to add
// when increasing storage:
const int increment = 100;

void initialize(CStash* s, int sz) {
  s->size = sz;
  s->quantity = 0;
  s->storage = 0;
  s->next = 0;
}

int add(CStash* s, const void* element) {
  if(s->next >= s->quantity) //Enough space left?
    inflate(s, increment);
  // Copy element into storage,
  // starting at next empty space:
  int startBytes = s->next * s->size;
  unsigned char* e = (unsigned char*)element;
  for(int i = 0; i < s->size; i++)
    s->storage[startBytes + i] = e[i];
  s->next++;
  return(s->next - 1); // Index number
}

void* fetch(CStash* s, int index) {
  // Check index boundaries:
  assert(0 <= index);
  if(index >= s->next)
    return 0; // To indicate the end
  // Produce pointer to desired element:
  return &(s->storage[index * s->size]);
}

int count(CStash* s) {
  return s->next;  // Elements in CStash
}

void inflate(CStash* s, int increase) {
  assert(increase > 0);
  int newQuantity = s->quantity + increase;
  int newBytes = newQuantity * s->size;
  int oldBytes = s->quantity * s->size;
  unsigned char* b = new unsigned char[newBytes];
  for(int i = 0; i < oldBytes; i++)
    b[i] = s->storage[i]; // Copy old to new
  delete [](s->storage); // Old storage
  s->storage = b; // Point to new memory
  s->quantity = newQuantity;
}

void cleanup(CStash* s) {
  if(s->storage != 0) {
   cout << "freeing storage" << endl;
   delete []s->storage;
  }
} ///:~

initialize( ) performs the necessary setup for struct CStash by setting the internal variables to appropriate values. Initially, the storage pointer is set to zero – no initial storage is allocated.

The add( ) function inserts an element into the CStash at the next available location. First, it checks to see if there is any available space left. If not, it expands the storage using the inflate( ) function, described later.

Because the compiler doesn’t know the specific type of the variable being stored (all the function gets is a void*), you can’t just do an assignment, which would certainly be the convenient thing. Instead, you must copy the variable byte-by-byte. The most straightforward way to perform the copying is with array indexing. Typically, there are already data bytes in storage, and this is indicated by the value of next. To start with the right byte offset, next is multiplied by the size of each element (in bytes) to produce startBytes. Then the argument element is cast to an unsigned char* so that it can be addressed byte-by-byte and copied into the available storage space. next is incremented so that it indicates the next available piece of storage, and the “index number” where the value was stored so that value can be retrieved using this index number with fetch( ).

fetch( ) checks to see that the index isn’t out of bounds and then returns the address of the desired variable, calculated using the index argument. Since index indicates the number of elements to offset into the CStash, it must be multiplied by the number of bytes occupied by each piece to produce the numerical offset in bytes. When this offset is used to index into storage using array indexing, you don’t get the address, but instead the byte at the address. To produce the address, you must use the address-of operator &.

count( ) may look a bit strange at first to a seasoned C programmer. It seems like a lot of trouble to go through to do something that would probably be a lot easier to do by hand. If you have a struct CStash called intStash, for example, it would seem much more straightforward to find out how many elements it has by saying intStash.next instead of making a function call (which has overhead), such as count(&intStash). However, if you wanted to change the internal representation of CStash and thus the way the count was calculated, the function call interface allows the necessary flexibility. But alas, most programmers won’t bother to find out about your “better” design for the library. They’ll look at the struct and grab the next value directly, and possibly even change next without your permission. If only there were some way for the library designer to have better control over things like this! (Yes, that’s foreshadowing.)

Thinking in C++
Prev	Contents / Index	Next