GHashTable is a simple
hash table implementation, providing an associative array
with constant-time lookups. To use the hash table, you
must provide a GHashFunc,
which should return a positive integer when passed a hash
key:
typedef guint (*GHashFunc) (gconstpointer key);
|
Each returned guint
(modulus the size of the table) corresponds to a "slot"
or "bucket" in the hash;
GHashTable handles collisions by storing a linked
list of key-value pairs in each slot. Thus, the guint values returned by your
GHashFunc must be fairly
evenly distributed over the set of possible guint values, or the hash table will
degenerate into a linked list. Your GHashFunc must also be fast, since it
is used for every lookup.
In addition to GHashFunc,
a GCompareFunc is
required to test keys for equality. Somewhat
unpleasantly, GHashTable
does not use GCompareFunc
in the same way GSList
and GTree do, although
the function signature is the same. Here GCompareFunc is expected to be an
equality operator, returning
TRUE if its arguments are equal. It should not be a
qsort()-style comparison function. The key
comparison function is used to find the correct key-value
pair when hash collisions result in more than one pair in
the same hash slot.
To create and destroy a
GHashTable, use the constructor and destructor
listed in Figure 29.
Remember that glib has no way of knowing how to destroy
the data contained in your hash table; it only destroys
the table itself.
Ready-to-use hash and comparison functions are provided
for the most common keys: integers, pointers, and
strings. These are listed in Figure 30. The functions for
integers accept a pointer to a
gint, rather than the
gint itself. If you pass
NULL as the hash function argument to g_hash_table_new(),
g_direct_hash() is used by default. If you pass
NULL as the key equality
function, then simple pointer comparison is used
(equivalent to
g_direct_equal(), but without a function call).
Manipulating the hash is simple. The routines are
summarized in Figure
31. Insertions do not copy
the key or value; these are entered into the table
exactly as you provide them, overwriting any pre-existing
key-value pair with the same key ("same" is defined by
your hash and equality functions, remember). If this is a
problem, you must do a lookup or remove before you
insert. Be especially careful if you dynamically allocate
keys or values.
The simple
g_hash_table_lookup() returns the value it finds
associated with key, or
NULL if there is no
value. Sometimes this won't do. For example, NULL may be a valid value in itself.
If you're using strings as keys, especially dynamically
allocated strings, knowing that a key is in the table
might not be enough; you might want to retrieve the exact
gchar* the hash table is
using to represent key
"foo". A second lookup function is provided for
cases like these.
g_hash_table_lookup_extended() returns TRUE if the lookup succeeded; if it
returns TRUE, it places
the key and value it found in the locations it's given.
GHashTable keeps an
internal array whose size is a prime number. It also
keeps a count of the number of key-value pairs stored in
the table. If the average number of pairs per available
slot drops below 0.3 (or so), the array is made smaller;
if it goes above 3, the array is made larger to reduce
collisions. Resizing happens automatically whenever you
insert or remove pairs from the table. This ensures the
hash table's memory use is optimal. Unfortunately, it is
inefficient to rebuild the hash table over and over if
you're doing a large number of insertions or removals. To
solve the problem, the hash table can be frozen, meaning that resizing is
temporarily suppressed. When you're done adding and
removing items, you simply thaw
the table, resulting in a single optimal-size
calculation. (Be careful though; a frozen table can end
up with many hash collisions if you add large quantities
of data. This should be fine as long as you thaw before
you do any lookups.) The functions are in Figure 32.