MySQL 5.1 Database Reference Guide :: 5.10.4 The Character Definition Arrays

Previous / Next / Up / Table of Contents

5.10.4. The Character Definition Arrays

$8.99

to_lower[] and to_upper[] are simple arrays that hold the lowercase and uppercase characters corresponding to each member of the character set. For example:

to_lower['A'] should contain 'a'
to_upper['a'] should contain 'A'

sort_order[] is a map indicating how characters should be ordered for comparison and sorting purposes. Quite often (but not for all character sets) this is the same as to_upper[], which means that sorting is case-insensitive. MySQL sorts characters based on the values of sort_order[] elements. For more complicated sorting rules, see the discussion of string collating in Section 5.10.5, “String Collating Support”.

ctype[] is an array of bit values, with one element for one character. (Note that to_lower[], to_upper[], and sort_order[] are indexed by character value, but ctype[] is indexed by character value + 1. This is an old legacy convention for handling EOF.)

You can find the following bitmask definitions in m_ctype.h:

#define _U      01      /* Uppercase */
#define _L      02      /* Lowercase */
#define _N      04      /* Numeral (digit) */
#define _S      010     /* Spacing character */
#define _P      020     /* Punctuation */
#define _C      040     /* Control character */
#define _B      0100    /* Blank */
#define _X      0200    /* heXadecimal digit */

The ctype[] entry for each character should be the union of the applicable bitmask values that describe the character. For example, 'A' is an uppercase character (_U) as well as a hexadecimal digit (_X), so ctype['A'+1] should contain the value:

_U + _X = 01 + 0200 = 0201

Previous / Next / Up / Table of Contents