5.10.4. The Character Definition Arrays
to_lower[] and to_upper[] are simple arrays that hold the
lowercase and uppercase characters corresponding to each member
of the character set. For example:
to_lower['A'] should contain 'a'
to_upper['a'] should contain 'A'
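As a minimal sketch of how such tables could be filled, assuming an 8-bit character set whose only letters are the ASCII letters: the names demo_to_lower, demo_to_upper, demo_init_case_maps, and demo_check_case_maps are hypothetical and do not come from the MySQL source, where these arrays are normally written out as static tables.

```c
#include <assert.h>

/* Hypothetical tables for an 8-bit character set (illustrative names). */
static unsigned char demo_to_lower[256];
static unsigned char demo_to_upper[256];

/* Fill both maps: identity for every value, then override ASCII letters.
   Assumes 'A'..'Z' and 'a'..'z' are contiguous, as they are in ASCII. */
static void demo_init_case_maps(void)
{
    int c;

    for (c = 0; c < 256; c++) {
        demo_to_lower[c] = (unsigned char) c;
        demo_to_upper[c] = (unsigned char) c;
    }
    for (c = 'A'; c <= 'Z'; c++) {
        int lower = c - 'A' + 'a';

        demo_to_lower[c] = (unsigned char) lower;
        demo_to_upper[lower] = (unsigned char) c;
    }
}

/* Check the two entries used as examples in the text. */
static void demo_check_case_maps(void)
{
    assert(demo_to_lower['A'] == 'a');
    assert(demo_to_upper['a'] == 'A');
}
```

Call demo_init_case_maps() once before any lookup. Characters without a case counterpart, such as digits, map to themselves in both arrays.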
sort_order[] is a map indicating how characters should be
ordered for comparison and sorting purposes. Quite often (but
not for all character sets) this is the same as to_upper[],
which means that sorting is case-insensitive. MySQL sorts
characters based on the values of sort_order[] elements. For
more complicated sorting rules, see the discussion of string
collating in Section 5.10.5, “String Collating Support”.
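The effect of a sort_order[] that equals to_upper[] can be sketched as follows. This is an illustrative comparison only, not the routine MySQL itself uses; demo_sort_order, demo_init_sort_order, and demo_compare are hypothetical names, and the table assumes an ASCII-compatible 8-bit character set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical weight table: one sort weight per character value. */
static unsigned char demo_sort_order[256];

/* Give each lowercase ASCII letter the weight of its uppercase form,
   which mirrors the case where sort_order[] is the same as to_upper[]. */
static void demo_init_sort_order(void)
{
    int c;

    for (c = 0; c < 256; c++)
        demo_sort_order[c] = (unsigned char) c;
    for (c = 'a'; c <= 'z'; c++)
        demo_sort_order[c] = (unsigned char) (c - 'a' + 'A');
}

/* Compare two byte strings by sort weight rather than by raw value.
   Returns a negative value, zero, or a positive value. */
static int demo_compare(const unsigned char *a, size_t alen,
                        const unsigned char *b, size_t blen)
{
    size_t n = alen < blen ? alen : blen;
    size_t i;

    assert(a != NULL && b != NULL);
    for (i = 0; i < n; i++) {
        int wa = demo_sort_order[a[i]];
        int wb = demo_sort_order[b[i]];

        if (wa != wb)
            return wa - wb;
    }
    /* Equal up to the shorter length: the shorter string sorts first. */
    return (alen > blen) - (alen < blen);
}
```

With this table, "Apple" and "aPPLE" compare as equal because each pair of characters has the same weight, which is what case-insensitive sorting means here.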
ctype[] is an array of bit values, with one element per
character. (Note that to_lower[], to_upper[], and sort_order[]
are indexed by character value, but ctype[] is indexed by
character value + 1. This is a legacy convention for handling
EOF.)
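The + 1 offset can be illustrated with a small lookup. The sketch assumes that EOF has the value -1, so that it lands on element 0, and that the array therefore needs 257 elements for an 8-bit character set; both are assumptions of this example, and DEMO_EOF, DEMO_N, demo_ctype, demo_init_digits, and demo_is_digit are hypothetical names.

```c
#include <assert.h>

#define DEMO_EOF (-1)   /* assumed EOF value for this sketch */
#define DEMO_N   04     /* same value as the digit flag in the manual */

/* 257 elements: slot 0 is for DEMO_EOF, slots 1..256 for values 0..255. */
static unsigned char demo_ctype[257];

/* Mark the ASCII digits; every other slot, including slot 0, stays 0. */
static void demo_init_digits(void)
{
    int c;

    for (c = '0'; c <= '9'; c++)
        demo_ctype[c + 1] |= DEMO_N;
}

/* Look up a character value or DEMO_EOF without a special case. */
static int demo_is_digit(int c)
{
    assert(c >= DEMO_EOF && c <= 255);
    return (demo_ctype[c + 1] & DEMO_N) != 0;
}
```

Because of the offset, demo_is_digit(DEMO_EOF) reads element 0 and reports that EOF is not a digit, with no separate bounds test in the lookup path.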
You can find the following bitmask definitions in m_ctype.h:
#define _U 01 /* Uppercase */
#define _L 02 /* Lowercase */
#define _N 04 /* Numeral (digit) */
#define _S 010 /* Spacing character */
#define _P 020 /* Punctuation */
#define _C 040 /* Control character */
#define _B 0100 /* Blank */
#define _X 0200 /* heXadecimal digit */
The ctype[] entry for each character should be the union
(bitwise OR) of the applicable bitmask values that describe the
character. For example, 'A' is an uppercase character (_U) as
well as a hexadecimal digit (_X), so ctype['A'+1] should contain
the value:
_U + _X = 01 + 0200 = 0201
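The same calculation can be sketched in C. The flag values below copy those listed above, but under hypothetical DEMO_ names, because identifiers such as _U are reserved for the implementation in ordinary C programs. The table covers only ASCII letters and digits, and treating the decimal digits as hexadecimal digits too is an assumption of this sketch.

```c
#include <assert.h>

#define DEMO_U 01     /* Uppercase */
#define DEMO_L 02     /* Lowercase */
#define DEMO_N 04     /* Numeral (digit) */
#define DEMO_X 0200   /* heXadecimal digit */

/* Indexed by character value + 1, as described for ctype[]. */
static unsigned char demo_ctype[257];

/* OR together every flag that applies to each character. */
static void demo_init_ctype(void)
{
    int c;

    for (c = 'A'; c <= 'Z'; c++) demo_ctype[c + 1] |= DEMO_U;
    for (c = 'a'; c <= 'z'; c++) demo_ctype[c + 1] |= DEMO_L;
    for (c = '0'; c <= '9'; c++) demo_ctype[c + 1] |= DEMO_N | DEMO_X;
    for (c = 'A'; c <= 'F'; c++) demo_ctype[c + 1] |= DEMO_X;
    for (c = 'a'; c <= 'f'; c++) demo_ctype[c + 1] |= DEMO_X;
}

/* Return nonzero if all flags in mask are set for character c. */
static int demo_has_flags(int c, int mask)
{
    assert(c >= -1 && c <= 255);
    return (demo_ctype[c + 1] & mask) == mask;
}
```

After demo_init_ctype(), demo_ctype['A' + 1] holds 0201, matching the value computed above, while demo_ctype['G' + 1] holds only 01 because 'G' is not a hexadecimal digit.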