5.10.1. The Character Set Used for Data and Sorting
By default, MySQL uses the latin1 (cp1252 West European) character
set and the latin1_swedish_ci collation that sorts according to
Swedish/Finnish rules. These defaults are suitable for the United
States and most of Western Europe.
All MySQL binary distributions are compiled with
--with-extra-charsets=complex. This adds code to all standard
programs that enables them to handle latin1 and all multi-byte
character sets within the binary. Other character sets are loaded
from a character-set definition file when needed.
The character set determines what characters are allowed in
identifiers. The collation determines how strings are sorted by
the ORDER BY and GROUP BY clauses of the SELECT statement.
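As a minimal sketch of how a collation affects sorting, the following statements sort the same values first under the column's own collation and then under a collation named explicitly with a COLLATE clause. The table people and its contents are hypothetical and exist only for illustration. The example assumes a connection character set that can represent the ü character.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE people (
  name VARCHAR(50) CHARACTER SET latin1 COLLATE latin1_swedish_ci
);

INSERT INTO people (name)
VALUES ('Muffler'), ('Müller'), ('MX Systems'), ('MySQL');

-- Sorted according to the column collation (latin1_swedish_ci).
SELECT name FROM people ORDER BY name;

-- Sorted according to a different latin1 collation for this query only.
SELECT name FROM people ORDER BY name COLLATE latin1_german2_ci;
```

The two SELECT statements can return the rows in different orders, because each collation applies its own rules to characters such as ü.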
You can change the default server character set and collation
with the --character-set-server and --collation-server options
when you start the server. The collation must be a legal collation
for the default character set. (Use the SHOW COLLATION statement
to determine which collations are available for each character
set.) See Section 5.2.1, “mysqld Command Options”.
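The following statements are one way to check which collations can be paired with a given character set and which defaults a running server is using. The latin1 pattern is only an example; substitute the character set you are interested in.

```sql
-- List the collations that belong to the latin1 character set.
SHOW COLLATION LIKE 'latin1%';

-- Show the server defaults that --character-set-server and
-- --collation-server control.
SHOW VARIABLES LIKE 'character_set_server';
SHOW VARIABLES LIKE 'collation_server';
```

A collation name reported by the first statement for a character set should be acceptable as the --collation-server value when that character set is given with --character-set-server.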
The character sets available depend on the
--with-charset=charset_name and
--with-extra-charsets=list-of-charsets | complex | all | none
options to configure, and the character set configuration files
listed in SHAREDIR/charsets/Index.
See Section 2.8.2, “Typical configure Options”.
If you change the character set when running MySQL, that may
also change the sort order. Consequently, you must run
myisamchk -r -q --set-collation=collation_name on all MyISAM
tables, or your indexes may not be ordered correctly.
When a client connects to a MySQL server, the server indicates
to the client what the server's default character set is. The
client switches to this character set for this connection.
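To see which character sets are in effect for the current connection, a client can inspect the character set system variables. This is only an inspection sketch; the values shown depend on the server configuration and on the client.

```sql
-- Character sets used for statements sent by the client, for the
-- connection, and for results returned to the client, together
-- with the server default.
SHOW VARIABLES LIKE 'character_set%';

-- The corresponding collations.
SHOW VARIABLES LIKE 'collation%';
```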
You should use mysql_real_escape_string() when escaping strings
for an SQL query. mysql_real_escape_string() is identical to the
old mysql_escape_string() function, except that it takes the MYSQL
connection handle as the first parameter so that the appropriate
character set can be taken into account when escaping characters.
If the client is compiled with paths that differ from where the
server is installed, and the user who configured MySQL did not
include all character sets in the MySQL binary, you must tell
the client where it can find the additional character sets it
needs when the server runs with a character set different from
the client's.
You can do this by specifying a --character-sets-dir option to
indicate the path to the directory in which the dynamic MySQL
character sets are stored. For example, you can put the following
in an option file:
[client]
character-sets-dir=/usr/local/mysql/share/mysql/charsets
You can force the client to use a specific character set as
follows:
[client]
default-character-set=charset_name
This is normally unnecessary, however.