10.9.2. West European Character Sets
Western European character sets cover most West European
languages, such as French, Spanish, Catalan, Basque, Portuguese,
Italian, Albanian, Dutch, German, Danish, Swedish, Norwegian,
Finnish, Faroese, Icelandic, Irish, Scottish, and English.
-
ascii (US ASCII) collations:
-
cp850 (DOS West European) collations:
-
dec8 (DEC Western European) collations:
-
hp8 (HP Western European) collations:
hp8_bin
hp8_english_ci (default)
-
latin1 (cp1252 West European) collations:
latin1 is the default character set.
MySQL's latin1 is the same as the Windows
cp1252 character set. This means it is
the same as the official ISO 8859-1 or
IANA (Internet Assigned Numbers Authority)
latin1, except that IANA
latin1 treats the code points between
0x80 and 0x9f as
“undefined,” whereas cp1252,
and therefore MySQL's latin1, assign
characters for those positions. For example,
0x80 is the Euro sign. For the
“undefined” entries in
cp1252, MySQL translates
0x81 to Unicode
0x0081, 0x8d to
0x008d, 0x8f to
0x008f, 0x90 to
0x0090, and 0x9d to
0x009d.
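The byte-to-Unicode mapping described above can be sketched in Python. This is only an illustration of the rule as stated in the text, not MySQL's actual conversion code; the function name mysql_latin1_to_unicode and the constant UNDEFINED_CP1252 are hypothetical names chosen for this example. It relies on Python's standard cp1252 codec, which rejects the five undefined bytes.

```python
# Bytes that cp1252 leaves undefined. Per the text, MySQL translates each
# of them to the Unicode code point with the same numeric value.
UNDEFINED_CP1252 = (0x81, 0x8D, 0x8F, 0x90, 0x9D)


def mysql_latin1_to_unicode(data: bytes) -> str:
    """Decode bytes following the latin1 rule described in the text (sketch)."""
    chars = []
    for b in data:
        if b in UNDEFINED_CP1252:
            # Undefined in cp1252: pass through as U+0081, U+008D, and so on.
            chars.append(chr(b))
        else:
            # Everything else follows the Windows cp1252 table.
            chars.append(bytes([b]).decode("cp1252"))
    return "".join(chars)


if __name__ == "__main__":
    # 0x80 is the Euro sign in cp1252, but a control character in ISO 8859-1.
    print(repr(mysql_latin1_to_unicode(b"\x80")))
    print(repr(b"\x80".decode("latin-1")))
    print(repr(mysql_latin1_to_unicode(b"\x81\x9d")))
```

Comparing the first two printed values shows where cp1252 and ISO 8859-1 diverge in the 0x80 to 0x9f range.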
The latin1_swedish_ci collation is the
default collation for latin1 and is probably the one used by
the majority of MySQL customers. Although it is frequently said
to be based on the Swedish/Finnish collation rules, there are
Swedes and Finns who disagree with this statement.
The latin1_german1_ci and
latin1_german2_ci collations are based on
the DIN-1 and DIN-2 standards, where DIN stands for
Deutsches Institut für
Normung (the German equivalent of ANSI).
DIN-1 is called the “dictionary collation” and
DIN-2 is called the “phone book collation.”
-
latin1_german1_ci (dictionary) rules:
Ä = A
Ö = O
Ü = U
ß = s
-
latin1_german2_ci (phone-book) rules:
Ä = AE
Ö = OE
Ü = UE
ß = ss
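The practical difference between the two rule sets can be sketched as comparison keys in Python. This is a simplified illustration that covers only the four rules listed above. The names DIN1_RULES, DIN2_RULES, din1_key, and din2_key are hypothetical, and the lowercase entries are an assumption that follows from the collations being case-insensitive (_ci). The real collations also define weights for other accented characters, which this sketch ignores.

```python
# Substitution tables for the rules listed in the text. Lowercase forms are
# added on the assumption that a _ci collation treats both cases alike.
DIN1_RULES = {"Ä": "A", "ä": "a", "Ö": "O", "ö": "o",
              "Ü": "U", "ü": "u", "ß": "s"}
DIN2_RULES = {"Ä": "AE", "ä": "ae", "Ö": "OE", "ö": "oe",
              "Ü": "UE", "ü": "ue", "ß": "ss"}


def _key(text: str, rules: dict) -> str:
    # Substitute first, then fold case, so that "ß" is replaced by the rule
    # table rather than by Python's own uppercasing of "ß" to "SS".
    return "".join(rules.get(ch, ch) for ch in text).upper()


def din1_key(text: str) -> str:
    """Comparison key in the style of the dictionary rules (sketch)."""
    return _key(text, DIN1_RULES)


def din2_key(text: str) -> str:
    """Comparison key in the style of the phone-book rules (sketch)."""
    return _key(text, DIN2_RULES)


if __name__ == "__main__":
    names = ["Müller", "Mueller", "Muffler", "Mulder"]
    print(sorted(names, key=din1_key))
    print(sorted(names, key=din2_key))
    # Under the phone-book rules the two spellings compare as equal.
    print(din2_key("Müller") == din2_key("Mueller"))
```

Under the dictionary-style key, "Müller" compares like "Muller"; under the phone-book-style key, it compares like "Mueller", so the two spellings sort together.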
In the latin1_spanish_ci collation,
‘ñ’ (n-tilde) is a separate
letter between ‘n’ and
‘o’.
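The placement of ‘ñ’ can be illustrated with a small Python sort key. This is a sketch of the single rule stated above, not of the full collation; spanish_key is a hypothetical helper name, and accented vowels and other characters are left at their plain code point order here.

```python
def spanish_key(text: str) -> list:
    """Sort key that places n-tilde after 'n' and before 'o' (sketch)."""
    key = []
    for ch in text.lower():  # case-insensitive, in the spirit of _ci
        if ch == "ñ":
            # Same primary position as 'n', but ranked just after it.
            key.append((ord("n"), 1))
        else:
            key.append((ord(ch), 0))
    return key


if __name__ == "__main__":
    words = ["nube", "ñu", "oso", "nz"]
    print(sorted(words))                   # plain code point order
    print(sorted(words, key=spanish_key))  # n-tilde between n and o
```

With plain code point ordering, "ñu" sorts after "oso"; with the key above, it sorts after every word starting with "n" and before any word starting with "o".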
-
macroman (Mac West European) collations:
-
swe7 (7bit Swedish) collations: