MySQL has many operators and functions that return a string.
This section answers the question: What is the character set and
collation of such a string?
For simple functions that take string input and return a string
result as output, the output's character set and collation are
the same as those of the principal input value. For example,
UPPER(X
)
returns a
string whose character string and collation are the same as that
of X
. The same applies for
INSTR()
, LCASE()
,
LOWER()
, LTRIM()
,
MID()
, REPEAT()
,
REPLACE()
, REVERSE()
,
RIGHT()
, RPAD()
,
RTRIM()
, SOUNDEX()
,
SUBSTRING()
, TRIM()
,
UCASE()
, and UPPER()
.
Note: The REPLACE()
function, unlike all
other functions, always ignores the collation of the string
input and performs a case-sensitive comparison.
If a string input or function result is a binary string, the
string has no character set or collation. This can be check by
using the CHARSET()
and
COLLATION()
functions, both of which return
binary
to indicate that their argument is a
binary string:
mysql> SELECT CHARSET(BINARY 'a'), COLLATION(BINARY 'a');
+---------------------+-----------------------+
| CHARSET(BINARY 'a') | COLLATION(BINARY 'a') |
+---------------------+-----------------------+
| binary | binary |
+---------------------+-----------------------+
For operations that combine multiple string inputs and return a
single string output, the “aggregation rules” of
standard SQL apply for determining the collation of the result:
If an explicit COLLATE
X
occurs, use
X
.
If explicit COLLATE
X
and COLLATE
Y
occur, raise an
error.
Otherwise, if all collations are
X
, use
X
.
Otherwise, the result has no collation.
For example, with CASE ... WHEN a THEN b WHEN b THEN c
COLLATE X
END
, the
resulting collation is X
. The same
applies for CASE
, UNION
,
||
, CONCAT()
,
ELT()
, GREATEST()
,
IF()
, and LEAST()
.
For operations that convert to character data, the character set
and collation of the strings that result from the operations are
defined by the character_set_connection
and
collation_connection
system variables. This
applies to CAST()
, CHAR()
,
CONV()
, FORMAT()
,
HEX()
, and SPACE()
.