|
MySQL supports the following JOIN syntaxes
for the table_references part of
SELECT statements and multiple-table
DELETE and UPDATE
statements:
table_references:
table_reference [, table_reference ] ...
table_reference :
table_factor
| join_table
table_factor :
tbl_name [[AS] alias ]
[{USE|IGNORE|FORCE} INDEX (key_list )]
| ( table_references )
| { OJ table_reference LEFT OUTER JOIN table_reference
ON conditional_expr }
join_table :
table_reference [INNER | CROSS] JOIN table_factor [join_condition ]
| table_reference STRAIGHT_JOIN table_factor
| table_reference STRAIGHT_JOIN table_factor ON condition
| table_reference LEFT [OUTER] JOIN table_reference join_condition
| table_reference NATURAL [LEFT [OUTER]] JOIN table_factor
| table_reference RIGHT [OUTER] JOIN table_reference join_condition
| table_reference NATURAL [RIGHT [OUTER]] JOIN table_factor
join_condition :
ON conditional_expr
| USING (column_list )
A table reference is also known as a join expression.
The syntax of table_factor is
extended in comparison with the SQL Standard. The latter
accepts only table_reference , not a
list of them inside a pair of parentheses.
This is a conservative extension if we consider each comma in
a list of table_reference items as
equivalent to an inner join. For example:
SELECT * FROM t1 LEFT JOIN (t2, t3, t4)
ON (t2.a=t1.a AND t3.b=t1.b AND t4.c=t1.c)
is equivalent to:
SELECT * FROM t1 LEFT JOIN (t2 CROSS JOIN t3 CROSS JOIN t4)
ON (t2.a=t1.a AND t3.b=t1.b AND t4.c=t1.c)
In MySQL, CROSS JOIN is a syntactic
equivalent to INNER JOIN (they can replace
each other. In standard SQL, they are not equivalent.
INNER JOIN is used with an
ON clause, CROSS JOIN is
used otherwise.
In general, parentheses can be ignored in join expressions
containing only inner join operations. MySQL also supports
nested joins (see Section 7.2.10, “Nested Join Optimization”).
You should generally not have any conditions in the
ON part that are used to restrict which
rows you want in the result set, but rather specify these
conditions in the WHERE clause. There are
exceptions to this rule.
The { OJ ... LEFT OUTER JOIN ...} syntax
shown in the preceding list exists only for compatibility with
ODBC. The curly braces in the syntax should be written
literally; they are not metasyntax as used elsewhere in syntax
descriptions.
-
A table reference can be aliased using
tbl_name AS
alias_name or
tbl_name alias_name :
SELECT t1.name, t2.salary FROM employee AS t1, info AS t2
WHERE t1.name = t2.name;
SELECT t1.name, t2.salary FROM employee t1, info t2
WHERE t1.name = t2.name;
The ON conditional is any conditional
expression of the form that can be used in a
WHERE clause.
-
If there is no matching row for the right table in the
ON or USING part in
a LEFT JOIN , a row with all columns set
to NULL is used for the right table.
You can use this fact to find rows in a table that have no
counterpart in another table:
SELECT table1.* FROM table1
LEFT JOIN table2 ON table1.id=table2.id
WHERE table2.id IS NULL;
This example finds all rows in table1
with an id value that is not present in
table2 (that is, all rows in
table1 with no corresponding row in
table2 ). This assumes that
table2.id is declared NOT
NULL . See
Section 7.2.9, “LEFT JOIN and RIGHT JOIN Optimization”.
-
The
USING(column_list )
clause names a list of columns that must exist in both
tables. If tables a and
b both contain columns
c1 , c2 , and
c3 , the following join compares
corresponding columns from the two tables:
a LEFT JOIN b USING (c1,c2,c3)
The NATURAL [LEFT] JOIN of two tables
is defined to be semantically equivalent to an
INNER JOIN or a LEFT
JOIN with a USING clause that
names all columns that exist in both tables.
INNER JOIN and ,
(comma) are semantically equivalent in the absence of a
join condition: both produce a Cartesian product between
the specified tables (that is, each and every row in the
first table is joined to each and every row in the second
table).
RIGHT JOIN works analogously to
LEFT JOIN . To keep code portable across
databases, it is recommended that you use LEFT
JOIN instead of RIGHT JOIN .
STRAIGHT_JOIN is identical to
JOIN , except that the left table is
always read before the right table. This can be used for
those (few) cases for which the join optimizer puts the
tables in the wrong order.
You can provide hints as to which index MySQL should use when
retrieving information from a table. By specifying
USE INDEX
(key_list ) , you can tell
MySQL to use only one of the possible indexes to find rows in
the table. The alternative syntax IGNORE INDEX
(key_list ) can be used to
tell MySQL to not use some particular index. These hints are
useful if EXPLAIN shows that MySQL is using
the wrong index from the list of possible indexes.
You can also use FORCE INDEX , which acts
like USE INDEX
(key_list ) but with the
addition that a table scan is assumed to be
very expensive. In other words, a table
scan is used only if there is no way to use one of the given
indexes to find rows in the table.
USE INDEX , IGNORE INDEX ,
and FORCE INDEX affect only which indexes
are used when MySQL decides how to find rows in the table and
how to do the join. They do not affect whether an index is
used when resolving an ORDER BY or
GROUP BY .
USE KEY , IGNORE KEY , and
FORCE KEY are synonyms for USE
INDEX , IGNORE INDEX , and
FORCE INDEX .
Some join examples:
SELECT * FROM table1,table2 WHERE table1.id=table2.id;
SELECT * FROM table1 LEFT JOIN table2 ON table1.id=table2.id;
SELECT * FROM table1 LEFT JOIN table2 USING (id);
SELECT * FROM table1 LEFT JOIN table2 ON table1.id=table2.id
LEFT JOIN table3 ON table2.id=table3.id;
SELECT * FROM table1 USE INDEX (key1,key2)
WHERE key1=1 AND key2=2 AND key3=3;
SELECT * FROM table1 IGNORE INDEX (key3)
WHERE key1=1 AND key2=2 AND key3=3;
Note: Natural joins and joins
with USING , including outer join variants,
are processed according to the SQL:2003 standard. These
changes make MySQL more compliant with standard SQL. However,
they can result in different output columns for some joins.
Also, some queries that appeared to work correctly in older
versions (prior to 5.0.12) must be rewritten to comply with
the standard. The following list provides more detail about
several effects of current join processing versus join
processing in older versions. The term
“previously” means “prior to MySQL
5.0.12.”
-
The columns of a NATURAL join or a
USING join may be different from
previously. Specifically, redundant output columns no
longer appear, and the order of columns for
SELECT * expansion may be different
from before.
Consider this set of statements:
CREATE TABLE t1 (i INT, j INT);
CREATE TABLE t2 (k INT, j INT);
INSERT INTO t1 VALUES(1,1);
INSERT INTO t2 VALUES(1,1);
SELECT * FROM t1 NATURAL JOIN t2;
SELECT * FROM t1 JOIN t2 USING (j);
Previously, the statements produced this output:
+------+------+------+------+
| i | j | k | j |
+------+------+------+------+
| 1 | 1 | 1 | 1 |
+------+------+------+------+
+------+------+------+------+
| i | j | k | j |
+------+------+------+------+
| 1 | 1 | 1 | 1 |
+------+------+------+------+
In the first SELECT statement, column
i appears in both tables and thus
becomes a join column, so, according to standard SQL, it
should appear only once in the output, not twice.
Similarly, in the second SELECT statement, column
j is named in the
USING clause and should appear only
once in the output, not twice. But in both cases, the
redundant column is not eliminated. Also, the order of the
columns is not correct according to standard SQL.
Now the statements produce this output:
+------+------+------+
| j | i | k |
+------+------+------+
| 1 | 1 | 1 |
+------+------+------+
+------+------+------+
| j | i | k |
+------+------+------+
| 1 | 1 | 1 |
+------+------+------+
The redundant column is eliminated. Also, the column order
is correct according to standard SQL:
First, columns common to both tables, in the order in
which they occur in the first table
Second, columns unique to the first table, in order in
which they occur in that table
Third, columns unique to the second table, in order in
which they occur in that table
-
The evaluation of multi-way natural joins differs in a way
that can require query rewriting. Suppose that you have
three tables t1(a,b) ,
t2(c,b) , and t3(a,c)
that each have one row: t1(1,2) ,
t2(10,2) , and
t3(7,10) . Suppose also that you have
this NATURAL JOIN on the three tables:
SELECT ... FROM t1 NATURAL JOIN t2 NATURAL JOIN t3;
Previously, the left operand of the second join was
considered to be t2 , whereas it should
be the nested join (t1 NATURAL JOIN
t2) . As a result, the columns of
t3 are checked for common columns only
in t2 , and, if t3
has common columns with t1 , these
columns are not used as equi-join columns. Thus,
previously, the preceding query was transformed to the
following equi-join:
SELECT ... FROM t1, t2, t3
WHERE t1.b = t2.b AND t2.c = t3.c;
That join is missing one more equi-join predicate
(t1.a = t3.a) . As a result, it produces
one row, not the empty result that it should. The correct
equivalent query is this:
SELECT ... FROM t1, t2, t3
WHERE t1.b = t2.b AND t2.c = t3.c AND t1.a = t3.a;
If you require the same query result in current versions
of MySQL as in older versions, rewrite the natural join as
the first equi-join.
-
Previously, the comma operator (, ) and
JOIN both had the same precedence, so
the join expression t1, t2 JOIN t3 was
interpreted as ((t1, t2) JOIN t3) . Now
JOIN has higher precedence, so the
expression is interpreted as (t1, (t2 JOIN
t3)) . This change affects statements that use an
ON clause, because that clause can
refer only to columns in the operands of the join, and the
change in precedence changes interpretation of what those
operands are.
Example:
CREATE TABLE t1 (i1 INT, j1 INT);
CREATE TABLE t2 (i2 INT, j2 INT);
CREATE TABLE t3 (i3 INT, j3 INT);
INSERT INTO t1 VALUES(1,1);
INSERT INTO t2 VALUES(1,1);
INSERT INTO t3 VALUES(1,1);
SELECT * FROM t1, t2 JOIN t3 ON (t1.i1 = t3.i3);
Previously, the SELECT was legal due to
the implicit grouping of t1,t2 as
(t1,t2) . Now the
JOIN takes precedence, so the operands
for the ON clause are
t2 and t3 . Because
t1.i1 is not a column in either of the
operands, the result is an Unknown column 't1.i1'
in 'on clause' error. To allow the join to be
processed, group the first two tables explicitly with
parentheses so that the operands for the
ON clause are
(t1,t2) and t3 :
SELECT * FROM (t1, t2) JOIN t3 ON (t1.i1 = t3.i3);
Alternatively, avoid the use of the comma operator and use
JOIN instead:
SELECT * FROM t1 JOIN t2 JOIN t3 ON (t1.i1 = t3.i3);
This change also applies to INNER JOIN ,
CROSS JOIN , LEFT
JOIN , and RIGHT JOIN , all of
which now have higher precedence than the comma operator.
-
Previously, the ON clause could refer
to columns in tables named to its right. Now an
ON clause can refer only to its
operands.
Example:
CREATE TABLE t1 (i1 INT);
CREATE TABLE t2 (i2 INT);
CREATE TABLE t3 (i3 INT);
SELECT * FROM t1 JOIN t2 ON (i1 = i3) JOIN t3;
Previously, the SELECT statement was
legal. Now the statement fails with an Unknown
column 'i3' in 'on clause' error because
i3 is a column in
t3 , which is not an operand of the
ON clause. The statement should be
rewritten as follows:
SELECT * FROM t1 JOIN t2 JOIN t3 ON (i1 = i3);
-
Previously, a USING clause could be
rewritten as an ON clause that compares
corresponding columns. For example, the following two
clauses are semantically identical:
a LEFT JOIN b USING (c1,c2,c3)
a LEFT JOIN b ON a.c1=b.c1 AND a.c2=b.c2 AND a.c3=b.c3
Now the two clauses no longer are quite the same:
With respect to determining which rows satisfy the
join condition, both joins remain semantically
identical.
-
With respect to determining which columns to display
for SELECT * expansion, the two
joins are not semantically identical. The
USING join selects the coalesced
value of corresponding columns, whereas the
ON join selects all columns from
all tables. For the preceding USING
join, SELECT * selects these
values:
COALESCE(a.c1,b.c1), COALESCE(a.c2,b.c2), COALESCE(a.c3,b.c3)
For the ON join, SELECT
* selects these values:
a.c1, a.c2, a.c3, b.c1, b.c2, b.c3
With an inner join,
COALESCE(a.c1,b.c1) is the same as
either a.c1 or
b.c1 because both columns will have
the same value. With an outer join (such as
LEFT JOIN ), one of the two columns
can be NULL . That column will be
omitted from the result.
|
|