While joins are used in SQL to combine column values into a single row, the UNION,
INTERSECT and EXCEPT clauses exist to merge or omit row
data by comparing column values, returning a new result set based on this comparison. Each of these keywords may be used
at the end of a valid SQL query and followed by a second query, in order to compare the resultant data sets, and then either
merge or omit rows based on that comparison.
When comparing data sets in this manner, it is required that they each have the same number of columns, as well as the same column type.
Note that they do not need to have the same name, or be queried from the same table or data source.
-
UNION
-
A pair of queries merged with the UNION keyword will combine all
non-distinct rows into a single data set. Like rows will not be duplicated.
-
INTERSECT
-
A pair of queries merged with the INTERSECT keyword will cause
any rows not found in both data sets to be omitted. As such, the only rows returned are those that overlap
between the two query result sets.
-
EXCEPT
-
A pair of queries merged with the EXCEPT keyword will cause any rows
found in both data sets to be omitted from the returned data set. As such, only rows found in the query to the
left
of the EXCEPT clause that are
not
found in the query to the right of the clause will be returned.
Example 4-46, Example 4-47, and Example 4-48 each demonstrate
these keywords by combining and omitting rows from comparative data sets. Example 4-46 creates a result
set by combining several authors' last names with book titles via the UNION
keyword.
Example 4-47 demonstrates the selection of ISBN numbers from the books
table, limited to rows which intersect with the query on the shipments table for books which
have records of more than two shipments. Finally, Example 4-48 demonstrates the removal of any rows from the
first query which are matched completely in the second.
Example 4-46. Using UNION
booktown=#
SELECT title FROM books
booktown-#
UNION
booktown-#
SELECT last_name FROM authors
booktown-#
LIMIT 11;
title
-----------------------------
2001: A Space Odyssey
Alcott
Bartholomew and the Oobleck
Bianco
Bourgeois
Brautigan
Brite
Brown
Christiansen
Clarke
Denham
(11 rows)
Example 4-47. Using INTERSECT
booktown=#
SELECT isbn FROM editions
booktown-#
INTERSECT
booktown-#
SELECT isbn FROM shipments
booktown-#
GROUP BY isbn
booktown-#
HAVING count(id) > 2;
isbn
------------
039480001X
0394800753
0451160916
0590445065
0694003611
(5 rows)
Example 4-48. Using EXCEPT
booktown=#
SELECT last_name, first_name
booktown-#
FROM authors
booktown-#
EXCEPT
booktown-#
SELECT last_name, first_name
booktown-#
FROM authors AS a (author_id)
booktown-#
NATURAL INNER JOIN books
booktown-#
ORDER BY first_name ASC;
last_name | first_name
-----------+------------
Denham | Ariel
Gorey | Edward
Brite | Poppy Z.
Brautigan | Richard
(4 rows)
In Example 4-48, only rows that do not match the second query are returned. Notice that the
effective result of this is that only authors who do not have a book in the books table
are returned. This is due to the INNER JOIN clause, which causes the second query to
omit any authors whose author_id is not found in the
books table.
While the use of these keywords in a single SQL query precludes the ability to use the
LIMIT clause, this limitation can be circumvented by PostgreSQL's support for
sub-queries. By grouping in parentheses each of the queries involved between a UNION,
EXCEPT, or EXCEPT clause, the returned result sets
from the sub-queries are compared, as demonstrated in Example 4-49.
Example 4-49. Comparing sub-query result sets
booktown=#
(SELECT title FROM books ORDER BY title DESC LIMIT 7)
booktown-#
EXCEPT
booktown-#
(SELECT title FROM books ORDER BY title ASC LIMIT 11)
booktown-#
ORDER BY title DESC;
title
----------------------
The Velveteen Rabbit
The Tell-Tale Heart
The Shining
The Cat in the Hat
(4 rows)
Notice that the query used in Example 4-49 creates a set from the books
table that is constrained to the last seven rows and sorted alphabetically by title.
The EXCEPT clause then removes from that data set the first eleven rows, sorted
alphabetically in an ascending fashion. The result consists of the
last four rows from the table, sorted from the bottom by the final ORDER BY clause on
the new exception set.