An
aggregate function
is a special kind of function that operates on
several rows
of a query at once, returning a single result. Such functions are generally only used in queries which make
use of the GROUP BY clause to associate rows together by like criteria, though they may
be used in queries which only contain aggregate functions in their target list. When performing the latter, the
aggregate function operates on
all
selected rows from the result set.
Table 5-16 provides an overview of PostgreSQL's supported aggregate functions. To see a
complete list of aggregate functions, you may use the \da command within
psql
.
Table 5-16. Aggregate functions
Function
|
Description
|
avg(
expression
)
|
Returns the average of the
expression
values from all rows in a group.
|
count(
expression
)
|
Returns the number of values, per each aggregated group of rows, for which
expression
is not NULL
|
max(
expression
)
|
Returns the maximum value of
expression
in the grouped rows
|
min(
expression
)
|
Returns the minimum value of
expression
in the grouped rows
|
stddev(
expression
)
|
Returns the standard deviation of the values of
expression
in the grouped rows
|
sum(
expression
)
|
Returns the sum of the values of
expression
in the grouped rows
|
variance(
expression
)
|
Returns the variance of the values of
expression
in the grouped rows
|
The following sections describe each aggregate function in further detail, including specific information on
usage, examples, and valid input data types. In each of the functional explanations, the term
expression
refers to any valid identifier in a result set, or any valid expression operating on such
an identifier.
When calling an aggregate function,
aggregate expressions
are employed to describe an
expression from the result set created by the SELECT statement. An aggregate expression
is similar to an ordinary SQL expression, but may be preceded by either the ALL or the
DISTINCT keyword.
The use of the DISTINCT keyword in an aggregate expression causes only grouped
rows with unique values (as described by the expression) to be evaluated by the function. Any duplicate rows will be
suppressed. Similar to the use of the ALL keyword in a
SELECT statement, the use of ALL in an aggregate
expression has no function other than to make more explicit the request for all grouped rows to be evaluated to the
function. Example 5-19 demonstrates each of the aggregate expression forms.
Example 5-19. Using aggregate expressions
booktown=#
SELECT count(location) AS set_locations,
booktown-#
count(ALL location) AS all_set_locations,
booktown-#
count(DISTINCT location) AS unique_locations,
booktown-#
count(*) AS all_rows
booktown-#
FROM subjects;
set_locations | all_set_locations | unique_locations | all_rows
---------------+-------------------+------------------+----------
15 | 15 | 7 | 16
(1 row)
There is one final form of aggregate expression, as demonstrated by the all_rows
result column in Example 5-19. When the asterisk (*)
symbol is supplied as the aggregate expression, it instructs the aggregate function to evaluate
all rows
,
including rows with values of NULL, which are ordinarily ignored. Since
the subjects table contains one row with a NULL value
in the location column, the counted rows for location
differ from those counted for *.
Warning
|
Rows whose evaluated aggregate expression contain NULL values will not be evaluated
by an aggregate function (with the exception of the count() function).
|
avg(
expression
)
The avg() function accepts an expression describing aggregated values that are
either of any numeric type (numeric, bigint,
smallint, real, or double precision),
or of the interval time type.
The average, or mean, of the values described by
expression
in the grouped rows is returned.
The resultant value is returned as a value of type numeric for expressions of type
integer and double precision for expressions of type
real. All other expression types cause a value of the same data type to be returned.
Examples
booktown=#
SELECT avg(cost) AS average_cost,
booktown-#
avg(retail) AS average_price,
booktown-#
avg(retail - cost) AS average_profit
booktown-#
FROM stock;
average_cost | average_price | average_profit
---------------+---------------+----------------
24.8235294118 | 30.0088235294 | 5.1852941176
(1 row)
booktown=#
SELECT avg(cost) AS average_cost, p.name AS publisher
booktown-#
FROM (stock JOIN editions USING (isbn))
booktown-#
JOIN publishers AS p (publisher_id)
booktown-#
USING (publisher_id)
booktown-#
GROUP BY p.name;
average_cost | publisher
---------------+-----------------------------
26.5000000000 | Ace Books
19.0000000000 | Books of Wonder
26.5000000000 | Doubleday
25.0000000000 | HarperCollins
18.0000000000 | Henry Holt & Company, Inc.
23.0000000000 | Kids Can Press
23.0000000000 | Mojo Press
20.0000000000 | Penguin
23.0000000000 | Random House
26.5000000000 | Roc
26.0000000000 | Watson-Guptill Publications
(11 rows)
count(
expression
)
The count() function returns the number of values in a set of aggregated rows
where the
expression
is not NULL. The
count() is not restricted as to the data type described by
expression
. It is important to understand that the count() function
only counts values which are not NULL. As a result, it is important to use an
expression
whose value will not be returned NULL in order for the
expression
to be meaningful to the counted results.
You may pass the asterisk (*) character to
count() in order to simply count all rows in an aggregation (including rows with
NULL values).
Examples
booktown=#
SELECT count(*) FROM editions;
count
-------
17
(1 row)
booktown=#
SELECT count(isbn), p.name
booktown-#
FROM editions JOIN publishers AS p (publisher_id)
booktown-#
USING (publisher_id)
booktown-#
GROUP BY p.name
booktown-#
ORDER BY count DESC;
count | name
-------+-----------------------------
3 | Random House
2 | Ace Books
2 | Doubleday
2 | Roc
1 | Books of Wonder
1 | HarperCollins
1 | Henry Holt & Company, Inc.
1 | Kids Can Press
1 | Mojo Press
1 | O'Reilly & Associates
1 | Penguin
1 | Watson-Guptill Publications
(12 rows)
max(
expression
)
The max() function returns the maximum found value described by
expression
in a set of aggregated rows. It accepts an
expression
that
may represent any numeric, string, date, or time data type. The maximum is returned as a value of the same
data type as the
expression
.
Examples
booktown=#
SELECT max(cost), max(retail) FROM stock;
max | max
-------+-------
36.00 | 46.95
(1 row)
booktown=#
SELECT max(retail), p.name
booktown-#
FROM (stock NATURAL JOIN editions)
booktown-#
JOIN publishers AS p (publisher_id)
booktown-#
USING (publisher_id)
booktown-#
GROUP BY p.name
booktown-#
ORDER BY max DESC;
max | name
-------+-----------------------------
46.95 | Roc
45.95 | Ace Books
36.95 | Doubleday
32.95 | Random House
28.95 | HarperCollins
28.95 | Watson-Guptill Publications
24.95 | Mojo Press
24.95 | Penguin
23.95 | Henry Holt & Company, Inc.
23.95 | Kids Can Press
21.95 | Books of Wonder
(11 rows)
min(
expression
)
The min() function returns the minimum found value described by
expression
in a set of aggregated rows. It accepts an
expression
which
may represent any numeric, string, date, or time data type. The minimum is returned as a value of the same
data type as the
expression
.
Examples
booktown=#
SELECT min(cost), min(retail) FROM stock;
min | min
-------+-------
16.00 | 16.95
(1 row)
booktown=#
SELECT min(retail), p.name
booktown-#
FROM (stock NATURAL JOIN editions)
booktown-#
JOIN publishers AS p (publisher_id)
booktown-#
USING (publisher_id)
booktown-#
GROUP BY p.name
booktown-#
ORDER BY min ASC;
min | name
-------+-----------------------------
16.95 | Random House
21.95 | Ace Books
21.95 | Books of Wonder
22.95 | Roc
23.95 | Henry Holt & Company, Inc.
23.95 | Kids Can Press
24.95 | Mojo Press
24.95 | Penguin
28.95 | Doubleday
28.95 | HarperCollins
28.95 | Watson-Guptill Publications
(11 rows)
stddev(
expression
)
The stddev() function accepts an expression describing values of any numeric type
(numeric, bigint,
smallint, real, or double precision),
and returns the standard deviation of the values within the aggregated rows. The resultant value is returned as double precision
for an expression describing floating point values, and numeric for all other types.
Examples
booktown=#
SELECT stddev(retail) FROM stock;
stddev
--------
8.46
(1 row)
booktown=#
SELECT stddev(retail), p.name
booktown-#
FROM (stock NATURAL JOIN editions)
booktown-#
JOIN publishers AS p ON (publisher_id = p.id)
booktown-#
GROUP BY p.name
booktown-#
ORDER BY stddev DESC
booktown-#
LIMIT 4;
stddev | name
--------+--------------
16.97 | Ace Books
16.97 | Roc
8.02 | Random House
5.66 | Doubleday
(4 rows)
sum(
expression
)
The sum() function accepts an expression describing values of any numeric type
(numeric, bigint,
smallint, real, or double precision),
and returns the sum of the values within the aggregated rows. The returned value is of the type numeric
when operating on values of type integer and double precision when
operating on values of type real. The result is returned as the same data type as the values
described by
expression
for all other data types.
Examples
booktown=#
SELECT sum(stock) FROM stock
;
sum
-----
508
(1 row)
booktown=#
SELECT sum(stock), s.subject
booktown-#
FROM ((stock NATURAL JOIN editions)
booktown(#
JOIN books ON (books.id = book_id))
booktown-#
JOIN subjects AS s
booktown-#
ON (books.subject_id = s.id)
booktown-#
GROUP BY s.subject
booktown-#
ORDER BY sum DESC;
sum | subject
-----+------------------
189 | Horror
166 | Science Fiction
91 | Children's Books
28 | Drama
18 | Classics
16 | Arts
(6 rows)
variance(
expression
)
The variance() function accepts an expression describing values of any numeric type
(numeric, bigint,
smallint, real, or double precision)
and returns the variance of the values within the aggregated rows. The variance is equivalent to the stddev() squared.
The resultant value is returned as double precision for an expression describing
floating-point values, and numeric for all other types.
Examples
booktown=#
SELECT variance(retail) FROM stock;
variance
----------
71.60
(1 row)
booktown=#
SELECT variance(retail), p.name
booktown-#
FROM (stock NATURAL JOIN editions)
booktown-#
JOIN publishers AS p
booktown-#
ON (editions.publisher_id = p.id)
booktown-#
GROUP BY p.name
booktown-#
ORDER BY variance DESC
booktown-#
LIMIT 4;
variance | name
----------+-----------------------------
288.00 | Ace Books
288.00 | Roc
64.33 | Random House
32.00 | Doubleday
(4 rows)