As documented in Chapter 3, PostgreSQL supports non-atomic values in individual table columns
through data constructs called arrays. An array itself is not a data type, but an extension of any
PostgreSQL data type.
A simple array column is created by appending a pair of square brackets to the data type of the intended array
column within a CREATE TABLE or ALTER TABLE statement.
These brackets indicate that more than a single value of the described data type may be inserted without limit into that
column. For example:
single_array type[] -- A single array of values.
Additional square brackets may be added to create multidimensional arrays, which
may store an array of array values. For example:
multi_array type[][] -- A multidimensional array of values.
In theory, an integer value n could be supplied within the square brackets to produce a
fixed-length array (one which always has n members, and no more). As of
PostgreSQL 7.1.x, however, this restriction is not enforced, and there is no practical difference between an array
created with a fixed length and one created without.
Example 7-16 creates a table named
favorite_books. This table associates an integer value of an employee with a
one-dimensional character string array of type text called
books.
Example 7-16. Creating a table with an array column
The table created by Example 7-16 allows any number of book titles to be stored
in a single array column, for each employee. The advantage of such an array of discrete text values over a single text
string (which also, of course, could contain multiple titles) is that each title is kept physically separate from each
other title in the array column. Since the system knows where each array value begins and ends, you can choose titles by
their subscript, rather than having to manually parse them out of a long text string.
Creating a multidimensional array column is very similar. The only distinction is that another pair of square
brackets follows the first pair, as shown earlier in this section. Example 7-17 creates a table called
favorite_authors, with an employee_id column of type
integer, and multidimensional text array of
author_and_titles. This essentially creates an array of text arrays.
Example 7-17. Creating a table with a multidimensional array column
A special kind of syntax is used in order to insert multiple values into a single column. This syntax allows you to
describe an array constant. As documented in Chapter 3, the syntax of an array constant (for referring to PostgreSQL
array values in SQL statements) is a special arrangement of curly braces, double-quotes and commas, all bound by single-quotes.
Double quotes are required only when working with an array of character strings. Therefore, the general forms of array constants are:
'{ "text1" [, ...] }' -- A character string array.
'{ numeric [, ...] }' -- A numeric array.
These syntax forms illustrate how to handle string and numeric arrays, but a column may be defined as an array of any
arbitrary type (including boolean, date, and
time types). Generally, if you would ordinarily use single-quotes to describe a value in a
non-array context (such as with a string constant, or timestamp value), double-quotes should be used for that value in an
array constant.
Example 7-18 inserts a pair of records into the
favorite_books table. The first statement inserts a single favorite book for the employee
with id 102, and the second statement inserts two titles for the employee with id 103.
Example 7-18 executes two SQL INSERT statements, which
insert a pair of array constant values.
Example 7-18. Inserting array constants
booktown=# INSERT INTO favorite_books VALUES
booktown-# (102, '{"The Hitchhiker\'s Guide to the Galaxy"}');
INSERT 3628399 1
booktown=# INSERT INTO favorite_books VALUES
booktown-# (103, '{"The Hobbit", "Kitten, Squared"}');
INSERT 3628400 1
Notice that, in Example 7-18, curly braces are still required to insert a single value
into an array. Notice also that the single-quote in the title (first INSERT statement) still requires a backslash preceding it, even though it is
surrounded by double-quotes. This is because the array constant itself is parsed as if it were one
long string constant, and subsequently interpreted as an array based on the context of its target column.
The insertion of values into a multidimensional array requires a pair of curly braces for each array; an array of
arrays must therefore itself be bound in curly braces, while each of its member arrays should be separated by one another
with commas. Example 7-19 inserts a single row containing a multidimensional
array constant into the favorite_authors table, created in Example 7-17.
Example 7-19. Inserting values into multidimensional arrays
Notice that the inserted multidimensional array in Example 7-19
contains three text arrays, which each have two members. There is no systematic relationship between these arrays, though
the implied relationship from the context is that the first members of each array are authors corresponding to the second
members of each array, which are the favorite titles from the associated author.
Selecting an array column from a table will result in the entire array being returned in the same constant format
described in the previous section. Example 7-20 retrieves the entire arrays for inserted
rows in the books column of the favorite_books
table.
Example 7-20. Selecting entire array values
booktown=# SELECT books FROM favorite_books;
books
------------------------------------------
{"The Hitchhiker's Guide to the Galaxy"}
{"The Hobbit","Kitten, Squared"}
(2 rows)
While it is helpful to be able to return the entire array, the ability to retrieve only a specific portion of an
array is often more useful. To this end, you need to learn how to work with array subscripts and slices.
The usefulness of arrays lies largely in the fact that you can use subscripts to specify the
value that you wish to view. A subscript is an integer value surrounded by square brackets, that describes the value you
want to select. This number describes the precedence of the value you wish to select, from left to right in the
array.
Unlike arrays in programming languages such as C, PostgreSQL begins counting array elements at 1, not
0. Example 7-21 uses the [1]
subscript on the books column of the favorite_books table
to select only the first of an employee's favorite titles. Notice that the query returns values without braces or double-quotes.
This is because a single text value need only be returned as a single text constant, not as an array.
Example 7-21. Selecting array values with subscripts
booktown=# SELECT books[1] FROM favorite_books;
books
--------------------------------------
The Hitchhiker's Guide to the Galaxy
The Hobbit
(2 rows)
Specifying a subscript for an array element which holds no value results in a NULL
value being selected. The IS NOT NULL keywords may be useful in conjunction with such
selections. Example 7-22 demonstrates two queries; the first returns two rows,
NULL value as well as a title. The second query only returns the title (and not the row
with the NULL value as a result of its use of the WHERE
clause, with the IS NOT NULL condition.
Example 7-22. Avoiding NULL values in arrays
booktown=# SELECT books[2] FROM favorite_books;
books
-----------------
Kitten, Squared
(2 rows)
booktown=# SELECT books[2] FROM favorite_books
booktown-# WHERE books[2] IS NOT NULL;
books
-----------------
Kitten, Squared
(1 row)
Selecting from a multidimensional array requires an additional subscript following the initial subscript. The first
subscript refers to which array that you are retrieving data from, while the second subscript refers to which member of
the specified array is to be retrieved. Example 7-23 demonstrates selecting the
first author, and associated title for that author, from the favorite_authors table created
in Example 7-19.
Example 7-23. Selecting From a Multi-Dimensional Array
booktown=# SELECT authors_and_titles[1][1] AS author,
booktown-# authors_and_titles[1][2] AS title
booktown-# FROM favorite_authors;
author | title
----------------+------------------
J.R.R. Tolkien | The Silmarillion
(1 row)
PostgreSQL also supports slices in array selection. These are similar to array subscripts, but
describe a range of values to be returned. The syntax of a slice is a pair of integers, separated by a
colon (:), surrounded by square brackets. For example,
[2:5] specifies the second, third, fourth, and fifth array values of a given
array. The result of a slice selection is returned as an array constant that is essentially a sub-array of the entire
array (though a slice may extend from the beginning to the end of an array).
Example 7-24 selects the range of the first two book titles in the
books test array column from the favorite_books table.
Even though the first returned row has only one title, it is still returned as an array with one member value.
Example 7-24. Selecting array values with slices
booktown=# SELECT books[1:2] FROM favorite_books;
books
------------------------------------------
{"The Hitchhiker's Guide to the Galaxy"}
{"The Hobbit","Kitten, Squared"}
(2 rows)
Array slices can be somewhat unpredictable with multidimensional arrays as of PostgreSQL 7.1.x. It is therefore
recommended to stick to exact subscript values when working with multidimensional arrays until this support is
improved.
It can be useful to know the number of values stored in an array. You may use the
array_dims() function to accomplish this. It accepts as a parameter a single identifier,
which is the name of the array column that you wish to perform the function on. The result is returned as a character
string describing the array with the same syntax used in array slices. Example 7-25 calls the
array_dims() function on the books column of the
favorite_books table.
Values in array columns may be modified in one of three ways:
Complete modification
The entire array may be replaced with a new array constant.
Slice modification
A slice of an array (range between two values) may be replaced with a new array constant. The new array
constant should have the same number of values within it as the splice to be updated.
Element modification
An individual value in the array may be replaced with a new constant of the base type of the array. You
use a subscript to specify which array value to replace.
Replacing an array value with a new array puts no restriction on the number of values within the new array. There
need not be the same number of values in the new array as in the existing one.
For instance, suppose that the employee with id 102 wishes to add another favorite book to his
list in the favorite_books table. This is achieved with an
UPDATE statement in Example 7-26 which completely
overwrites the previous value.
Example 7-26. Completely modifying an array
booktown=# UPDATE favorite_books
booktown-# SET books='{"The Hitchhiker\'s Guide to the Galaxy",
booktown'# "The Restaurant at the End of the Universe"}'
booktown-# WHERE employee_id = 102;
UPDATE 1
The same approach used in Example 7-26 can be used to set a slice of an array by
attaching a slice descriptor to the end of the target identifier (e.g., books[1:3] would
refer to the first, second and third values in the books array column). More commonly,
though, situations arise where a single value within an array needs to be modified, instead of the entire array, or a slice
of an array.
Updating a single value in an array is done by attaching a subscript to the target identifier to indicate the
specific value to be modified. Example 7-27 updates the first array value of the
books column, in the favorite_books table.
Example 7-27. Modifying an array subscript
booktown=# SELECT books[1] FROM favorite_books;
books
------------------------------------------
The Hitchhiker's Guide to the Galaxy
The Hobbit
(2 rows)
booktown=# UPDATE favorite_books
booktown-# SET books[1] = 'There and Back Again: A Hobbit\'s Holiday'
booktown-# WHERE books[1] = 'The Hobbit';
UPDATE 1
booktown=# SELECT books[1] FROM favorite_books;
books
------------------------------------------
The Hitchhiker's Guide to the Galaxy
There and Back Again: A Hobbit's Holiday
(2 rows)