The original relational model specifies that the values represented by columns within a table be an atomic piece of
data, object-relational database systems such as PostgreSQL allow non-atomic values to be used through data
structures called
arrays
.
An array is a collection of data values referenced through a single identifier. The array may be a collection
of values of a built-in data type or a user-defined data type, but every value in the array must be of the same type.
Arrays can be accessed from a table through subscript notation via square brackets (e.g.,
my_array[0]). You can also use an array constant via curly braces within single quotes (e.g.,
'{value_one,value_two,value_three}').
When defining an array, the syntax allows for the array to be defined either as fixed-length or variable-length;
however as of PostgreSQL 7.1.2, the fixed-length size restriction is not enforced. This means that you may treat the
array as having a fixed number of elements at all times, but it can still be dynamically sized.
For example, it is perfectly acceptable for a single column defined as an array to contain three
values in one record, four values in another, and no values in a third.
Additionally, arrays may be defined as being
multi-dimensional
, meaning that each element of the
array may actually represent
another array
, rather than an atomic value. Values that are selected
from a multi-dimensional array will consist of nested curly braces in order to show an array within an array, as follows:
booktown=#
SELECT editions FROM my_notes WHERE title='The Cat in the Hat';
editions
---------------------------------------------------------------
{{"039480001X","1st Ed, Hard Cover"},{"0394900014","1st Ed"}}
(1 row)
In order to actually insert array values into a table column, you need a way to refer to several values
as an array in a SQL statement. The formal syntax of an array constant is a grouping of values, separated by delimiters
(commas, for built-in data types), enclosed by curly braces ({}), which are in turn
enclosed by single quotes, as follows:
'{
value1
,
value2
[, ...] }'
The
values
in this syntax can be any valid PostgreSQL data type. As the entire array is
constrained by single quotes, the use of single quotes
within
an array value must be escaped, just
as they must be within a string constant. The use of commas to delimit the values, however, poses an interesting problem
pertaining to the use of character strings which contain commas themselves, as the commas will be interpreted as delimiters
if not within single-quotes. However, as just mentioned, the singles quotes constrain the
array
, not
the array's
values
.
PostgreSQL's method of handling this is to use
double-quotes
to quote string constants where
single-quotes would ordinarily be used outside of an array context, as follows:
'{"value1" , "value 2, which contains a comma" }'
It's vital to remember that arrays
require
the single quotes surrounding the curly braces in
order to be interpreted correctly by PostgreSQL. You can think of array constants as being akin to a special type of
string constant, which is interpreted as an array based on where it is used (e.g., when used to add records to a target
column which is of an array data type). This is because unless used in an array context, a constant of the this format
will be interpreted by PostgreSQL as a normal string constant (as it is bound by single quotes) which just happens to
include curly braces.