Indices are database objects that can greatly increase database performance, enabling faster
execution of statements involving comparative criteria. An index tracks the data on one or more columns in a table, allowing
conditional clauses (such as the WHERE clause) to find their targeted rows more
efficiently.
The internal workings of indices vary, and there are several implementations to choose from. This section describes the
different types of indices available, and explains when you should use one type over the other.
While indices exist to enhance performance, they also contribute to system overhead. Indices must be updated as data
in the column that they are applied to fluctuates. Maintaining infrequently used indices decreases performance when the amount of time spent maintaining them outweighs the time saved through using them. In general, indices
should be applied only to columns that you expect to use frequently in comparative expressions.
Indices are created with the CREATE INDEX SQL command. Here is the syntax for
CREATE INDEX:
CREATE [ UNIQUE ] INDEX indexname ON table
[ USING indextype ] ( column [ opclass ] [, ...] )
In this syntax, indexname is the name of the new index to be created, table
is the name of the table to be indexed, and column is the name of a specific column to be indexed.
Optionally, the indextype parameter may be set to specify index implementation, and
the opclass parameter may be set to indicate what operator class should be used in sorting input
values.
Warning |
Operator classes are stored in PostgreSQL's pg_opclass column. Unless you are
especially knowledgeable of the technical inner workings of PostgreSQL's operator classes, you shouldn't use this option. |
Regarding the column to index, multiple names may be specified, separated by commas; doing so
creates an index across both specified columns. Such an index will only be used by PostgreSQL when
executing SQL statements that search all indexed columns in the WHERE clause through the
AND keyword. Multicolumn indices are limited to a maximum of 16 columns in a default
PostgreSQL installation, and may not use an index type other than B-tree.
In determining upon which columns to create an index, consider which columns will be most frequently accessed for search
conditions. For example, while the books table keeps an index on its
id column (the primary key), the title column is also
frequently checked for in WHERE conditions. Adding a secondary index to the
title column will greatly increase the performance of SQL statements making comparisons
against values in that column.
Example 7-1 shows how to create such an index and uses the \d
psql slash command to view the books table. In addition to showing
the table's column types, this command also shows the indices that have been created on it.
Example 7-1. Creating an index
booktown=# CREATE INDEX books_title_idx
booktown-# ON books (title);
CREATE
booktown=# \d books
Table "books"
Attribute | Type | Modifier
------------+---------+----------
id | integer | not null
title | text | not null
author_id | integer |
subject_id | integer |
Indices: books_id_pkey,
books_title_idx
Certain types of table constraints, notably the PRIMARY KEY and
UNIQUE constraints, result in the implicit creation of an index for use in enforcing the
constraint. In Example 7-2 you see the creation of the Book Town
authors table with the PRIMARY KEY constraint placed on
its id column. This use of PRIMARY KEY causes an index
called authors_ pkey to be implicitly created.
Example 7-2. Implicit index creation
booktown=# CREATE TABLE authors (id integer PRIMARY KEY,
booktown(# last_name text,
booktown(# first_name text);
NOTICE: CREATE TABLE/PRIMARY KEY will create implicit index 'authors_pkey' for
table 'authors'
CREATE
example=# \d authors
Table "authors"
Attribute | Type | Modifier
------------+---------+----------
id | integer | not null
last_name | text |
first_name | text |
Index: authors_pkey
The index created in Example 7-2 enables PostgreSQL to quickly verify that the primary key
is unique for any new row inserted into the table, as well as enhances the performance of queries which use the
id column as a search condition.
Specifying the UNIQUE keyword causes the index to disallow duplicate values within
the column (or set of columns) it indexes. Creating a UNIQUE index on a table is functionally identical to
creating a table with the UNIQUE constraint (see the Section called Using Constraints" later in
this chapter).
Example 7-3 creates a unique index called
unique_ publisher_idx on the publishers table's
name column. This will disallow two publishers from having the same name in this
table.
Example 7-3. Creating a unique index
booktown=# CREATE UNIQUE INDEX unique_publisher_idx
booktown-# ON publishers (name);
CREATE
booktown=# \d publishers
Table "publishers"
Attribute | Type | Modifier
-----------+---------+----------
id | integer | not null
name | text |
address | text |
Indices: publishers_pkey,
unique_publisher_idx
As the NULL value does not technically match any value, duplicate instances of
NULL can be inserted into a column with a unique index placed on it. This is the main
practical difference between a unique index and an index implicitly created by the
PRIMARY KEY constraint, which does not allow NULL
values at all.
Warning |
Note that the UNIQUE clause may not be used in conjunction with the
USING clause for any index type other than B-tree. |
The optional USING clause can be used to specify the type of index to
implement. PostgreSQL 7.1.x supports three types of indices including:
PostgreSQL's B-tree implementation uses Lehman-Yao high-concurrency B-tree algorithms and is both the most
capable, and most commonly used indexing method. For this reason, it is the default index type.
The R-tree implementation is primarily useful for spacial data type operations (i.e., operations on geometric
types) and utilizes Guttman's quadratic split algorithm. The Hash implementation utilizes Litwin's linear
hashing routines, which have been traditionally used for indices that involve frequent direct equal-to comparisons (e.g.,
with the = operator).
At the time of this writing, for PostgreSQL 7.1.x, the B-tree index implementation is by far the most capable and
flexible of the available index types. At this time, it is recommended that you use the B-tree index type over the Hash
implementation, even for direct = comparisons. The Hash index exists mostly for legacy
reasons, though it may still be used if you are sure your system would benefit from a Hash index over a B-tree
index.
As stated, the R-tree index implementation is recommended for indexing geometric types; be aware, however, that
several limitations exist. For example, you cannot create a unique R-tree index, nor can you
create an R-tree index on more than one column. In these cases, it makes more sense to rely on the capable B-tree index
type.
The USING clause can be used with the keywords
BTREE, RTREE, and HASH
in order to explicitly choose the type of index you want to create. Specifying BTREE
explicitly chooses the default index type.
Example 7-4 creates a table called polygons, which
stores spacial data of the type polygon. An index named
spacial_idx is then applied to its shape column with
the R-tree implementation.
Example 7-4. Specifying an index type
booktown=# CREATE TABLE polygons (shape polygon);
CREATE
booktown=# CREATE INDEX spacial_idx ON polygons USING RTREE (shape);
CREATE
Warning |
Again, unless you have a solid conceptual understanding of why one index type might be preferable in your system
over another, we advise you to use the default B-tree type. |
A slightly modified form of the CREATE INDEX command can be used to index
the results of a function on a column value, rather than the column value itself. This is called a
functional index.
Use the following syntax to create a functional index:
CREATE [ UNIQUE ] INDEX indexname ON table
[ USING indextype ] ( functionname ( column [, ...] ) [ opclass ] )
The only difference in this syntax is that the index is created on the results of the specified function
applied to each column value. All other clauses have the same effect as the functional index.
Functional indices are useful on table columns that commonly have their values prepared through a function before
being compared against values in a SQL statement. For example, the upper() function is
commonly used to make case-insensitive comparisons. Creating an index using the upper()
function enables such case-insensitive comparisons to be carried out efficiently.
Example 7-5 creates a functional index named
upper_title_idx on the books table. It uses the
upper() function on the title column as the basis to
create the index. Then it performs an example SQL query that is more efficiently executed as a result of the newly created
functional index.
Example 7-5. Creating a functional index
booktown=# CREATE INDEX upper_title_idx ON books
booktown-# (upper(title));
CREATE
booktown=# SELECT title FROM books WHERE upper(title) = 'DUNE';
title
-------
Dune
(1 row)