Database maintenance is a broad subject. This section covers the physical maintenance of the system (pertaining to its
disk usage), analytical maintenance (to increase performance), and database object documentation (to add to the
maintainability and clarity of the schema).
The primary tool for physical and analytical database maintenance in PostgreSQL is the
VACUUM SQL command, and its accompanying command-line script, vacuumdb.
They each perform the same two general functions:
Remove any leftover data from rollbacks and other processes that can leave temporary data
Analyze activity in the database to assist PostgreSQL in designing efficient queries
It is good practice to perform a VACUUM nightly on a production database. While it can
be run at the same time data is accessed, doing so will decrease the response time of the server. As such, it is generally preferable to
schedule it at a time when you do not expect a great deal of database activity.
Any time an exceptionally large number of records are added or deleted, it is prudent to perform a
VACUUM to analyze the database, which automatically updates the PostgreSQL query optimizer
of major changes to the tables. By doing this you allow PostgreSQL to have a more up-to-date profile of the data
within the database, providing a better set of information with which to plan the most efficient queries. All of these
actions should result in a faster, more efficient response from the database.
Warning
The VACUUM command locks tables in access exclusive mode. This
means that any query involving a table being vacuumed will pause and wait until the vacuum of the affected table is
complete before continuing.
Used without any of the optional keywords or identifiers, a VACUUM statement will
clean up each table in the presently connected database, one at a time, deleting temporary data and recovering disk space.
This use of VACUUM is primarily to maximize free disk space.
An optional table identifier may be specified if you want the
VACUUM to clean a single table in the connected database, rather than all tables.
It will also update statistics in the system catalogs pertaining to the number of records and amount of data stored in each table.
Example 9-14 shows the use of a VACUUM statement in the
booktown database, on the books table.
Example 9-14. Using VACUUM on a table
booktown=# VACUUM books;
VACUUM
The VACUUM message returned in Example 9-14 indicates that
the process finished successfully. If a specified table cannot be found, you will instead receive the following
notice:
NOTICE: Vacuum: table not found
With the use of the optional ANALYZE keyword, PostgreSQL examines the allocation of
data in each column for each table (or the specified table, if provided), and uses the information to prepare the
query optimizer for better planning. With the use of the ANALYZE keyword, you also have the
option to analyze only specified columns. Example 9-15 shows the use of the
VACUUM ANALYZE command on the entire booktown database.
Example 9-15. Using VACUUM ANALYZE on a database
booktown=# VACUUM ANALYZE;
VACUUM
Finally, the optional VERBOSE keyword may be applied if you are interested in seeing
a detailed internal report of the findings of the VACUUM statement. This is most likely
not of interest to anyone not actively developing the PostgreSQL engine, or related software.
As with many of the database management SQL commands, the VACUUM command has a
command-line executable wrapper called vacuumdb. The vacuumdb script provides one
significant added function to the normal use of the VACUUM SQL statement, in that you can
instruct it to perform a VACUUM on each PostgreSQL database on your system.
Additionally, since it accepts connection parameters on how to connect to PostgreSQL, you may use
vacuumdb remotely (i.e., without having to first connect to the machine via a terminal client, and
then executing vacuumdb or psql from the remote machine). This is provided that
your authentication scheme in PostgreSQL's pg_hba.conf file is configured for outside access (see
Chapter 8 for more information on this).
Here is the syntax for vacuumdb:
vacuumdb [ options ] [ dbname ]
Like the createdb and dropdb scripts, vacuumdb accepts
both single-dashed and GNU-style double-dashed arguments from the command line. The only required option is the
dbname (unless you specify - -all), which describes the database to be cleaned and
analyzed. The options parameters describe which mode the VACUUM
command should be invoked in. The following are the available options for the vacuumdb script:
-h HOSTNAME , - -host=HOSTNAME
Specifies that you are connected to HOSTNAME, rather than the localhost. Use this option when vacuuming a remote database.
-p PORT , - -port=PORT
Specifies that the database connection is made on port PORT, rather than the
default port (usually 5432, though it may have been configured differently when PostgreSQL was
compiled, by the - -with-pgport flag).
-U USERNAME , - -username=USERNAME
Specifies that the username USERNAME is the user who connects to PostgreSQL
(rather than the name of the system user executing vacuumdb).
-W, - -password
Accepts no parameters, and causes a password prompt, which occurs automatically if the
pg_hba.conf file on the target server is configured not to trust the requesting host.
-d DBNAME , - -dbname=DBNAME
Explicitly specifies the name of the database to perform the VACUUM
statement on. This option is mutually exclusive to the - -all option.
-a, - -all
Applies the VACUUM command, with specified options, to all databases
in the system catalog.
-z, - -analyze
Equivalent to the ANALYZE keyword for the
VACUUM SQL command. Updates stored statistics about the data allocation between
columns, which are used by the query optimizer to help guide internal query planning.
Targets a specific table TABLE (or specific columns within that table) to be affected. The
- -analyze option is required to describe specific columns.
-v, - -verbose
Equivalent to the VERBOSE keyword for the VACUUM
SQL command. Causes a detailed internal report of the processing performed to be displayed.
-e, - -echo
Accepts no parameters, and causes the query sent to PostgreSQL to be displayed to the screen as it is
executed by vacuumdb.
-q, - -quiet
Accepts no parameters, and causes no output to be sent to stdout (though any errors will
still be sent to stderr).
Example 9-16 demonstrates the use of the vacuumdb script from
the Book Town server. The -U flag specifies that the connection should use the
manager user to connect, while the - -all flag causes all databases in the system
catalog to be affected in sequence.
As mentioned, because of the connectivity options available to the vacuumdb script, it can be
easily executed from a remote server. Example 9-17 shows a similar process to the
command used in Example 9-16, but with the addition of the -h flag to
specify a remote server named booktown.commandprompt.com. Example 9-17 also targets the booktown database specifically, rather than
all databases.
PostgreSQL offers a non-standard SQL command called COMMENT, which allows for
documentation of any database object. By using COMMENT on a table, function, operator, or
other database object, you can provide description that is stored in the pg_description system
table. Descriptions can be easily retrieved through a set of extended psql slash commands.
Most standard objects in the database have a default description, which can be perused (along with any user-added
descriptions) with the \dd slash command within psql.
Here is the syntax for COMMENT:
COMMENT ON [ [ DATABASE | INDEX | RULE | SEQUENCE | TABLE | TYPE | VIEW ]
{ object_name |
COLUMN table_name.column_name |
AGGREGATE aggregate_name aggregate_type |
FUNCTION function_name ( argument_type [, ...] ) |
OPERATOR operator_name ( leftoperand_type , rightoperand_type ) |
TRIGGER trigger_name ON table_name }
] IS ' description '
In this syntax, object_name is the name of the database object that you wish to add a comment of
description to. The keywords for the major database objects are optional, but if you intend to place a
comment on a column, function, aggregate function, operator, or trigger, you must specify the preceding keyword so that
PostgreSQL knows what kind of syntax to expect, and where to look for the object name specified.
Note: Note that any comment added to a database is tied to both the database
in which it is added, and the user who added it. You can only see those comments that you create.
The description string constant following the IS keyword is the
literal comment to be placed in the database. Example 9-18 demonstrates placing a simple
description on the id column of the booktown database.
Example 9-18. Commenting the books table
booktown=# COMMENT ON COLUMN books.id
booktown-# IS 'An Internal Book Town Identifier';
COMMENT
The COMMENT server message returned in Example 9-18
indicates that the comment was successfully placed on the column.
You may retrieve comments from the database easily by using the psql slash-plus commands.
These are as follows:
\d+
Displays the same information as the standard \d command (displaying all tables,
views, sequences, and indices in the currently connected database), but adds a column for the comments as well.
\l+
Displays comments on all databases.
\df+ [ pattern ]
Displays descriptions for each function in the currently connected database (as well as the language and source
of the function). You may wish to view this slash command in expanded mode for readability by first initiating
the \x slash command (see Chapter 6 for more about this).
You can optionally supply a regular expression pattern to compare against existing function
names, allowing you to limit the number of functions displayed.
\dt+
Displays comments on all tables in the currently connected database.
\di+
Displays comments on all indices in the currently connected database.
\ds+
Displays comments on all sequences in the currently connected database.
\dv+
Displays comments on all views in the currently connected database.
\dS+
Displays comments on system tables. Note that comments placed on system tables are still tied to a particular database,
even though they are accessible from all databases, and will therefore not be visible when performing a
\dS+ slash command from another database connection.
\dd
Displays all descriptions for all database objects.
Example 9-19 demonstrates the retrieval of the comment that was placed on the
books table in Example 9-18 with the use of the
psql\d+ command.
Example 9-19. Retrieving a comment
booktown=# \d+ books
Table "books"
Attribute | Type | Modifier | Description
------------+---------+----------+----------------------------------
id | integer | not null | An Internal Book Town Identifier
title | text | not null |
author_id | integer | |
subject_id | integer | |
Index: books_id_pkey
The COMMENT SQL command provides a very simple way of internally
documenting your objects, from tables to functions. This can be of great help when working with large or complicated
database schema. Even the best of naming conventions do not always result in database objects whose applications are self-evident;
this is especially the case when working with multiple developers.