An important consideration to the use of the
pg_dump
,
pg_dumpall
, and
pg_restore
commands is
when
to use them, and when not to. Fortunately, in respect
to each of these procedures, PostgreSQL is quite accommodating.
With regards to backing up data with either
pg_dump
or
pg_dumpall
, there
are few considerations necessary for when they may be performed. PostgreSQL has supported
hot backup
procedures since Version 6.5—these allow you to request data without blocking the normal activity of other concurrent
users. It is called a hot backup because it is performed while the system is running, uninterrupted.
Therefore, the only potential considerations for backup PostgreSQL center around performance. An
exceptionally large database may take a while to dump all of its contents. The use of large objects may also be a factor
if you intend to back up large object data with
pg_dump
's
-b
flag (thus, adding
to the amount of data needing to be output).
If you have a large database that takes a substantial amount of time to complete a dump, it is recommended that you
schedule the
pg_dump
execution for a time when the database is not heavily used. Even though a dump
does not block users from requesting and completing transactions, it can still slow down the general performance of such
a system during heavy, or even medium usage.
With respect to restoration there are several more considerations to be taken into account than when merely backing
up data. Specifically, these apply to how "deep" a restoration must go; restoring just the data is a very different
operation from totally recreating the database from scratch, and the restrictions involved scale with the depth of the
operation.
The least restrictive kind of restoration is one which restores data
only
. This can be
executed while users are actively connected to the database. It may even be executed while connected users are in the
middle of transactions. This is possible through PostgreSQL's multiversion control. Such a restore can be performed on
the fly, without having to restart the database system. Once modifications are synchronized with the database, the
changes are immediately available to connected users.
A restoration involving dropping and recreating database schema (e.g., tables, views, etc.) may also be performed
while the system is running. This method is not as seamless as a data-only restoration, because database objects will
briefly be removed from the system, which may cause temporary problems to applications relying on certain objects to
exist. The exact nature of such a restriction is dependent on the nature of the application accessing the
database.
The most restrictive kind of restoration is one which involves dropping the database itself. If such a restoration
is scheduled, it
must
be done at a time when no other user is connected to that database. The
DROP DATABASE command will fail if any user is actively connected at the time it is
executed.
It may in fact be necessary to shut down and restart PostgreSQL with TCP/IP connections disabled if a highly-used
database is intended to be dropped and recreated from scratch; this will prevent any external machine from connecting to
the database server until the work is completed.