We all know that backups must be scheduled periodically. A
full backups (a snapshot of the data at a point in time) can
be done in MySQL with several tools. For example,
InnoDB Hot Backup
provides online
non-blocking physical backup of the InnoDB
data files, and mysqldump provides online
logical backup. This discussion uses
mysqldump.
Assume that we make a backup on Sunday at 1 p.m., when load is
low. The following command makes a full backup of all our
InnoDB
tables in all databases:
shell> mysqldump --single-transaction --all-databases > backup_sunday_1_PM.sql
This is an online, non-blocking backup that does not disturb
the reads and writes on the tables. We assumed earlier that
our tables are InnoDB
tables, so
--single-transaction
uses a consistent read
and guarantees that data seen by mysqldump
does not change. (Changes made by other clients to
InnoDB
tables are not seen by the
mysqldump process.) If we do also have
other types of tables, we must assume that they are not
changed during the backup. For example, for the
MyISAM
tables in the
mysql
database, we must assume that no
administrative changes are being made to MySQL accounts during
the backup.
The resulting .sql
file produced by
mysqldump contains a set of SQL
INSERT
statements that can be used to
reload the dumped tables at a later time.
Full backups are necessary, but they are not always
convenient. They produce large backup files and take time to
generate. They are not optimal in the sense that each
successive full backup includes all data, even that part that
has not changed since the previous full backup. After we have
made the initial full backup, it is more efficient to make
incremental backups. They are smaller and take less time to
produce. The tradeoff is that, at recovery time, you cannot
restore your data just by reloading the full backup. You must
also process the incremental backups to recover the
incremental changes.
To make incremental backups, we need to save the incremental
changes. The MySQL server should always be started with the
--log-bin
option so that it stores these
changes in a file while it updates data. This option enables
binary logging, so that the server writes each SQL statement
that updates data into a file called a MySQL binary log.
Looking at the data directory of a MySQL server that was
started with the --log-bin
option and that
has been running for some days, we find these MySQL binary log
files:
-rw-rw---- 1 guilhem guilhem 1277324 Nov 10 23:59 gbichot2-bin.000001
-rw-rw---- 1 guilhem guilhem 4 Nov 10 23:59 gbichot2-bin.000002
-rw-rw---- 1 guilhem guilhem 79 Nov 11 11:06 gbichot2-bin.000003
-rw-rw---- 1 guilhem guilhem 508 Nov 11 11:08 gbichot2-bin.000004
-rw-rw---- 1 guilhem guilhem 220047446 Nov 12 16:47 gbichot2-bin.000005
-rw-rw---- 1 guilhem guilhem 998412 Nov 14 10:08 gbichot2-bin.000006
-rw-rw---- 1 guilhem guilhem 361 Nov 14 10:07 gbichot2-bin.index
Each time it restarts, the MySQL server creates a new binary
log file using the next number in the sequence. While the
server is running, you can also tell it to close the current
binary log file and begin a new one manually by issuing a
FLUSH LOGS
SQL statement or with a
mysqladmin flush-logs command.
mysqldump also has an option to flush the
logs. The .index
file in the data directory
contains the list of all MySQL binary logs in the directory.
This file is used for replication.
The MySQL binary logs are important for recovery because they
form the set of incremental backups. If you make sure to flush
the logs when you make your full backup, then any binary log
files created afterward contain all the data changes made
since the backup. Let's modify the previous
mysqldump command a bit so that it flushes
the MySQL binary logs at the moment of the full backup, and so
that the dump file contains the name of the new current binary
log:
shell> mysqldump --single-transaction --flush-logs --master-data=2 \
--all-databases > backup_sunday_1_PM.sql
After executing this command, the data directory contains a
new binary log file, gbichot2-bin.000007
.
The resulting .sql
file includes these
lines:
-- Position to start replication or point-in-time recovery from
-- CHANGE MASTER TO MASTER_LOG_FILE='gbichot2-bin.000007',MASTER_LOG_POS=4;
Because the mysqldump command made a full
backup, those lines mean two things:
The .sql
file contains all changes
made before any changes written to the
gbichot2-bin.000007
binary log file
or newer.
All data changes logged after the backup are not present
in the .sql
, but are present in the
gbichot2-bin.000007
binary log file
or newer.
On Monday at 1 p.m., we can create an incremental backup by
flushing the logs to begin a new binary log file. For example,
executing a mysqladmin flush-logs command
creates gbichot2-bin.000008
. All changes
between the Sunday 1 p.m. full backup and Monday 1 p.m. will
be in the gbichot2-bin.000007
file. This
incremental backup is important, so it is a good idea to copy
it to a safe place. (For example, back it up on tape or DVD,
or copy it to another machine.) On Tuesday at 1 p.m., execute
another mysqladmin flush-logs command. All
changes between Monday 1 p.m. and Tuesday 1 p.m. will be in
the gbichot2-bin.000008
file (which also
should be copied somewhere safe).
The MySQL binary logs take up disk space. To free up space,
purge them from time to time. One way to do this is by
deleting the binary logs that are no longer needed, such as
when we make a full backup:
shell> mysqldump --single-transaction --flush-logs --master-data=2 \
--all-databases --delete-master-logs > backup_sunday_1_PM.sql
Note: Deleting the MySQL
binary logs with mysqldump
--delete-master-logs can be dangerous if your server
is a replication master server, because slave servers might
not yet fully have processed the contents of the binary log.
The description for the PURGE MASTER LOGS
statement explains what should be verified before deleting the
MySQL binary logs. See Section 13.6.1.1, “PURGE MASTER LOGS
Syntax”.