To permit wholesale restoration of entire file systems
The first purpose is the basis for the typical file restoration
request: a user accidentally deletes a file and asks that it be restored
from the latest backup. The exact circumstances may vary somewhat, but
this is the most common day-to-day use for backups.
The second situation is a system administrator's worst nightmare:
for whatever reason, the system administrator is staring at hardware
that used to be a productive part of the data center. Now, it is little
more than a lifeless chunk of steel and silicon. What is missing is
all the software and data you and your users have assembled
over the years. Supposedly everything has been backed up. The question
is: has it?
Look at the kinds of data[1] processed and stored by a typical
computer system. Notice that some of the data hardly ever
changes, and some of the data is constantly changing.
The pace at which data changes is crucial to the design of a
backup procedure. There are two reasons for this:
A backup is nothing more than a snapshot of the data being
backed up. It is a reflection of that data at a particular moment
in time.
Data that changes infrequently can be backed up infrequently,
while data that changes often must be backed up more
frequently.
System administrators who have a good understanding of their
systems, users, and applications should be able to quickly group the
data on their systems into different categories. However, here are
some examples to get you started:
Operating System
This data normally only changes during upgrades, the
installation of bug fixes, and any site-specific
modifications.
Tip
Should you even bother with operating system backups?
This is a question that many system administrators have
pondered over the years. On the one hand, if the installation
process is relatively easy, and if the application of
bug fixes and customizations is well documented and easily
reproducible, reinstalling the operating system may be a
viable option.
On the other hand, if there is the least doubt that a
fresh installation can completely recreate the original system
environment, backing up the operating system is the best
choice, even if the backups are performed much less frequently
than the backups for production data. Occasional operating
system backups also come in handy when only a few system files
must be restored (for example, due to accidental file
deletion).
Application Software
This data changes whenever applications are installed,
upgraded, or removed.
Application Data
This data changes as frequently as the associated
applications are run. Depending on the specific application and
your organization, this could mean that changes take place
second-by-second or once at the end of each fiscal year.
User Data
This data changes according to the usage patterns of your
user community. In most organizations, this means that changes
take place all the time.
Based on these categories (and any additional ones that are
specific to your organization), you should have a good idea of
the nature of the backups that are needed to protect your
data.
Note
You should keep in mind that most backup software deals with
data on a directory or file system level. In other words, your
system's directory structure plays a part in how backups will be
performed. This is another reason why it is always a good idea to
carefully consider the best directory structure for a new system and
group files and directories according to their anticipated
usage.
In order to perform backups, it is first necessary to have the
proper software. This software must not only be able to perform the
basic task of making copies of bits onto backup media, it must also
interface cleanly with your organization's personnel and business
needs. Some of the features to consider when reviewing backup
software include:
Schedules backups to run at the proper time
Manages the location, rotation, and usage of backup
media
Works with operators (and/or robotic media changers) to ensure
that the proper media is available
Assists operators in locating the media containing a specific
backup of a given file
As you can see, a real-world backup solution entails much more
than just scribbling bits onto your backup media.
Most system administrators at this point look at one of two
solutions:
Purchase a commercially-developed solution
Create an in-house developed backup system from scratch
(possibly integrating one or more open source technologies)
Each approach has its good and bad points. Given the complexity
of the task, an in-house solution is not likely to handle some aspects
(such as media management) very well, or to come with comprehensive
documentation and technical support. However, for some organizations,
this might not be a shortcoming.
A commercially-developed solution is more likely to be highly
functional, but may also be overly complex for the organization's
present needs. That said, the complexity might make it possible to
stick with one solution even as the organization grows.
As you can see, there is no clear-cut method for deciding on a
backup system. The only guidance that can be offered is to ask you to
consider these points:
Changing backup software is difficult; once implemented, you
will be using the backup software for a long time. After all, you
will have long-term archive backups that you must be able to read.
Changing backup software means you must either keep the original
software around (to access the archive backups), or you must
convert your archive backups to be compatible with the new
software.
Depending on the backup software, the effort involved in
converting archive backups may be as straightforward (though
time-consuming) as running the backups through an already-existing
conversion program, or it may require reverse-engineering the
backup format and writing custom software to perform the
task.
The software must be 100% reliable — it must back up
what it is supposed to, when it is supposed to.
When the time comes to restore any data — whether a
single file or an entire file system — the backup software
must be 100% reliable.
If you were to ask people who were not familiar with computer
backups, most would think that a backup is just an identical copy of
all the data on the computer. In other words, if
a backup was created Tuesday evening, and nothing changed on the
computer all day Wednesday, the backup created Wednesday evening would
be identical to the one created on Tuesday.
While it is possible to configure backups in this way, it is
likely that you would not. To understand more about this, we must
first understand the different types of backups that can be created.
They are:
Full backups
Incremental backups
Differential backups
The type of backup that was discussed at the beginning of this
section is known as a full backup. A full
backup is a backup where every single file is written to the backup
media. As noted above, if the data being backed up never changes,
every full backup being created will be the same.
That similarity is due to the fact that a full backup does not
check to see if a file has changed since the last backup; it blindly
writes everything to the backup media whether it has been modified
or not.
This is the reason why full backups are not done all the time
— every file is written to the backup media. This means that
a great deal of backup media is used even if nothing has changed.
Backing up 100 gigabytes of data each night when maybe 10 megabytes'
worth of data has changed is not a sound approach; that is why
incremental backups were created.
Unlike full backups, incremental backups first look to see
whether a file's modification time is more recent than its last
backup time. If it is not, the file has not been modified since the
last backup and can be skipped this time. On the other hand, if the
modification date is more recent than the last
backup date, the file has been modified and should be backed
up.
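The modification-time test described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular backup product works: the function name `files_needing_backup` and the convention of passing the last backup time as seconds since the epoch are assumptions made for the example.

```python
import os


def files_needing_backup(root, last_backup_time):
    """Return paths under root modified after last_backup_time.

    last_backup_time is expressed in seconds since the epoch, the same
    unit that os.stat() uses for st_mtime (an assumption of this sketch).
    """
    selected = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # A modification time newer than the last backup means the
            # file has changed and belongs in this incremental backup.
            if os.stat(path).st_mtime > last_backup_time:
                selected.append(path)
    return sorted(selected)
```

A full backup corresponds to skipping the comparison entirely and selecting every file, whether it has been modified or not.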
Incremental backups are used in conjunction with a
regularly-occurring full backup (for example, a weekly full backup,
with daily incrementals).
The primary advantage gained by using incremental backups is
that the incremental backups run more quickly than full backups.
The primary disadvantage to incremental backups is that restoring
any given file may mean going through one or more incremental
backups until the file is found. When restoring a complete file
system, it is necessary to restore the last full backup and every
subsequent incremental backup.
In an attempt to alleviate the need to go through every
incremental backup, a slightly different approach was implemented.
This is known as the differential
backup.
Differential backups are similar to incremental backups in that
both back up only modified files. However, differential backups are
cumulative: with a differential backup, once a file has been
modified it continues to be included in all subsequent differential
backups (until the next full backup, of course).
This means that each differential backup contains all the files
modified since the last full backup, making it possible to perform a
complete restoration with only the last full backup and the last
differential backup.
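The difference in restoration effort can be made concrete with a small sketch. The function below is hypothetical and assumes that a schedule uses either incrementals or differentials after each full backup, never a mix of the two; it also assumes the history contains at least one full backup.

```python
def backups_needed_for_restore(history):
    """Return the labels of the backups required for a complete restoration.

    history is a chronological list of (label, kind) pairs, where kind
    is "full", "incremental", or "differential".
    """
    # Find the most recent full backup; anything older is not needed.
    last_full = max(
        index for index, (_label, kind) in enumerate(history) if kind == "full"
    )
    chain = [history[last_full][0]]
    later = history[last_full + 1:]
    # Every incremental since the full backup is required, in order.
    chain += [label for label, kind in later if kind == "incremental"]
    # Only the newest differential is required, because it is cumulative.
    differentials = [label for label, kind in later if kind == "differential"]
    if differentials:
        chain.append(differentials[-1])
    return chain
```

For a full backup on Sunday followed by daily incrementals through Wednesday, the function returns all four backups. With daily differentials instead, it returns only Sunday's full backup and Wednesday's differential.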
Differential backups normally follow the same strategy used with
incremental backups: a single periodic full backup followed by
more frequent differential backups.
The effect of using differential backups in this way is that the
differential backups tend to grow a bit over time (assuming
different files are modified over the time between full backups).
This places differential backups somewhere between incremental
backups and full backups in terms of backup media utilization and
backup speed, while often providing faster single-file and complete
restorations (due to fewer backups to search/restore).
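A rough calculation shows where each approach falls in terms of media utilization. The sketch below reuses the figures from the earlier example (100 gigabytes of data, 10 megabytes changed per day) and assumes a weekly cycle, decimal units (100 GB taken as 100,000 MB), and a different set of files changing each day. All of these are illustrative assumptions, not measurements.

```python
def media_used_mb(full_mb, daily_change_mb, days_after_full):
    """Return total media (in MB) written over one cycle for each strategy.

    Assumes each day changes a different set of files totalling
    daily_change_mb, so a differential backup grows by that much daily.
    """
    nights = days_after_full + 1  # the full backup plus the following days
    full_only = full_mb * nights
    incremental = full_mb + daily_change_mb * days_after_full
    # The differential on day n holds everything changed on days 1..n.
    differential = full_mb + sum(
        daily_change_mb * day for day in range(1, days_after_full + 1)
    )
    return {
        "full": full_only,
        "incremental": incremental,
        "differential": differential,
    }


usage = media_used_mb(full_mb=100_000, daily_change_mb=10, days_after_full=6)
print(usage)
# {'full': 700000, 'incremental': 100060, 'differential': 100210}
```

Under these assumptions, nightly full backups write 700 GB in a week, while a weekly full backup with daily incrementals or differentials writes only slightly more than 100 GB, with the differential schedule using somewhat more media than the incremental one.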
Given these characteristics, differential backups are worth
careful consideration.
We have been very careful to use the term "backup media"
throughout the previous sections. There is a reason for that. Most
experienced system administrators think about backups in terms
of reading and writing tapes, but today there are other
options.
At one time, tape devices were the only removable media devices
that could reasonably be used for backup purposes. However, this has
changed. In the following sections we look at the most popular backup
media, and review their advantages as well as their
disadvantages.
Tape was the first widely used removable data storage medium.
It has the benefits of low media cost and reasonably good storage
capacity. However, tape has some disadvantages — it is
subject to wear, and data access on tape is sequential in
nature.
These factors mean that it is necessary to keep track of tape
usage (retiring tapes once they have reached the end of their useful
life), and that searching for a specific file on tape can be a
lengthy proposition.
On the other hand, tape is one of the most inexpensive mass
storage media available, and it has a long history of reliability.
This means that building a good-sized tape library need not consume
a large part of your budget, and you can count on it being usable
now and in the future.
In years past, disk drives would never have been used as a
backup medium. However, storage prices have dropped to the point
where, in some cases, using disk drives for backup storage does make
sense.
The primary reason for using disk drives as a backup medium
would be speed. There is no faster mass storage medium available.
Speed can be a critical factor when your data center's backup window
is short, and the amount of data to be backed up is large.
But disk storage is not the ideal backup medium, for a number of
reasons:
Disk drives are not normally removable. One key factor to
an effective backup strategy is to get the backups out of your
data center and into off-site storage of some sort. A backup of
your production database sitting on a disk drive two feet away
from the database itself is not a backup; it is a copy. And
copies are not very useful should the data center and its
contents (including your copies) be damaged or destroyed by some
unfortunate set of circumstances.
Disk drives are expensive (at least compared to other backup
media). There may be situations where money truly is no object,
but in all other circumstances, the expenses associated with
using disk drives for backup mean that the number of backup
copies must be kept low to keep the overall cost of backups low.
Fewer backup copies mean less redundancy should a backup not be
readable for some reason.
Disk drives are fragile. Even if you spend the extra money
for removable disk drives, their fragility can be a problem. If
you drop a disk drive, you have lost your backup. It is
possible to purchase specialized cases that can reduce (but not
entirely eliminate) this hazard, but that makes an
already-expensive proposition even more so.
Disk drives are not archival media. Even assuming you are
able to overcome all the other problems associated with
performing backups onto disk drives, you should consider the
following. Most organizations have various legal requirements
for keeping records available for certain lengths of time. The
chance of getting usable data from a 20-year-old tape is much
greater than the chance of getting usable data from a
20-year-old disk drive. For instance, would you still have the
hardware necessary to connect it to your system? Another thing
to consider is that a disk drive is much more complex than a
tape cartridge. When a 20-year-old motor spins a 20-year-old
disk platter, causing 20-year-old read/write heads to fly over
the platter surface, what are the chances that all these
components will work flawlessly after sitting idle for 20
years?
Note
Some data centers back up to disk drives and then, when
the backups have been completed, the backups are written out
to tape for archival purposes. This allows for the fastest
possible backups during the backup window. Writing the
backups to tape can then take place during the remainder of
the business day; as long as the "taping" finishes before the
next day's backups are done, time is not an issue.
All this said, there are still some instances where backing up
to disk drives might make sense. In the next section we see how
they can be combined with a network to form a viable (if expensive)
backup solution.
By itself, a network cannot act as backup media. But combined
with mass storage technologies, it can serve quite well. For
instance, when a high-speed network link is combined with a remote
data center containing large amounts of disk storage, the
disadvantages of backing up to disks mentioned earlier are suddenly
no longer disadvantages.
By backing up over the network, the disk drives are already
off-site, so there is no need for transporting fragile disk drives
anywhere. With sufficient network bandwidth, the speed advantage
you can get from backing up to disk drives is maintained.
However, this approach still does nothing to address the matter
of archival storage (though the same "spin off to tape after the
backup" approach mentioned earlier can be used). In addition, the
costs of a remote data center with a high-speed link to the main
data center make this solution extremely expensive. But for the
types of organizations that need the kind of features this solution
can provide, it is a cost they gladly pay.
Once the backups are complete, what happens then? The obvious
answer is that the backups must be stored. However, what is not so
obvious is exactly what should be stored — and where.
To answer these questions, we must first consider under what
circumstances the backups are to be used. There are three main
situations:
1. Small, ad-hoc restoration requests from users
2. Massive restorations to recover from a disaster
3. Archival storage unlikely to ever be used again
Unfortunately, there are irreconcilable differences between
numbers 1 and 2. When a user accidentally deletes a file, they would
like it back immediately. This implies that the backup media is no
more than a few steps away from the system to which the data is to be
restored.
In the case of a disaster that necessitates a complete restoration
of one or more computers in your data center, if the disaster was
physical in nature, whatever it was that destroyed your computers
would also have destroyed the backups sitting a few steps away from
the computers. This would be a very bad state of affairs.
Archival storage is less controversial; since the chances that it
will ever be used for any purpose are rather low, locating the backup
media miles away from the data center poses no real problem.
The approaches taken to resolve these differences vary according
to the needs of the organization involved. One possible approach is
to store several days' worth of backups on-site; these backups are then
taken to more secure off-site storage when newer daily backups are
created.
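As a rough sketch of this first approach, the rule for where a given backup belongs can be written as a simple age check. The function name `media_location` and the three-day on-site retention period used below are hypothetical; an actual policy depends on the needs of the organization.

```python
from datetime import date


def media_location(backup_date, today, days_on_site):
    """Return "on-site" for recent backups and "off-site" for older ones."""
    age_in_days = (today - backup_date).days
    # Backups newer than the retention period stay close at hand for
    # ad-hoc restorations; older ones move to off-site storage.
    return "on-site" if age_in_days < days_on_site else "off-site"


today = date(2024, 1, 10)
for backup_date in (date(2024, 1, 9), date(2024, 1, 6)):
    print(backup_date, media_location(backup_date, today, days_on_site=3))
```

With a three-day retention period, yesterday's backup stays in the data center while the backup made four days ago has already been moved off-site.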
Another approach would be to maintain two different pools of
media:
A data center pool used strictly for ad-hoc restoration
requests
An off-site pool used for off-site storage and disaster
recovery
Of course, having two pools implies the need to run all backups
twice or to make a copy of the backups. This can be done, but double
backups can take too long, and copying requires multiple backup drives
to process the copies (and probably a dedicated system to actually
perform the copy).
The challenge for a system administrator is to strike a balance
that adequately meets everyone's needs, while ensuring that the
backups are available for the worst of situations.
While backups are a daily occurrence, restorations are normally a
less frequent event. However, restorations are inevitable; they will
be necessary, so it is best to be prepared.
The important thing to do is to look at the various restoration
scenarios detailed throughout this section and determine ways to test
your ability to actually carry them out. And keep in mind that the
hardest one to test is also the most critical one.
The phrase "restoring from bare metal" is a system
administrator's way of describing the process of restoring a
complete system backup onto a computer with absolutely no data of
any kind on it — no operating system, no applications,
nothing.
Overall, there are two basic approaches to bare metal
restorations:
Reinstall, followed by restore
Here the base operating system is installed just as if a
brand-new computer were being initially set up. Once the
operating system is in place and configured properly, the
remaining disk drives can be partitioned and formatted, and
all backups restored from backup media.
System recovery disks
A system recovery disk is bootable media of some kind
(often a CD-ROM) that contains a minimal system environment,
able to perform most basic system administration tasks. The
recovery environment contains the necessary utilities to
partition and format disk drives, the device drivers necessary
to access the backup device, and the software necessary to
restore data from the backup media.
Note
Some computers have the ability to create bootable backup
tapes and to actually boot from them to start the restoration
process. However, this capability is not available to all
computers. Most notably, computers based on the PC architecture do
not lend themselves to this approach.
Every type of backup should be tested on a periodic basis to
make sure that data can be read from it. Backups are sometimes
unreadable, for one reason or another, and all too often this is
not discovered until data has been lost and must be restored from
backup.
The reasons for this include changes in tape drive head
alignment, misconfigured backup software, and operator error. No
matter what the cause, without periodic testing you cannot be sure
that you are actually generating backups from which data can be
restored at some later time.
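One way to carry out such a test is to restore a backup into a scratch directory and compare the result against the original files. The sketch below assumes that the original tree has not changed since the backup was made (for example, a dedicated test data set); the function names `sha256_of` and `verify_restore` are illustrative only.

```python
import hashlib
import os


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(original_root, restored_root):
    """Return relative paths that are missing or differ after a test restore."""
    problems = []
    for dirpath, _dirnames, filenames in os.walk(original_root):
        for name in filenames:
            original = os.path.join(dirpath, name)
            relative = os.path.relpath(original, original_root)
            restored = os.path.join(restored_root, relative)
            if not os.path.isfile(restored):
                # The file never made it back from the backup media.
                problems.append(relative)
            elif sha256_of(original) != sha256_of(restored):
                # The file was restored, but its contents do not match.
                problems.append(relative)
    return sorted(problems)
```

An empty list means every file in the original tree was restored with identical contents. Anything else identifies files that could not be read back correctly.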
[1] We are using the term data in this section to describe anything
that is processed via backup software. This includes operating
system software, application software, and actual data. No
matter what it is, as far as backup software is concerned, it is
all data.