Although everything presented in this chapter so far has dealt only
with single hard drives directly attached to a system, there are other,
more advanced options that you can explore. The following sections
describe some of the more common approaches to expanding your mass
storage options.
Combining network and mass storage technologies can result in a
great deal more flexibility for system administrators. There are two
benefits that are possible with this type of configuration:
Consolidation of storage
Simplified administration
Storage can be consolidated by deploying high-performance servers
with high-speed network connectivity and configured with large amounts
of fast storage. Given an appropriate configuration, it is possible
to provide storage access at speeds comparable to locally attached
storage. Furthermore, the shared nature of such a configuration often
makes it possible to reduce costs, as the expenses associated with
providing centralized, shared storage can be less than providing the
equivalent storage for each and every client. In addition, free space
is consolidated, instead of being spread out (and not widely usable)
across many clients.
Centralized storage servers can also make many administrative
tasks easier. For instance, monitoring free space is much easier when
the storage to be monitored exists on a centralized storage server.
Backups can be vastly simplified using a centralized storage server.
Network-aware backups for multiple clients are possible, but require
more work to configure and maintain.
There are a number of different networked storage technologies
available; choosing one can be difficult. Nearly every operating
system on the market today includes some means of accessing
network-accessible storage, but the different technologies are
incompatible with each other. What is the best approach to
determining which technology to deploy?
The approach that usually provides the best results is to let the
built-in capabilities of the client decide the issue. There are a
number of reasons for this:
Minimal client integration issues
Minimal work on each client system
Low per-client cost of entry
Keep in mind that any client-related issues are multiplied by the
number of clients in your organization. By using the clients'
built-in capabilities, you have no additional software to install on
each client (incurring zero additional cost in software
procurement). And you have the best chance for good support and
integration with the client operating system.
There is a downside, however: the server
environment must be up to the task of providing good support for the
network-accessible storage technologies required by the clients. In
cases where the server and client operating systems are one and the
same, there is normally no issue. Otherwise, it will be necessary to
invest time and effort in making the server "speak" the clients'
language. However, often this trade-off is more than justified.
One skill that a system administrator should cultivate is the
ability to look at complex system configurations, and observe the
different shortcomings inherent in each configuration. While this
might, at first glance, seem to be a rather depressing viewpoint to
take, it can be a great way to look beyond the shiny new boxes and
visualize some future Saturday night with all production down due to a
failure that could easily have been avoided with a bit of
forethought.
With this in mind, let us use what we now know about disk-based
storage and see if we can determine the ways that disk drives can
cause problems. First, consider an outright hardware failure:
A disk drive with four partitions on it dies completely:
what happens to the data on those partitions?
It is immediately unavailable (at least until the failing unit can
be replaced, and the data restored from a recent backup).
A disk drive with a single partition on it is operating at the
limits of its design due to massive I/O loads: what happens to
applications that require access to the data on that
partition?
The applications slow down because the disk drive cannot process
reads and writes any faster.
You have a large data file that is slowly growing in size; soon it
will be larger than the largest disk drive available for your system.
What happens then?
The disk drive fills up, the data file stops growing, and its
associated applications stop running.
Just one of these problems could cripple a data center, yet system
administrators must face these kinds of issues every day. What can be
done?
Fortunately, there is one technology that can address each one of
these issues. The name for that technology is
RAID.
RAID is an acronym standing for Redundant Array of Independent
Disks[1]. As the name implies, RAID is a way
for multiple disk drives to act as if they were a single disk
drive.
RAID techniques were first developed by researchers at the
University of California, Berkeley in the mid-1980s. At the time,
there was a large gap in price between the high-performance disk
drives used on the large computer installations of the day, and the
smaller, slower disk drives used by the still-young personal
computer industry. RAID was viewed as a method of having several
less expensive disk drives fill in for one higher-priced
unit.
More importantly, RAID arrays can be constructed in different
ways, resulting in different characteristics depending on the final
configuration. Let us look at the different configurations (known
as RAID levels) in more detail.
The Berkeley researchers originally defined five different
RAID levels and numbered them "1" through "5." In time,
additional RAID levels were defined by other researchers and
members of the storage industry. Not all RAID levels were equally
useful; some were of interest only for research purposes, and
others could not be economically implemented.
In the end, three RAID levels saw
widespread usage:
Level 0
Level 1
Level 5
The following sections discuss each of these levels in more
detail.
The name of the disk configuration known as RAID level 0 is a
bit misleading, as this is the only RAID level that employs
absolutely no redundancy. However, even though RAID 0 has no
advantages from a reliability standpoint, it does have other
benefits.
A RAID 0 array consists of two or more disk drives. The
available storage capacity on each drive is divided into
chunks, which represent some multiple of
the drives' native block size. Data written to the array is
written, chunk by chunk, to each drive in the array. The chunks
can be thought of as forming stripes across each drive in the
array; hence the other term for RAID 0:
striping.
For example, with a two-drive array and a 4KB chunk size,
writing 12KB of data to the array would result in the data being
written in three 4KB chunks to the following drives:
The first 4KB would be written to the first drive, into
the first chunk
The second 4KB would be written to the second drive,
into the first chunk
The last 4KB would be written to the first drive, into
the second chunk
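To make the chunk placement concrete, here is a minimal Python
sketch of the write pattern just described (the chunk size matches
the example above; the function name and the layout representation
are purely illustrative):

CHUNK_SIZE = 4096  # 4KB chunks, as in the example above

def stripe(data, num_drives):
    """Assign successive chunks of data to drives in round-robin
    order, returning (drive, chunk position on drive, chunk) tuples."""
    chunks = [data[i:i + CHUNK_SIZE]
              for i in range(0, len(data), CHUNK_SIZE)]
    layout = []
    for n, chunk in enumerate(chunks):
        drive = n % num_drives        # which drive receives this chunk
        position = n // num_drives    # where on that drive it lands
        layout.append((drive, position, chunk))
    return layout

# Writing 12KB to a two-drive array produces the three writes listed
# above: drive 0/chunk 0, drive 1/chunk 0, drive 0/chunk 1.
for drive, position, _ in stripe(b"x" * 12288, num_drives=2):
    print("drive", drive, "chunk", position)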
Compared to a single disk drive, the advantages to RAID 0
include:
Larger total size — RAID 0 arrays can be
constructed that are larger than a single disk drive, making
it easier to store larger data files
Better read/write performance — The I/O load on a
RAID 0 array is spread evenly among all the drives in the
array (assuming all the I/O is not concentrated on a single
chunk)
No wasted space — All storage on all
drives in the array is available for data storage
Compared to a single disk drive, RAID 0 has the following
disadvantage:
Less reliability — Every drive in a RAID 0 array
must be operative for the array to be available; a single
drive failure in an N-drive RAID
0 array results in the loss of
1/Nth of all the data, rendering
the array useless
Tip
If you have trouble keeping the different RAID levels
straight, just remember that RAID 0 has
zero percent redundancy.
RAID 1 uses two (although some implementations support more)
identical disk drives. All data is written to both drives,
making them mirror images of each other. That is why RAID 1 is
often known as mirroring.
Whenever data is written to a RAID 1 array, two physical
writes must take place: one to the first drive, and one to the
second drive. Reading data, on the other hand, only needs to
take place once, and either drive in the array can be
used.
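This read/write behavior can be captured in a short Python sketch
(a toy model for illustration only, not a real driver; each drive is
represented as a simple dictionary of blocks):

class Raid1Mirror:
    """Toy two-drive RAID 1 array: every write goes to both drives,
    while reads alternate between them to spread the I/O load."""

    def __init__(self):
        self.drives = [{}, {}]   # block number -> data, one per drive
        self.next_read = 0

    def write(self, block, data):
        # One logical write becomes two physical writes.
        for drive in self.drives:
            drive[block] = data

    def read(self, block):
        # One logical read is served by a single drive, round-robin.
        drive = self.drives[self.next_read]
        self.next_read = (self.next_read + 1) % len(self.drives)
        return drive[block]

Note that if either dictionary were lost, the other would still hold
a complete copy of every block ever written.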
Compared to a single disk drive, a RAID 1 array has the
following advantages:
Improved redundancy — Even if one drive in the
array were to fail, the data would still be
accessible
Improved read performance — With both drives
operational, reads can be evenly split between them,
reducing per-drive I/O loads
When compared to a single disk drive, a RAID 1 array has
some disadvantages:
Limited maximum size — The array can be no larger
than the largest single drive available
Reduced write performance — Because both drives
must be kept up-to-date, all write I/Os must be performed by
both drives, slowing the overall process of writing data to
the array
Reduced cost efficiency — With one entire drive
dedicated to redundancy, the cost of a RAID 1 array is at
least double that of a single drive
Tip
If you have trouble keeping the different RAID levels
straight, just remember that RAID 1 has
one hundred percent redundancy.
RAID 5 attempts to combine the benefits of RAID 0 and RAID
1, while minimizing their respective disadvantages.
Like RAID 0, a RAID 5 array consists of multiple disk
drives, each divided into chunks. This allows a RAID 5 array to
be larger than any single drive. Like a RAID 1 array, a RAID 5
array uses some disk space in a redundant fashion, improving
reliability.
However, the way RAID 5 works is unlike either RAID 0 or
1.
A RAID 5 array must consist of at least three
identically-sized disk drives (although more drives may be
used). Each drive is divided into chunks and data is written to
the chunks in order. However, not every chunk is dedicated to
data storage as it is in RAID 0. Instead, in an array with
n disk drives in it, every
nth chunk is dedicated to
parity.
Chunks containing parity make it possible to recover data
should one of the drives in the array fail. The parity in chunk
x is calculated by mathematically
combining the data from each chunk x
stored on all the other drives in the array. If the data in a
chunk is updated, the corresponding parity chunk must be
recalculated and updated as well.
This also means that every time data is written to the
array, at least two drives are written to:
the drive holding the data, and the drive containing the parity
chunk.
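In practice, the mathematical combination used for parity is
almost always the bitwise exclusive-or (XOR), which has the
convenient property that any single missing chunk can be recomputed
from all the others. A brief Python sketch (the function name and
chunk contents are illustrative):

def xor_chunks(chunks):
    """Combine equal-sized chunks byte by byte with XOR."""
    parity = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            parity[i] ^= byte
    return bytes(parity)

# A three-drive array: two data chunks plus one parity chunk.
data_a = b"\x01\x02\x03\x04"
data_b = b"\x10\x20\x30\x40"
parity = xor_chunks([data_a, data_b])

# If the drive holding data_b fails, XORing the survivors (data and
# parity alike) reconstructs the missing chunk.
assert xor_chunks([data_a, parity]) == data_b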
One key point to keep in mind is that the parity chunks are
not concentrated on any one drive in the array. Instead, they
are spread evenly across all the drives. Even though dedicating
a specific drive to contain nothing but parity is possible (in
fact, this configuration is known as RAID level 4), the constant
updating of parity as data is written to the array would mean
that the parity drive could become a performance bottleneck. By
spreading the parity information evenly throughout the array,
this impact is reduced.
However, it is important to keep in mind the impact of
parity on the overall storage capacity of the array. Even
though the parity information is spread evenly across all the
drives in the array, the amount of available storage is reduced
by the size of one drive; an array of five 100GB drives, for
example, yields 400GB of usable storage.
Compared to a single drive, a RAID 5 array has the following
advantages:
Improved redundancy — If one drive in the array
fails, the parity information can be used to reconstruct the
missing data chunks, all while keeping the array available
for use[2]
Improved read performance — Due to the RAID 0-like
way data is divided between drives in the array, read I/O
activity is spread evenly between all the drives
Reasonably good cost efficiency — For a RAID 5
array of n drives, only
1/nth of the total available
storage is dedicated to redundancy
Compared to a single drive, a RAID 5 array has the following
disadvantage:
Reduced write performance — Because each write to
the array results in at least two writes to the physical
drives (one write for the data and one for the parity),
write performance is worse than a single
drive[3]
As should be obvious from the discussion of the various RAID
levels, each level has specific strengths and weaknesses. It
was not long after RAID-based storage began to be deployed that
people began to wonder whether different RAID levels could
somehow be combined, producing arrays with all of the strengths
and none of the weaknesses of the original levels.
For example, what if the disk drives in a RAID 0 array were
themselves actually RAID 1 arrays? This would give the
advantages of RAID 0's speed, with the reliability of RAID
1.
This is just the kind of thing that can be done. Here are
the most commonly nested RAID levels:
RAID 1+0
RAID 5+0
RAID 5+1
Because nested RAID is used in more specialized
environments, we will not go into greater detail here. However,
there are two points to keep in mind when thinking about nested
RAID:
Order matters — The order in which RAID levels are
nested can have a large impact on reliability. In other
words, RAID 1+0 and RAID 0+1 are not
the same.
Costs can be high — If there is any disadvantage
common to all nested RAID implementations, it is one of
cost; for example, the smallest possible RAID 5+1 array
consists of six disk drives (and even more drives are
required for larger arrays).
Now that we have explored the concepts behind RAID, let us
see how RAID can be implemented.
It is obvious from the previous sections that RAID requires
additional "intelligence" over and above the usual disk I/O
processing for individual drives. At the very least, the
following tasks must be performed:
Distributing incoming I/O requests among the individual
disks in the array
For RAID 5, calculating parity and writing it to the
appropriate drive in the array
Monitoring the individual disks in the array and taking
the appropriate action should one fail
Controlling the rebuilding of an individual disk in the
array, when that disk has been replaced or repaired
Providing a means to allow administrators to maintain the
array (removing and adding drives, initiating and halting
rebuilds, etc.)
There are two major methods that may be used to accomplish
these tasks. The next two sections describe them in more
detail.
A hardware RAID implementation usually takes the form of a
specialized disk controller card. The card performs all
RAID-related functions and directly controls the individual
drives in the arrays attached to it. With the proper driver,
the arrays managed by a hardware RAID card appear to the host
operating system just as if they were regular disk
drives.
Most RAID controller cards work with SCSI drives, although
there are some ATA-based RAID controllers as well. In any case,
the administrative interface is usually implemented in one of
three ways:
Specialized utility programs that run as applications
under the host operating system, presenting a software
interface to the controller card
An on-board interface using a serial port that is
accessed using a terminal emulator
A BIOS-like interface that is only accessible during the
system's power-up testing
Some RAID controllers have more than one type of
administrative interface available. For obvious reasons, a
software interface provides the most flexibility, as it allows
administrative functions while the operating system is running.
However, if you are booting an operating system from a RAID
controller, an interface that does not depend on a running
operating system is essential.
Because there are so many different RAID controller cards on
the market, it is impossible to go into further detail here.
The best course of action is to read the manufacturer's
documentation for more information.
Software RAID is RAID implemented as kernel- or driver-level
software for a particular operating system. As such, it
provides more flexibility in terms of hardware support —
as long as the hardware is supported by the operating system,
RAID arrays can be configured and deployed. This can
dramatically reduce the cost of deploying RAID by eliminating
the need for expensive, specialized RAID hardware.
The host CPU power left over for software RAID
parity calculations often greatly exceeds the processing power present
on a RAID controller card. Therefore, some software RAID
implementations actually have the capability for higher
performance than hardware RAID implementations.
However, software RAID does have limitations not present in
hardware RAID. The most important one to consider is support
for booting from a software RAID array. In most cases, only
RAID 1 arrays can be used for booting, as the computer's BIOS is
not RAID-aware. Since a single drive from a RAID 1 array is
indistinguishable from a non-RAID boot device, the BIOS can
successfully start the boot process; the operating system can
then change over to software RAID operation once it has gained
control of the system.
One other advanced storage technology is that of
logical volume management (LVM). LVM makes it
possible to treat physical mass storage devices as low-level building
blocks on which different storage configurations are built. The exact
capabilities vary according to the specific implementation, but can
include physical storage grouping, logical volume resizing, and data
migration.
Although the name given to this capability may differ, physical
storage grouping is the foundation for all LVM implementations. As
the name implies, the physical mass storage devices can be grouped
together in such a way as to create one or more logical mass storage
devices. The logical mass storage devices (or logical volumes) can
be larger in capacity than the capacity of any one of the underlying
physical mass storage devices.
For example, given two 100GB drives, a 200GB logical volume can
be created. However, a 150GB and a 50GB logical volume could also
be created. Any combination of logical volumes equal to or less
than the total capacity (200GB in this example) is possible. The
choices are limited only by your organization's needs.
This makes it possible for a system administrator to treat all
storage as being part of a single pool, available for use in any
amount. In addition, drives can be added to the pool at a later
time, making it a straightforward process to stay ahead of your
users' demand for storage.
The feature that most system administrators appreciate about LVM
is its ability to easily direct storage where it is needed. In a
non-LVM system configuration, running out of space means — at
best — moving files from the full device to one with available
space. Often it can mean actual reconfiguration of your system's
mass storage devices, a task that would have to take place after
normal business hours.
However, LVM makes it possible to easily increase the size of a
logical volume. Assume for a moment that our 200GB storage pool was
used to create a 150GB logical volume, with the remaining 50GB held
in reserve. If the 150GB logical volume became full, LVM makes it
possible to increase its size (say, by 10GB) without any physical
reconfiguration. Depending on the operating system environment, it
may be possible to do this dynamically or it might require a short
amount of downtime to actually perform the resizing.
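The pool arithmetic at work here can be modeled in a few lines of
Python (a toy sketch only; real LVM implementations allocate space
in fixed-size physical extents, and the names used below are
illustrative):

class VolumeGroup:
    """Toy storage pool: drives contribute capacity, and logical
    volumes are carved out of (and grown from) the free total."""

    def __init__(self, drive_sizes_gb):
        self.free_gb = sum(drive_sizes_gb)
        self.volumes = {}

    def create(self, name, size_gb):
        if size_gb > self.free_gb:
            raise ValueError("not enough free space in the pool")
        self.free_gb -= size_gb
        self.volumes[name] = size_gb

    def extend(self, name, extra_gb):
        if extra_gb > self.free_gb:
            raise ValueError("not enough free space in the pool")
        self.free_gb -= extra_gb
        self.volumes[name] += extra_gb

# Two 100GB drives form a 200GB pool; a 150GB volume leaves 50GB in
# reserve, and growing it by 10GB is pure bookkeeping.
vg = VolumeGroup([100, 100])
vg.create("data", 150)
vg.extend("data", 10)
print(vg.volumes, vg.free_gb)   # {'data': 160} 40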
Most seasoned system administrators would be impressed by LVM
capabilities so far, but they would also be asking themselves this
question:
What happens if one of the drives making up a logical volume
starts to fail?
The good news is that most LVM implementations include the
ability to migrate data off a particular
physical drive. For this to work, there must be sufficient reserve
capacity left to absorb the loss of the failing drive. Once the
migration is complete, the failing drive can then be replaced and
added back into the available storage pool.
Given that LVM has some features similar to RAID (the ability to
dynamically replace failing drives, for instance), and some features
providing capabilities that cannot be matched by most RAID
implementations (such as the ability to dynamically add more storage
to a central storage pool), many people wonder whether RAID is no
longer important.
Nothing could be further from the truth. RAID and LVM are
complementary technologies that can be used together (in a manner
similar to nested RAID levels), making it possible to get the best
of both worlds.
When early RAID research began, the acronym
stood for Redundant Array of Inexpensive Disks,
but over time the "standalone" disks that RAID was intended to
supplant became cheaper and cheaper, rendering the price comparison
meaningless.
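I/O performance will be reduced while the array is operating
with one drive unavailable, due to the overhead involved in
reconstructing the missing data.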
There is also an impact from the parity
calculations required for each write. However, depending on
the specific RAID 5 implementation (specifically, where in
the system the parity calculations are performed), this
impact can range from sizable to nearly
nonexistent.