In order to address these problems, Vinum implements a four-level hierarchy of
objects:
The most visible object is the virtual disk, called a volume. Volumes have essentially the same properties as a
UNIX® disk drive, though there are some minor
differences. They have no size limitations.
Volumes are composed of plexes,
each of which represents the total address space of a volume. This level in the hierarchy
thus provides redundancy. Think of plexes as individual disks in a mirrored array, each
containing the same data.
Since Vinum exists within the UNIX disk storage
framework, it would be possible to use UNIX partitions as
the building block for multi-disk plexes, but in fact this turns out to be too
inflexible: UNIX disks can have only a limited number of
partitions. Instead, Vinum subdivides a single UNIX
partition (the drive) into
contiguous areas called subdisks,
which it uses as building blocks for plexes.
Subdisks reside on Vinum drives,
currently UNIX partitions. Vinum drives can contain any
number of subdisks. With the exception of a small area at the beginning of the drive,
which is used for storing configuration and state information, the entire drive is
available for data storage.
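As a rough illustration of how the four object types nest, the hierarchy can be modeled with a few small classes. This is only a sketch: the class names, fields, and the 1 MB reserved area are assumptions made for the example and are not taken from Vinum's own data structures.

```python
from dataclasses import dataclass, field
from typing import List

MB = 1024 * 1024

# Hypothetical model of the four object types. Names and fields are
# illustrative only and do not mirror Vinum's internal structures.

@dataclass
class Drive:
    name: str
    size: int            # total bytes in the underlying UNIX partition
    reserved: int = MB   # assumed size of the configuration/state area

    def usable(self) -> int:
        """Bytes left for subdisks after the reserved area."""
        return self.size - self.reserved

@dataclass
class Subdisk:
    drive: Drive
    offset: int          # byte offset on the drive, past the reserved area
    length: int          # length of the contiguous area in bytes

@dataclass
class Plex:
    subdisks: List[Subdisk] = field(default_factory=list)

    def size(self) -> int:
        """A plex is as large as the sum of its subdisks."""
        return sum(sd.length for sd in self.subdisks)

@dataclass
class Volume:
    name: str
    plexes: List[Plex] = field(default_factory=list)

# Two drives of different sizes, one subdisk on each.
a = Drive("a", 100 * MB)
b = Drive("b", 50 * MB)
plex = Plex([Subdisk(a, a.reserved, a.usable()),
             Subdisk(b, b.reserved, b.usable())])
vol = Volume("myvol", [plex])
```

In this example the plex is larger than either drive, because its subdisks live on two different drives.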
The following sections describe the way these objects provide the functionality
required of Vinum.
Plexes can include multiple subdisks spread over all drives in the Vinum
configuration. As a result, the size of an individual drive does not limit the size of a
plex, and thus of a volume.
Vinum implements mirroring by attaching multiple plexes to a volume. Each plex is a
representation of the data in a volume. A volume may contain between one and eight
plexes.
Although a plex represents the complete data of a volume, it is possible for parts of
the representation to be physically missing, either by design (by not defining a subdisk
for parts of the plex) or by accident (as a result of the failure of a drive). As long as
at least one plex can provide the data for the complete address range of the volume, the
volume is fully functional.
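The availability rule can be sketched as a coverage check. The example below assumes each plex is described by a list of (start, end) byte ranges that currently have a working subdisk behind them, and it takes the rule literally: the volume is functional if some single plex covers the whole address range. The function names are hypothetical.

```python
from typing import List, Tuple

Extent = Tuple[int, int]  # (start, end) byte range, end exclusive

def covers(extents: List[Extent], size: int) -> bool:
    """Return True if the extents cover every address in [0, size)."""
    pos = 0
    for start, end in sorted(extents):
        if start > pos:
            return False          # hole before this extent
        pos = max(pos, end)
    return pos >= size

def volume_functional(plexes: List[List[Extent]], size: int) -> bool:
    """Functional as long as at least one plex covers the whole range."""
    return any(covers(p, size) for p in plexes)

# First plex has lost its second half; the second plex is complete.
degraded = [[(0, 50)], [(0, 50), (50, 100)]]
# Neither plex covers the full range on its own.
broken = [[(0, 50)], [(60, 100)]]
```

Here `volume_functional(degraded, 100)` holds because the second plex is complete, while `volume_functional(broken, 100)` does not.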
Vinum implements both concatenation and striping at the plex level:
A concatenated plex uses the
address space of each subdisk in turn.
A striped plex stripes the data
across each subdisk. The subdisks must all have the same size, and there must be at least
two subdisks in order to distinguish it from a concatenated plex.
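The difference between the two organizations comes down to how an offset within the plex is translated to a subdisk and an offset within that subdisk. The following is a minimal sketch of both translations; the function names and the round-robin stripe layout are assumptions for illustration, not Vinum's actual code.

```python
from typing import List, Tuple

def concat_map(offset: int, lengths: List[int]) -> Tuple[int, int]:
    """Concatenated plex: walk the subdisks in turn.

    Returns (subdisk index, offset within that subdisk).
    """
    if offset < 0:
        raise ValueError("negative offset")
    for index, length in enumerate(lengths):
        if offset < length:
            return index, offset
        offset -= length
    raise ValueError("offset beyond end of plex")

def stripe_map(offset: int, n: int, stripe: int) -> Tuple[int, int]:
    """Striped plex: distribute fixed-size stripes round-robin over
    n equal-sized subdisks.

    Returns (subdisk index, offset within that subdisk).
    """
    if n < 2:
        raise ValueError("a striped plex needs at least two subdisks")
    if offset < 0:
        raise ValueError("negative offset")
    stripe_no, within = divmod(offset, stripe)   # which stripe, and where in it
    row, index = divmod(stripe_no, n)            # which pass over the subdisks
    return index, row * stripe + within
```

For example, `concat_map(150, [100, 200])` lands 50 bytes into the second subdisk, and with two subdisks and a 4-byte stripe, `stripe_map(9, 2, 4)` lands at offset 5 of the first subdisk.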
The version of Vinum supplied with FreeBSD 7.0 implements two kinds of plex:
Concatenated plexes are the most flexible: they can contain any number of subdisks,
and the subdisks may be of different lengths. The plex may be extended by adding
additional subdisks. They require less CPU time than
striped plexes, though the difference in CPU overhead
is not measurable. On the other hand, they are most susceptible to hot spots, where one
disk is very active and others are idle.
The greatest advantage of striped (RAID-0) plexes
is that they reduce hot spots: by choosing an optimum-sized stripe (about 256 kB),
you can even out the load on the component drives. The disadvantages of this approach are
(fractionally) more complex code and restrictions on subdisks: they must all be the same
size, and extending a plex by adding new subdisks is so complicated that Vinum currently
does not implement it. Vinum imposes an additional, trivial restriction: a striped plex
must have at least two subdisks, since otherwise it is indistinguishable from a
concatenated plex.
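The hot-spot effect can be illustrated with a small count of which subdisk serves each request. The workload, the number of subdisks, and the subdisk size below are made up for the example; only the 256 kB stripe size comes from the text. For simplicity the concatenated plex is assumed to have equal-sized subdisks as well.

```python
from collections import Counter

KB = 1024
STRIPE = 256 * KB           # stripe size suggested in the text
N = 4                       # hypothetical number of subdisks
SD_SIZE = 64 * 1024 * KB    # hypothetical subdisk size: 64 MB each

def concat_subdisk(offset: int) -> int:
    """Subdisk serving this offset in a concatenated plex."""
    return offset // SD_SIZE

def striped_subdisk(offset: int) -> int:
    """Subdisk serving this offset in a striped plex."""
    return (offset // STRIPE) % N

# Hypothetical workload: 1024 requests spaced 64 kB apart, all of them
# falling within the first 64 MB of the plex.
offsets = [i * 64 * KB for i in range(1024)]

concat_load = Counter(concat_subdisk(o) for o in offsets)
striped_load = Counter(striped_subdisk(o) for o in offsets)
```

With this workload every request to the concatenated plex goes to the first subdisk, while the striped plex spreads the same requests evenly, 256 to each of the four subdisks.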
Table 20-1 summarizes the advantages
and disadvantages of each plex organization.
Table 20-1. Vinum Plex Organizations

Plex type     Minimum subdisks  Can add subdisks  Must be equal size  Application
concatenated  1                 yes               no                  Large data storage with maximum placement flexibility and moderate performance
striped       2                 no                yes                 High performance in combination with highly concurrent access