For a machine that has fully-mirrored filesystems using Vinum, it is desirable to also
mirror the root filesystem. Setting up such a configuration is less trivial than
mirroring an arbitrary filesystem because:
The root filesystem must be available very early during the boot process, so the Vinum
infrastructure must already be available at this time.
The volume containing the root filesystem also contains the system bootstrap and the
kernel, which must be read using the host system's native utilities (e.g., the BIOS on
PC-class machines), which often cannot be taught about the details of Vinum.
In the following sections, the term “root volume” is generally used to
describe the Vinum volume that contains the root filesystem. It is probably a good idea
to use the name "root" for this volume, but this is not
technically required in any way. All command examples in the following sections assume
this name though.
Several measures are needed to make Vinum available early enough for the root filesystem:
Vinum must be available in the kernel at boot time. Thus, the method to start Vinum
automatically described in Section 20.8.1.1 is not applicable to this task, and the
start_vinum parameter must not be set when the following setup is being arranged. The
first option would be to compile Vinum statically into the kernel, so it is available all
the time, but this is usually not desirable. The other option is to have /boot/loader
(Section 12.3.3) load the Vinum kernel module early, before starting the kernel. This can
be accomplished by putting the line:
geom_vinum_load="YES"
into the file /boot/loader.conf.
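Before rebooting, it can be useful to confirm that the setting is really present in the file. The following is a minimal sketch, assuming a simplified view of the loader.conf syntax (one name="value" pair per line, "#" starting a comment); the function name loader_conf_settings is hypothetical and not part of any FreeBSD tool.

```python
def loader_conf_settings(text):
    """Parse simple name="value" lines; this is a simplified reader, not the loader's parser."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding blanks
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip().strip('"')
    return settings

# Sample content as it would appear in /boot/loader.conf.
sample = 'geom_vinum_load="YES"\n'
print(loader_conf_settings(sample).get("geom_vinum_load") == "YES")
```

On a real system, the text would be read from /boot/loader.conf on the filesystem that will become the new root.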
Note: For Gvinum, all
startup is done automatically once the kernel module has been loaded, so the procedure
described above is all that is needed. The following text documents the behaviour of the
historic Vinum system, for the sake of older setups.
Vinum must be initialized early since it needs to supply the volume for the root
filesystem. By default, the Vinum kernel part does not look for drives that might
contain Vinum volume information until the administrator (or one of the startup scripts)
issues a vinum start command.
Note: The following paragraphs outline the steps needed for FreeBSD.
By placing the line:
vinum.autostart="YES"
into /boot/loader.conf, Vinum is instructed to automatically
scan all drives for Vinum information as part of the kernel startup.
Note that it is not necessary to instruct the kernel where to look for the root
filesystem. /boot/loader looks up the name of the root device
in /etc/fstab, and passes this information on to the kernel.
When it comes time to mount the root filesystem, the kernel figures out from the device
name provided which driver to ask to translate this into the internal device ID
(major/minor number).
Since the current FreeBSD bootstrap is only 7.5 KB of code, and already has the burden
of reading files (like /boot/loader) from the UFS filesystem, it is practically
impossible to also teach it about internal Vinum structures so that it could parse the
Vinum configuration data and work out the elements of a boot volume itself. Thus, some
tricks are necessary to provide the bootstrap code with the illusion of a standard "a"
partition that contains the root filesystem.
For this to be possible at all, the following requirements must be met for the root
volume:
The root volume must not be striped or RAID-5.
The root volume must not contain more than one concatenated subdisk per plex.
Note that it is desirable and possible that there are multiple plexes, each containing
one replica of the root filesystem. The bootstrap process will, however, only use one of
these replicas for finding the bootstrap and all the files, until the kernel eventually
mounts the root filesystem itself. Each single subdisk within these plexes will then need
its own "a" partition illusion, for the respective device to become bootable. It is not
strictly necessary that each of these faked "a" partitions is located at the same offset
within its device, compared with other devices containing plexes of the root volume.
However, it is probably a good idea to create the Vinum volumes that way so the resulting
mirrored devices are symmetric, to avoid confusion.
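The two requirements above can be expressed as a small check. The following is a minimal sketch, assuming a hand-built description of the volume; the Plex class, its field names, and the organization strings are hypothetical and do not correspond to any Vinum interface.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Plex:
    organization: str    # hypothetical labels: "concat", "striped" or "raid5"
    subdisks: List[str]  # subdisk names, e.g. ["root.p0.s0"]

def root_volume_problems(plexes):
    """Return a list of reasons why the volume cannot serve as a root volume."""
    problems = []
    for index, plex in enumerate(plexes):
        if plex.organization != "concat":
            problems.append(f"plex {index} is {plex.organization}, not concatenated")
        if len(plex.subdisks) > 1:
            problems.append(f"plex {index} has {len(plex.subdisks)} subdisks")
    return problems

# A two-way mirror with one concatenated subdisk per plex is acceptable.
mirror = [Plex("concat", ["root.p0.s0"]), Plex("concat", ["root.p1.s0"])]
print(root_volume_problems(mirror))
```

Any number of plexes passes the check, which matches the remark that multiple replicas are both possible and desirable.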
In order to set up these "a" partitions, for each device
containing part of the root volume, the following needs to be done:
The location (offset from the beginning of the device) and size of this device's
subdisk that is part of the root volume need to be examined, using the command:
# gvinum l -rv root
Note that Vinum offsets and sizes are measured in bytes. They must be divided by 512
in order to obtain the block numbers that are to be used in the bsdlabel command.
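The conversion is a plain division, but it is worth rejecting values that are not a multiple of 512, since those would indicate a misread number. A minimal sketch, with the helper name bytes_to_blocks being an assumption made for illustration:

```python
SECTOR_SIZE = 512  # bytes per disk block, as used by bsdlabel

def bytes_to_blocks(n_bytes):
    """Convert a Vinum byte count into 512-byte blocks, rejecting unaligned values."""
    blocks, remainder = divmod(n_bytes, SECTOR_SIZE)
    if remainder:
        raise ValueError(f"{n_bytes} is not a multiple of {SECTOR_SIZE}")
    return blocks

# Hypothetical value: a 1 MB offset reported by Vinum.
print(bytes_to_blocks(1048576))
```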
Run the command:
# bsdlabel -e devname
for each device that participates in the root volume. devname must be either the name of
the disk (like da0) for disks without a slice (also known as fdisk) table, or the name of
the slice (like ad0s1).
If there is already an "a" partition on the device
(presumably, containing a pre-Vinum root filesystem), it should be renamed to something
else, so it remains accessible (just in case), but will no longer be used by default to
bootstrap the system. Note that active partitions (like a root filesystem currently
mounted) cannot be renamed, so this must be done either when booted from a
“Fixit” medium, or in a two-step process, where (in a mirrored situation) the
disk that the system was not booted from is manipulated first.
Then, the offset of the Vinum partition on this device (if any) must be added to the
offset of the respective root volume subdisk on this device. The resulting value will
become the "offset" value for the new "a" partition. The "size" value for this
partition can be taken verbatim from the calculation above. The "fstype" should be
4.2BSD. The "fsize", "bsize", and "cpg" values should preferably be chosen to match the
actual filesystem, though they are fairly unimportant within this context.
That way, a new "a" partition will be established that overlaps the Vinum partition on
this device. Note that bsdlabel will only allow this overlap if the Vinum partition has
been properly marked using the "vinum" fstype.
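The calculation of the label fields can be sketched as follows. This is an illustration under stated assumptions, not a replacement for editing the label with bsdlabel: the function name fake_a_partition and the sample numbers are hypothetical.

```python
SECTOR_SIZE = 512  # bytes per block in bsdlabel terms

def fake_a_partition(vinum_part_offset_blocks, subdisk_offset_bytes, subdisk_size_bytes):
    """Return the "offset", "size" and "fstype" fields for the overlapping "a" partition."""
    if subdisk_offset_bytes % SECTOR_SIZE or subdisk_size_bytes % SECTOR_SIZE:
        raise ValueError("Vinum values must be multiples of 512 bytes")
    return {
        # offset of the Vinum partition plus offset of the subdisk inside it
        "offset": vinum_part_offset_blocks + subdisk_offset_bytes // SECTOR_SIZE,
        "size": subdisk_size_bytes // SECTOR_SIZE,
        "fstype": "4.2BSD",
    }

# Hypothetical device: the Vinum partition starts at block 1024, and the
# subdisk sits 65536 bytes into it and is 100 MB long.
print(fake_a_partition(1024, 65536, 104857600))
```

The Vinum partition offset is already in blocks because it is read from the label, whereas the subdisk values come from Vinum and are in bytes.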
That's all! A faked "a" partition now exists on each
device that has one replica of the root volume. It is highly recommended to verify the
result again, using a command like:
# fsck -n /dev/devnamea
It should be remembered that all files containing control information must be relative
to the root filesystem in the Vinum volume which, when setting up a new Vinum root
volume, might not match the root filesystem that is currently active. So in particular,
the files /etc/fstab and /boot/loader.conf need to be taken care of.
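One check that is easy to forget is whether the fstab inside the new root filesystem really names the Vinum volume as the root device. A minimal sketch, assuming the usual whitespace-separated fstab layout with the device in the first field and the mount point in the second; the function name root_device is hypothetical.

```python
def root_device(fstab_text):
    """Return the device mounted on "/" according to the given fstab text, or None."""
    for raw in fstab_text.splitlines():
        fields = raw.split("#", 1)[0].split()  # ignore comments, split on blanks
        if len(fields) >= 2 and fields[1] == "/":
            return fields[0]
    return None

# Sample fstab content for a Vinum root volume named "root".
sample = "# Device          Mountpoint  FStype  Options  Dump  Pass\n" \
         "/dev/gvinum/root  /           ufs     rw       1     1\n"
print(root_device(sample))
```

The text to check must come from the /etc/fstab on the Vinum volume, not from the root filesystem that is currently mounted.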
At next reboot, the bootstrap should figure out the appropriate control information
from the new Vinum-based root filesystem, and act accordingly. At the end of the kernel
initialization process, after all devices have been announced, the prominent notice that
shows the success of this setup is a message like:
Mounting root from ufs:/dev/gvinum/root
After the Vinum root volume has been set up, the output of gvinum
l -rv root could look like:
...
Subdisk root.p0.s0:
Size: 125829120 bytes (120 MB)
State: up
Plex root.p0 at offset 0 (0 B)
Drive disk0 (/dev/da0h) at offset 135680 (132 kB)
Subdisk root.p1.s0:
Size: 125829120 bytes (120 MB)
State: up
Plex root.p1 at offset 0 (0 B)
Drive disk1 (/dev/da1h) at offset 135680 (132 kB)
The value to note is 135680 for the offset (relative to
partition /dev/da0h). This translates to 265 512-byte disk
blocks in bsdlabel's terms. Likewise, the size of this root
volume is 245760 512-byte blocks. /dev/da1h, containing the
second replica of this root volume, has a symmetric setup.
In the bsdlabel for such a device, the "size" parameter for the faked "a" partition
matches the value outlined above, while the "offset" parameter is the sum of the offset
within the Vinum partition "h", and the offset of this partition within the device (or
slice). This is a typical setup that is necessary to avoid the problem described in
Section 20.9.4.3. The entire "a" partition also lies completely within the "h" partition
containing all the Vinum data for this device.
Note that in the above example, the entire device is dedicated to Vinum, and there is
no leftover pre-Vinum root partition, since this has been a newly set-up disk that was
only meant to be part of a Vinum configuration, ever.
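The arithmetic for this example, together with the containment property, can be verified with a short sketch. The subdisk offset and size are the values from the gvinum output above; the offset and size of the "h" partition are assumptions chosen for illustration, as is the helper name lies_within.

```python
def lies_within(inner_offset, inner_size, outer_offset, outer_size):
    """True if the inner block range is completely inside the outer one."""
    return (outer_offset <= inner_offset
            and inner_offset + inner_size <= outer_offset + outer_size)

subdisk_offset = 135680 // 512   # 265 blocks, from the gvinum output
subdisk_size = 125829120 // 512  # 245760 blocks, from the gvinum output

# Assumed layout of the Vinum partition "h" (hypothetical values, in blocks).
h_offset, h_size = 16, 71771672

a_offset = h_offset + subdisk_offset
print(a_offset, lies_within(a_offset, subdisk_size, h_offset, h_size))
# prints: 281 True
```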
If for any reason the system does not continue to boot, the bootstrap can be
interrupted by pressing the space key at the 10-second
warning. The loader variables (like vinum.autostart) can be
examined using the show command, and manipulated using the set or unset commands.
If the only problem was that the Vinum kernel module was not yet in the list of
modules to load automatically, a simple load geom_vinum will
help.
When ready, the boot process can be continued with boot
-as. The options -as request the kernel to ask for
the root filesystem to mount (-a), and make the boot process
stop in single-user mode (-s), where the root filesystem is
mounted read-only. That way, even if only one plex of a multi-plex volume has been
mounted, no data inconsistency between plexes is risked.
At the prompt asking for a root filesystem to mount, any device that contains a valid
root filesystem can be entered. If /etc/fstab had been set up correctly, the default
should be something like ufs:/dev/gvinum/root. A typical alternate choice would be
something like ufs:da0d, which could be a hypothetical partition that contains the
pre-Vinum root filesystem. Care should be taken if one of the alias "a" partitions is
entered here, since these actually refer to the subdisks of the Vinum root device: in a
mirrored setup, this would only mount one piece of a mirrored root device. If this
filesystem is to be mounted read-write later on, it is necessary to remove the other
plex(es) of the Vinum root volume since these plexes would otherwise carry inconsistent
data.
If /boot/loader fails to load, but the primary bootstrap
still loads (visible by a single dash in the left column of the screen right after the
boot process starts), an attempt can be made to interrupt the primary bootstrap at this
point, using the space key. This will make the bootstrap stop in
stage two, see Section 12.3.2. An attempt can
be made here to boot off an alternate partition, like the partition containing the
previous root filesystem that has been moved away from "a"
above.
This situation will happen if the bootstrap has been destroyed by the Vinum
installation. Unfortunately, Vinum currently leaves only 4 KB at the
beginning of its partition free before starting to write its Vinum header information.
However, the stage one and two bootstraps plus the bsdlabel embedded between them
currently require 8 KB. So if a Vinum partition was started at offset 0 within a slice or
disk that was meant to be bootable, the Vinum setup will trash the bootstrap.
Similarly, if the above situation has been recovered, for example by booting from a
“Fixit” medium, and the bootstrap has been re-installed using bsdlabel -B as described
in Section 12.3.2, the bootstrap will trash the Vinum header, and Vinum will no longer
find its disk(s). Though no actual Vinum configuration data or data in Vinum volumes will
be trashed by this, and it would be possible to recover all the data by entering exactly
the same Vinum configuration data again, the situation is hard to fix. It would be
necessary to move the entire Vinum partition by at least 4 KB, so that the Vinum header
and the system bootstrap no longer collide.
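The collision condition follows directly from the two sizes given above. A minimal sketch, assuming that the bootstrap occupies the first 8 KB of the slice or disk and that the Vinum partition offset is measured in bytes from that same starting point; the function names are hypothetical.

```python
KB = 1024
VINUM_HEADER_GAP = 4 * KB  # space Vinum leaves free at the start of its partition
BOOTSTRAP_SIZE = 8 * KB    # stage one, bsdlabel and stage two together

def bootstrap_collides(vinum_part_offset_bytes):
    """True if the Vinum header would start inside the bootstrap area."""
    header_start = vinum_part_offset_bytes + VINUM_HEADER_GAP
    return header_start < BOOTSTRAP_SIZE

def minimum_shift(vinum_part_offset_bytes):
    """Bytes by which the Vinum partition must be moved to clear the bootstrap."""
    header_start = vinum_part_offset_bytes + VINUM_HEADER_GAP
    return max(0, BOOTSTRAP_SIZE - header_start)

# A Vinum partition at offset 0 of a bootable slice, as in the failure case.
print(bootstrap_collides(0), minimum_shift(0))
```

For a partition at offset 0 the required shift comes out as 4096 bytes, which agrees with the 4 KB stated in the text.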