5.2. Hard disks
This subsection introduces terminology related to hard
disks. If you already know the terms and concepts, you can skip
this subsection.
See Figure 5-1 for a schematic picture
of the important parts in a hard disk. A hard disk consists of one
or more circular aluminum platters, of which either or both surfaces are coated
with a magnetic substance used for recording the data. For each
surface, there is a read-write head that
examines or alters the recorded data. The platters rotate on a
common axis; typical rotation speed is 5400 or 7200 rotations per
minute, although high-performance hard disks have higher speeds and
older disks may have lower speeds. The heads move along the radius
of the platters; this movement combined with the rotation of the
platters allows the head to access all parts of the surfaces.
The processor (CPU) and the actual disk communicate through a
disk controller. This relieves the rest of
the computer from knowing how to use the drive, since the
controllers for different types of disks can be made to use the same
interface towards the rest of the computer. Therefore, the computer
can just say ``hey disk, give me what I want'', instead of issuing a
long and complex series of electric signals to move the head to the
proper location, wait for the correct position to come under the
head, and do all the other unpleasant work that is necessary. (In
reality, the interface to the controller is still complex, but much
less so than it would otherwise be.) The controller may also do
other things, such as caching, or automatic bad sector
replacement.
The above is usually all one needs to understand about the
hardware. There are also other things, such as the motor that
rotates the platters and moves the heads, and the electronics that
control the operation of the mechanical parts, but they are mostly
not relevant for understanding the working principles of a hard
disk.
The surfaces are usually divided into concentric rings,
called tracks, and these in turn are divided
into sectors. This division is used to
specify locations on the hard disk and to allocate disk space to
files. To find a given place on the hard disk, one might say
``surface 3, track 5, sector 7''. Usually the number of sectors is
the same for all tracks, but some hard disks put more sectors in
outer tracks (all sectors are of the same physical size, so more of
them fit in the longer outer tracks). Typically, a sector will hold
512 bytes of data. The disk itself
can't handle smaller amounts of data than one sector.
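Because the sector is the smallest unit the disk can handle, any amount of data occupies a whole number of sectors. The following is a minimal sketch of that rounding, assuming the typical 512-byte sector mentioned above; the function name sectors_needed is made up for this illustration.

```python
SECTOR_SIZE = 512  # bytes per sector; the typical value, assumed here


def sectors_needed(num_bytes, sector_size=SECTOR_SIZE):
    """Return how many whole sectors are needed to hold num_bytes."""
    if num_bytes < 0:
        raise ValueError("num_bytes must be non-negative")
    # Round up: a partly filled sector still occupies a whole sector.
    return (num_bytes + sector_size - 1) // sector_size


print(sectors_needed(1))    # 1
print(sectors_needed(512))  # 1
print(sectors_needed(513))  # 2
```

Even a single byte takes up a full sector, and one byte past a sector boundary costs a second sector.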
Each surface is divided into tracks (and sectors) in
the same way. This means that when the head for one surface is on a
track, the heads for the other surfaces are also on the
corresponding tracks. All the corresponding tracks taken together
are called a cylinder. It takes time to
move the heads from one track (cylinder) to another, so by placing
the data that is often accessed together (say, a file) so that it is
within one cylinder, it is not necessary to move the heads to read
all of it. This improves performance. It is not always possible to
place files like this; files that are stored in several places on
the disk are called
fragmented.
The numbers of surfaces (or heads, which is the same thing),
cylinders, and sectors vary a lot; the specification of the number
of each is called the geometry of a hard
disk. The geometry is usually stored in a special, battery-powered
memory location called the CMOS RAM, from
where the operating system can fetch it during bootup or driver
initialization.
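The geometry determines the capacity of the disk: it is simply the product of the three numbers and the sector size. The sketch below shows the arithmetic; the Geometry type, the capacity_bytes function, and the sample numbers are illustrative assumptions, and the 512-byte sector size is the typical value rather than a guarantee.

```python
from collections import namedtuple

# Hypothetical container for a disk geometry, for illustration only.
Geometry = namedtuple("Geometry", ["heads", "cylinders", "sectors_per_track"])


def capacity_bytes(geom, sector_size=512):
    """Capacity implied by a geometry: heads * cylinders * sectors * size."""
    return geom.heads * geom.cylinders * geom.sectors_per_track * sector_size


example = Geometry(heads=8, cylinders=2048, sectors_per_track=35)
print(capacity_bytes(example))                   # 293601280
print(capacity_bytes(example) // (1024 * 1024))  # 280 (megabytes)
```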
Unfortunately, the BIOS
has a design limitation that makes it impossible to specify more
than 1024 tracks in the CMOS RAM, which is too few for a large hard
disk. To overcome this, the hard disk
controller lies about the geometry, and translates the
addresses given by the computer into something that fits
reality. For example, a hard disk might have 8 heads, 2048 tracks,
and 35 sectors per track.
Its controller could lie to the computer and claim that it has 16
heads, 1024 tracks, and 35 sectors per track, thus not exceeding the
limit on tracks, and then translate each address that the computer
gives it by halving the head number and doubling the track number. The
mathematics can be more complicated in reality, because the numbers
are not as nice as here (but again, the details are not relevant for
understanding the principle). This translation distorts the
operating system's view of how the disk is organized, thus making it
impractical to use the all-data-on-one-cylinder trick to boost
performance.
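The translation in the example can be sketched in a few lines. This is only one possible mapping, not what any particular controller does: it assumes heads and tracks are numbered from 0 and sectors from 1, and it uses the low bit of the reported head number to choose between the two real tracks that share one reported track, so that every reported address lands on a distinct real one. The function name reported_to_real is made up for this illustration.

```python
REAL = {"heads": 8, "tracks": 2048, "sectors": 35}       # what the disk has
REPORTED = {"heads": 16, "tracks": 1024, "sectors": 35}  # what it claims


def reported_to_real(head, track, sector):
    """Map a reported (head, track, sector) address to a real one."""
    if not (0 <= head < REPORTED["heads"]
            and 0 <= track < REPORTED["tracks"]
            and 1 <= sector <= REPORTED["sectors"]):
        raise ValueError("address outside the reported geometry")
    real_head = head // 2              # halve the head number
    real_track = track * 2 + head % 2  # double the track number; the
                                       # leftover head bit picks the track
    return real_head, real_track, sector


print(reported_to_real(5, 100, 7))  # (2, 201, 7)
```

Note how reported heads 4 and 5 on the same reported track end up on different real tracks. Data that the operating system believes to be in one cylinder is therefore spread over two, which is why the one-cylinder trick stops working.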
The translation is only a problem for IDE disks. SCSI disks
use a sequential sector number (i.e., the controller translates a
sequential sector number to a head, cylinder, and sector triplet),
and a completely different method for the CPU to talk with the
controller, so they are insulated from the problem. Note, however,
that the computer might not know the real geometry of an SCSI disk
either.
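The conversion between a sequential sector number and a cylinder, head, and sector triplet is plain integer arithmetic. The sketch below uses the conventional formula, assuming sequential numbers, cylinders, and heads start at 0 and sectors at 1; what a real controller does internally may differ, and the function names are made up for this illustration.

```python
def lba_to_chs(lba, heads, sectors_per_track):
    """Convert a sequential sector number to (cylinder, head, sector)."""
    if lba < 0:
        raise ValueError("sector number must be non-negative")
    cylinder, rest = divmod(lba, heads * sectors_per_track)
    head, sector_index = divmod(rest, sectors_per_track)
    return cylinder, head, sector_index + 1  # sectors are counted from 1


def chs_to_lba(cylinder, head, sector, heads, sectors_per_track):
    """Inverse of lba_to_chs."""
    return (cylinder * heads + head) * sectors_per_track + (sector - 1)


print(lba_to_chs(0, 8, 35))    # (0, 0, 1)
print(lba_to_chs(35, 8, 35))   # (0, 1, 1)
print(lba_to_chs(280, 8, 35))  # (1, 0, 1)
```

With 8 heads and 35 sectors per track, a cylinder holds 280 sectors, so sequential sector 280 is the first sector of the second cylinder.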
Since Linux often will not know the real geometry of a disk,
its filesystems don't even try to keep files within a single
cylinder. Instead, they try to assign sequentially numbered sectors
to files, which almost always gives similar performance. The issue
is further complicated by on-controller caches, and automatic
prefetches done by the controller.
Each hard disk is represented by a separate device
file. There can (usually) be only two or four IDE hard disks. These
are known as /dev/hda,
/dev/hdb,
/dev/hdc, and
/dev/hdd,
respectively. SCSI hard disks are
known as /dev/sda,
/dev/sdb, and so on. Similar naming
conventions exist for other hard disk types; see Chapter 4 for more information. Note that the device
files for the hard disks give access to the entire disk, with no
regard to partitions (which will be discussed below), and it's easy
to mess up the partitions or the data in them if you aren't careful.
The disks' device files are usually used only to get access to the
master boot record (which will also be discussed below).
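As a rough illustration of using a whole-disk device file, the sketch below reads the first sector of a disk. It assumes a 512-byte sector, opens the file read-only so that nothing on the disk can be altered, and would normally have to be run as root; the function name read_first_sector is made up for this illustration.

```python
def read_first_sector(path, sector_size=512):
    """Return the first sector of the disk (or file) at path as bytes."""
    # "rb" opens the file read-only in binary mode; nothing is written.
    with open(path, "rb") as dev:
        data = dev.read(sector_size)
    if len(data) != sector_size:
        raise ValueError("could not read a full sector from %s" % path)
    return data


# Example (needs read permission on the device file, usually root):
# sector = read_first_sector("/dev/hda")
# print(len(sector))
```

Opening the device file for writing instead is exactly the kind of operation that can destroy the partitions or the data in them.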