Once a mass storage device is in place, there is little that it can
be used for. True, data can be written to it and read back from it, but
without any underlying structure, data access is only possible by using
sector addresses (either geometrical or logical).
What is needed are methods of making the raw storage that a hard drive
provides easier to use. The following sections explore some
commonly used techniques for doing just that.
The first thing that often strikes a system administrator is that
the size of a hard drive may be much larger than necessary for the
task at hand. As a result, many operating systems have the capability
of dividing a hard drive's space into various partitions or slices.
Because they are separate from each other, partitions can have
different amounts of space utilized, and that space in no way impacts
the space utilized by other partitions. For example, the partition
holding the files comprising the operating system is not affected even
if the partition holding the users' files becomes full. The operating
system still has free space for its own use.
Although it is somewhat simplistic, you can think of partitions as
being similar to individual disk drives. In fact, some operating
systems actually refer to partitions as "drives". However, this
viewpoint is not entirely accurate; therefore, it is important that we
look at partitions more closely.
A partition's geometry refers to its physical placement on a
disk drive. The geometry can be specified in terms of starting
and ending cylinders, heads, and sectors, although most often
partitions start and end on cylinder boundaries. A partition's
size is then defined as the amount of storage between the starting
and ending cylinders.
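As a rough illustration of the arithmetic involved, the following Python sketch computes a partition's size from its starting and ending cylinders, and converts a geometrical (cylinder, head, sector) address into a logical sector number. The 512-byte sector size, the sample geometry of 255 heads and 63 sectors per track, and all of the names used are assumptions made for the example; they are not taken from any particular drive.

```python
from dataclasses import dataclass

SECTOR_BYTES = 512  # assumed sector size for this example


@dataclass(frozen=True)
class Geometry:
    heads: int              # tracks per cylinder
    sectors_per_track: int


def chs_to_lba(geom, cylinder, head, sector):
    """Convert a geometrical address to a logical sector number.

    Cylinders and heads are counted from 0; sectors are counted from 1.
    """
    return (cylinder * geom.heads + head) * geom.sectors_per_track + (sector - 1)


def partition_size_bytes(geom, start_cylinder, end_cylinder):
    """Size of a partition that starts and ends on cylinder boundaries."""
    if end_cylinder < start_cylinder:
        raise ValueError("ending cylinder precedes starting cylinder")
    cylinders = end_cylinder - start_cylinder + 1  # both ends inclusive
    return cylinders * geom.heads * geom.sectors_per_track * SECTOR_BYTES


# Hypothetical geometry, chosen only for illustration.
geom = Geometry(heads=255, sectors_per_track=63)
print(partition_size_bytes(geom, 0, 99), "bytes in cylinders 0 through 99")
```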
Extended partitions were developed in response to the need
for more than four partitions per disk drive. An extended
partition can itself contain multiple partitions, greatly
extending the number of partitions possible on a single drive.
The introduction of extended partitions was driven by the
ever-increasing capacities of new disk drives.
Logical partitions are those partitions contained within an
extended partition; in terms of use, they are no different from a
non-extended primary partition.
Each partition has a type field that contains a code
indicating the partition's anticipated usage. The type field may
or may not reflect the computer's operating system. Instead, it
may reflect how data is to be stored within the partition. The
following section contains more information on this important
point.
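To make the type field more concrete, here is a minimal Python sketch that reads the four primary partition entries from a partition table sector. It assumes the classic PC (MBR) layout, which the text above does not name explicitly: four 16-byte entries starting at byte 446, a type code in the fifth byte of each entry, and a two-byte signature at the end of the sector. The short table of type codes is only a small sample, and the function names are hypothetical.

```python
import struct

# A few well-known type codes; many more exist.
PARTITION_TYPES = {
    0x05: "extended",
    0x07: "NTFS/exFAT",
    0x0B: "FAT32",
    0x82: "Linux swap",
    0x83: "Linux",
}


def parse_partition_table(sector):
    """Return the non-empty primary partition entries of an MBR sector."""
    if len(sector) != 512 or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    partitions = []
    for slot in range(4):  # an MBR holds at most four primary entries
        offset = 446 + 16 * slot
        # status byte, 3 skipped bytes, type byte, 3 skipped bytes,
        # then two little-endian 32-bit values: first sector and sector count
        status, type_code, first_lba, sector_count = struct.unpack(
            "<B3xB3xII", sector[offset:offset + 16]
        )
        if type_code == 0x00:  # unused slot
            continue
        partitions.append({
            "slot": slot,
            "bootable": status == 0x80,
            "type": PARTITION_TYPES.get(type_code, f"unknown (0x{type_code:02x})"),
            "first_lba": first_lba,
            "sectors": sector_count,
        })
    return partitions


def build_example_sector():
    """Build a made-up sector with one Linux and one extended partition."""
    sector = bytearray(512)
    sector[446:462] = struct.pack("<B3xB3xII", 0x80, 0x83, 2048, 204800)
    sector[462:478] = struct.pack("<B3xB3xII", 0x00, 0x05, 206848, 409600)
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)


for entry in parse_partition_table(build_example_sector()):
    print(entry)
```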
Even with the proper mass storage device, properly configured, and
appropriately partitioned, we would still be unable to store and
retrieve information easily — we are missing a way of
structuring and organizing that information. What we need is a
file system.
The concept of a file system is so fundamental to the use of mass
storage devices that the average computer user often does not even
make the distinction between the two. However, system administrators
cannot afford to ignore file systems and their impact on day-to-day
work.
A file system is a method of representing data on a mass storage
device. File systems usually include the following features:
File-based data storage
Hierarchical directory (sometimes known as "folder")
structure
Tracking of file creation, access, and modification
times
Some level of control over the type of access allowed for a
specific file
Some concept of file ownership
Accounting of space utilized
Not all file systems possess every one of these features. For
example, a file system constructed for a single-user operating system
could easily use a more simplified method of access control and could
conceivably do away with support for file ownership altogether.
One point to keep in mind is that the file system used can have a
large impact on the nature of your daily workload. By ensuring that
the file system you use in your organization closely matches your
organization's functional requirements, you can ensure that not only
is the file system up to the task, but that it is more easily and
efficiently maintainable.
With this in mind, the following sections explore these features
in more detail.
While file systems that use the file metaphor for data storage
are so nearly universal as to be considered a given, there are still
some aspects that should be considered here.
The first is to be aware of any restrictions on file names. For
instance, what characters are permitted in a file name? What is the
maximum file name length? These questions are important, as the
answers dictate which file names can be used and which cannot.
Older operating systems with more primitive file systems often
allowed only alphanumeric characters (and only uppercase at that),
and only traditional 8.3 file names (meaning
an eight-character file name, followed by a three-character file
extension).
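The restrictions just described can be expressed as a short check. The following Python sketch follows the simplified rule given above (uppercase letters and digits only, up to eight characters, and an optional extension of up to three characters); actual file systems of that era permitted a few additional characters, so treat this as an illustration rather than an exact specification. The function name is hypothetical.

```python
import re

# Up to eight uppercase alphanumerics, optionally followed by a dot
# and an extension of up to three uppercase alphanumerics.
_EIGHT_DOT_THREE = re.compile(r"[A-Z0-9]{1,8}(\.[A-Z0-9]{1,3})?")


def is_valid_8_3(name):
    """Return True if name fits the simplified 8.3 rule described above."""
    return _EIGHT_DOT_THREE.fullmatch(name) is not None


for candidate in ("REPORT.TXT", "report.txt", "QUARTERLYREPORT.TXT", "README"):
    print(candidate, is_valid_8_3(candidate))
```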
While the file systems used in some very old operating systems
did not include the concept of directories, all commonly-used file
systems today include this feature. Directories are themselves
usually implemented as files, meaning that no special utilities are
required to maintain them.
Furthermore, because directories are themselves files, and
directories contain files, directories can therefore contain other
directories, making a multi-level directory hierarchy possible.
This is a powerful concept with which all system administrators
should be thoroughly familiar. Using multi-level directory
hierarchies can make file management much easier for you and for your
users.
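The idea that a directory is simply a file whose contents are other entries can be modeled in a few lines. The Python sketch below is an in-memory toy, not a description of how any real file system stores directories; the class and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class File:
    name: str
    data: bytes = b""


@dataclass
class Directory(File):
    # A directory is a file whose content is a table of named entries,
    # and those entries may themselves be directories.
    entries: dict = field(default_factory=dict)

    def add(self, entry):
        self.entries[entry.name] = entry
        return entry


def resolve(root, path):
    """Walk a slash-separated path down from root, one level at a time."""
    node = root
    for part in path.strip("/").split("/"):
        if not part:
            continue
        if not isinstance(node, Directory):
            raise NotADirectoryError(part)
        node = node.entries[part]
    return node


root = Directory("/")
home = root.add(Directory("home"))
alice = home.add(Directory("alice"))
alice.add(File("notes.txt", b"hello"))
print(resolve(root, "/home/alice/notes.txt").data)
```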
Most file systems keep track of the time at which a file was
created; some also track modification and access times. Over and
above the convenience of being able to determine when a given file
was created, accessed, or modified, these dates are vital for the
proper operation of incremental backups.
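As a sketch of why these times matter, the following Python example selects the files whose modification time is later than the time of the previous backup, which is the basic test an incremental backup performs. It relies on the standard os.walk and os.stat calls; the function names and the timestamps used in the demonstration are made up for the example.

```python
import os
import tempfile


def changed_since(root, last_backup_time):
    """Return files under root modified after last_backup_time (epoch seconds)."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.stat(path).st_mtime > last_backup_time:
                changed.append(path)
    return sorted(changed)


def demo():
    """Create two files with known modification times and select the newer one."""
    with tempfile.TemporaryDirectory() as root:
        old = os.path.join(root, "old.txt")
        new = os.path.join(root, "new.txt")
        for path in (old, new):
            with open(path, "w") as handle:
                handle.write("data")
        os.utime(old, (1_000, 1_000))  # (access time, modification time)
        os.utime(new, (3_000, 3_000))
        return [os.path.basename(p) for p in changed_since(root, 2_000)]


print(demo())
```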
More information on how backups make use of these file system
features can be found in Section 8.2 Backups.
Access control is one area where file systems differ
dramatically. Some file systems have no clear-cut access control
model, while others are much more sophisticated. In general terms,
most modern-day file systems combine two components into a cohesive
access control methodology:
User identification
Permitted action list
User identification means that the file system (and the underlying
operating system) must first be capable of uniquely identifying
individual users. This makes it possible to have full accountability
with respect to any operations on the file system level. Another
often-helpful feature is that of user groups
— creating ad-hoc collections of users. Groups are most often
used by organizations where users may be members of one or more
projects. Another feature that some file systems support is the
creation of generic identifiers that can be assigned to one or more
users.
Next, the file system must be capable of maintaining lists of
actions that are permitted (or not permitted) against each file.
The most commonly-tracked actions are:
Reading the file
Writing the file
Executing the file
Various file systems may extend the list to include other
actions such as deleting, or even the ability to make changes
related to a file's access control.
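One common way of recording permitted actions is as a set of bits per class of user. The Python sketch below assumes Unix-style mode bits (owner, group, and everyone else), which is only one of several possible access control models and is not named in the text above. It uses the constants from the standard stat module; the function name is hypothetical.

```python
import stat

_BITS = {
    "owner": (stat.S_IRUSR, stat.S_IWUSR, stat.S_IXUSR),
    "group": (stat.S_IRGRP, stat.S_IWGRP, stat.S_IXGRP),
    "other": (stat.S_IROTH, stat.S_IWOTH, stat.S_IXOTH),
}
_ACTIONS = ("read", "write", "execute")


def permitted_actions(mode, who):
    """List the actions a mode value grants to owner, group, or other."""
    return [action for action, bit in zip(_ACTIONS, _BITS[who]) if mode & bit]


mode = 0o754  # a sample mode: rwx for owner, r-x for group, r-- for other
for who in ("owner", "group", "other"):
    print(who, permitted_actions(mode, who))
```

To inspect a real file, the mode can be obtained with os.stat(path).st_mode and passed to the same function.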
One constant in a system administrator's life is that there is
never enough free space, and even if there is, it will not remain
free for long. Therefore, a system administrator should at least be
able to easily determine the level of free space available for each
file system. In addition, file systems with well-defined user
identification capabilities often include the capability to display
the amount of space a particular user has consumed.
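Determining free space can be as simple as the following Python sketch, which uses the standard shutil.disk_usage call to report on the file system that holds a given path. The function name and the shape of the returned dictionary are choices made for this example.

```python
import shutil


def free_space_report(path="."):
    """Report total, used, and free bytes for the file system holding path."""
    usage = shutil.disk_usage(path)
    percent_free = 100.0 * usage.free / usage.total if usage.total else 0.0
    return {
        "total": usage.total,
        "used": usage.used,
        "free": usage.free,
        "percent_free": percent_free,
    }


report = free_space_report(".")
print(f"{report['percent_free']:.1f}% free")
```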
This feature is vital in large multi-user environments, as it is
an unfortunate fact of life that the 80/20 rule often applies to
disk space — 20 percent of your users will be responsible for
consuming 80 percent of your available disk space. By making it
easy to determine which users are in that 20 percent, you can more
effectively manage your storage-related assets.
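Given per-user consumption figures, identifying the heaviest consumers is a matter of sorting and accumulating. The Python sketch below returns the smallest group of users that together account for a given share of the space in use. The user names and byte counts are invented for the example, and the function name is hypothetical.

```python
def heaviest_users(usage, share_percent=80):
    """Return the largest consumers that together reach share_percent of usage.

    usage maps a user name to the amount of space that user has consumed.
    """
    total = sum(usage.values())
    picked, running = [], 0
    for user, used in sorted(usage.items(), key=lambda item: item[1], reverse=True):
        # Integer arithmetic avoids floating-point rounding at the threshold.
        if running * 100 >= share_percent * total:
            break
        picked.append(user)
        running += used
    return picked


sample = {"alice": 700, "bob": 150, "carol": 50, "dave": 50, "erin": 50}
print(heaviest_users(sample))
```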
Taking this a step further, some file systems include the
ability to set per-user limits (often known as disk
quotas) on the amount of disk space that can be
consumed. The specifics vary from file system to file system, but
in general, each user can be assigned a specific amount of storage
to use. Beyond that, file systems differ. Some permit users to
exceed their limit one time only, while others implement a "grace
period" during which a second, higher limit is applied.
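The grace period approach can be sketched as two limits and a timer. In the Python example below, a user may exceed the lower (soft) limit for a limited time, during which only the higher (hard) limit applies; once the grace period expires, writes that would leave the user above the soft limit are refused. This is a simplified model with hypothetical names, not the behavior of any specific file system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Quota:
    soft_limit: int                          # normal per-user limit
    hard_limit: int                          # higher limit used during the grace period
    grace_seconds: int                       # how long the soft limit may be exceeded
    over_soft_since: Optional[float] = None  # when the soft limit was first exceeded


def may_write(quota, current_usage, request, now):
    """Decide whether a write of `request` units is allowed at time `now`."""
    new_usage = current_usage + request
    if new_usage > quota.hard_limit:
        return False                         # the hard limit is never exceeded
    if new_usage <= quota.soft_limit:
        quota.over_soft_since = None         # back under the limit: reset the timer
        return True
    if quota.over_soft_since is None:
        quota.over_soft_since = now          # the grace period starts here
    return now - quota.over_soft_since <= quota.grace_seconds


quota = Quota(soft_limit=100, hard_limit=150, grace_seconds=60)
print(may_write(quota, 95, 20, now=10))    # over the soft limit, grace period begins
print(may_write(quota, 125, 10, now=100))  # grace period has expired
```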
Many system administrators give little thought to how the storage
they make available to users today is actually going to be used
tomorrow. However, a bit of thought spent on this matter before
handing over the storage to users can save a great deal of unnecessary
effort later on.
The main thing that system administrators can do is to use
directories and subdirectories to structure the storage available in
an understandable way. There are several benefits to this
approach:
More easily understood
More flexibility in the future
By enforcing some level of structure on your storage, it can be
more easily understood. For example, consider a large multi-user
system. Instead of placing all user directories in one large
directory, it might make sense to use subdirectories that mirror your
organization's structure. In this way, people who work in accounting
would have their directories under a directory named
accounting, people who work in engineering would
have their directories under engineering, and so
on.
The benefits of such an approach are that it would be easier on a
day-to-day basis to keep track of the storage needs (and usage) for
each part of your organization. Obtaining a listing of the files used
by everyone in human resources is straightforward. Backing up all the
files used by the legal department is easy.
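With one directory per department, tracking usage by department reduces to totaling the files under each top-level directory. The Python sketch below does this with the standard os.scandir and os.walk calls; the department names, user names, and file sizes in the demonstration are invented, and the function names are hypothetical.

```python
import os
import tempfile


def usage_by_department(root):
    """Total the size in bytes of all files under each subdirectory of root."""
    totals = {}
    with os.scandir(root) as entries:
        for entry in entries:
            if not entry.is_dir(follow_symlinks=False):
                continue
            total = 0
            for dirpath, _dirnames, filenames in os.walk(entry.path):
                for name in filenames:
                    total += os.path.getsize(os.path.join(dirpath, name))
            totals[entry.name] = total
    return totals


def demo():
    """Build a small sample tree and report usage for each department."""
    layout = {
        ("accounting", "alice", "ledger.dat"): 100,
        ("engineering", "bob", "design.dat"): 250,
        ("engineering", "carol", "notes.txt"): 50,
    }
    with tempfile.TemporaryDirectory() as root:
        for (department, user, filename), size in layout.items():
            user_dir = os.path.join(root, department, user)
            os.makedirs(user_dir, exist_ok=True)
            with open(os.path.join(user_dir, filename), "wb") as handle:
                handle.write(b"x" * size)
        return usage_by_department(root)


print(demo())
```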
With the appropriate structure, flexibility is increased. To
continue using the previous example, assume for a moment that the
engineering department is due to take on several large new projects.
Because of this, many new engineers are to be hired in the near
future. However, there is currently not enough free storage available
to support the expected additions to engineering.
Because every person in engineering has their files stored
under the engineering directory, it would be a
straightforward process to:
Procure the additional storage necessary to support
engineering
Back up everything under the engineering
directory
Restore the backup onto the new storage
Rename the engineering directory on the
original storage to something like
engineering-archive (and delete it
entirely once the new configuration has run smoothly for a
month)
Make the necessary changes so that all engineering personnel
can access their files on the new storage
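The steps above can be sketched in Python as follows. This is a simplified illustration under several assumptions: a direct copy with shutil.copytree stands in for the backup and restore steps, a symbolic link at the original location is used as one possible way of making the files accessible on the new storage, and the platform is assumed to support symbolic links. The paths, file names, and function names are hypothetical.

```python
import os
import shutil
import tempfile


def migrate_directory(old_path, new_path, archive_suffix="-archive"):
    """Copy a directory to new storage, archive the original, and link to the copy."""
    shutil.copytree(old_path, new_path)    # stands in for back up + restore
    archived = old_path + archive_suffix
    os.rename(old_path, archived)          # keep the original until it is safe to delete
    os.symlink(new_path, old_path)         # one way to redirect users to the new storage
    return archived


def demo():
    """Run the migration against a throwaway directory tree."""
    with tempfile.TemporaryDirectory() as base:
        old = os.path.join(base, "original", "engineering")
        new = os.path.join(base, "new-storage", "engineering")
        os.makedirs(os.path.join(old, "dana"))
        os.makedirs(os.path.dirname(new))
        with open(os.path.join(old, "dana", "plan.txt"), "w") as handle:
            handle.write("draft")
        archived = migrate_directory(old, new)
        # The file is still reachable through the original path.
        with open(os.path.join(old, "dana", "plan.txt")) as handle:
            via_old_path = handle.read()
        return os.path.basename(archived), os.path.islink(old), via_old_path


print(demo())
```

In practice, a real migration would also need to preserve ownership and permissions and to verify the copy before the original is removed.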
Of course, such an approach does have its shortcomings. For
example, if people frequently move between departments, you must have
a way of being informed of such transfers, and you must modify the
directory structure appropriately. Otherwise, the structure no longer
reflects reality, which makes more work — not less — for
you in the long run.
Once a mass storage device has been properly partitioned, and a
file system written to it, the storage is available for general
use.
For some operating systems, this is true; as soon as the operating
system detects the new mass storage device, it can be formatted by the
system administrator and may be accessed immediately with no
additional effort.
Other operating systems require an additional step. This step
— often referred to as mounting —
directs the operating system as to how the storage may be
accessed. Mounting storage is normally done via a special utility
program or command, and requires that the mass storage device (and
possibly the partition as well) be explicitly identified.
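On systems that require mounting, the operating system typically keeps a table recording which device and partition is accessible at which location. The Python sketch below parses such a table, assuming a Linux-style format in which each line lists the device, the mount point, and the file system type, followed by further fields. The format, the device names, and the function name are assumptions made for this example.

```python
SAMPLE_TABLE = """\
/dev/sda1 / ext4 rw,relatime 0 0
/dev/sda2 /home ext4 rw,relatime 0 0
"""


def parse_mount_table(text):
    """Map each mount point to its (device, file system type) pair."""
    mounts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[0].startswith("#"):
            continue  # skip blank, malformed, and comment lines
        device, mount_point, fs_type = fields[:3]
        mounts[mount_point] = (device, fs_type)
    return mounts


for mount_point, (device, fs_type) in parse_mount_table(SAMPLE_TABLE).items():
    print(f"{device} is mounted at {mount_point} as {fs_type}")
```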