Making sure there is sufficient free space available should be at
the top of every system administrator's daily task list. Regular,
frequent free space checking is so important because free space is
so dynamic; there can be more than enough space one moment, and
almost none the next.
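A routine check of this kind is easy to script. The following is a minimal sketch in Python, assuming a single path to check and a 10% warning threshold; the function name and the threshold are illustrative choices, not recommendations from this section.

```python
import shutil

def free_space_report(path="/", warn_below_percent=10.0):
    """Return (percent_free, is_low) for the file system holding `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    if usage.total == 0:
        # Some pseudo file systems report no capacity; treat them as full.
        return 0.0, True
    percent_free = usage.free / usage.total * 100
    return percent_free, percent_free < warn_below_percent
```

Calling free_space_report("/home") from a scheduled job and sending a notice whenever the second value is true would be one way to make the check part of a daily routine.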
In general, there are three reasons for insufficient free
space:
Excessive usage by a user
Excessive usage by an application
Normal growth in usage
These reasons are explored in more detail in the following
sections.
Different people have different levels of neatness. Some people
would be horrified to see a speck of dust on a table, while others
would not think twice about having a collection of last year's pizza
boxes stacked by the sofa. It is the same with storage:
Some people are very frugal in their storage usage and never
leave any unneeded files hanging around.
Some people never seem to find the time to get rid of files
that are no longer needed.
In many cases where a user is responsible for using large amounts
of storage, it is the second type of person who is found to be
responsible.
This is one area in which a system administrator needs to
summon all the diplomacy and social skills they can muster. Quite
often discussions over disk space become emotional, as people feel
that enforcement of disk usage restrictions makes their job more
difficult (or impossible), that the restrictions are unreasonably
small, or that they just do not have the time to clean up their
files.
The best system administrators take many factors into account
in such a situation. Are the restrictions equitable and
reasonable for the type of work being done by this person? Does
the person seem to be using their disk space appropriately? Can
you help the person reduce their disk usage in some way (by
creating a backup CD-ROM of all emails over one year old, for
example)? Your job during the conversation is to attempt to
discover whether the person's need for the space is genuine,
while making sure that someone who has no real need for that much
storage cleans up their act.
In any case, the thing to do is to keep the conversation on a
professional, factual level. Try to address the user's issues in
a polite manner ("I understand you are very busy, but everyone
else in your department has the same responsibility to not waste
storage, and their average utilization is less than half of
yours.") while moving the conversation toward the matter at hand.
Be sure to offer assistance if a lack of knowledge/experience
seems to be the problem.
Approaching the situation in a sensitive but firm manner is
often better than using your authority as system administrator to
force a certain outcome. For example, you might find that
sometimes a compromise between you and the user is necessary.
This compromise can take one of three forms:
Provide temporary space
Make archival backups
Give up
You might find that the user can reduce their usage if they
have some amount of temporary space that they can use without
restriction. People who take advantage of this arrangement often
find that it allows them to work without worrying about space
until they get to a logical stopping point, at which time they can
perform some housekeeping and determine which files in temporary
storage are really needed.
Warning
If you offer this arrangement to a user, do
not fall into the trap of allowing this
temporary space to become permanent space. Make it very clear
that the space being offered is temporary, and that no
guarantees can be made as to data retention; no backups of any
data in temporary space are ever made.
In fact, many administrators underscore this fact by
automatically deleting any files in temporary storage that are
older than a certain age (a week, for example).
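The age-based cleanup mentioned above might be sketched as follows. This Python example only lists candidate files and leaves the deletion itself to the caller; the function name, the use of modification time as the measure of age, and the one-week figure in the usage note are assumptions made for illustration.

```python
import os
import time

def files_older_than(root, max_age_days, now=None):
    """Yield paths of files under `root` not modified within `max_age_days`."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat() examines a symbolic link itself, not its target.
                if os.lstat(path).st_mtime < cutoff:
                    yield path
            except OSError:
                continue  # the file vanished or cannot be examined; skip it
```

A scheduled job could pass each path from files_older_than("/scratch", 7) to os.remove(), where /scratch stands in for whatever temporary area your site provides. Reviewing the list before enabling deletion is a sensible precaution.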
Other times, the user may have many files that are so
obviously old that it is unlikely continuous access to them is
needed. Make sure you determine that this is, in fact, the case.
Sometimes individual users are responsible for maintaining an
archive of old data; in these instances, you should make a point
of assisting them in that task by providing multiple backups that
are treated no differently from your data center's archival
backups.
However, there are times when the data is of dubious value.
In these instances you might find it best to offer to make a
special backup for them. You then back up the old data and give
the user the backup media, explaining that they are responsible
for its safekeeping and that, if they ever need access to any of
the data, they should ask you (or your organization's operations
staff, whichever is appropriate for your organization) to restore
it.
There are a few things to keep in mind so that this does not
backfire on you. First and foremost is to not include files that
are likely to need restoring; do not select files that are
too new. Next, make sure that you are able
to perform a restoration if one ever is requested. This means
that the backup media should be of a type that you are reasonably
sure will be used in your data center for the foreseeable
future.
Tip
Your choice of backup media should also take into
consideration those technologies that can enable the user to
handle data restoration themselves. For example, even though
backing up several gigabytes onto CD-R media is more work than
issuing a single command and spinning it off to a 20GB tape
cartridge, consider that the user would then be able to access
the data on CD-R whenever they want, without ever
involving you.
Sometimes an application is responsible for excessive usage.
The reasons for this can vary, but can include:
Enhancements in the application's functionality require more
storage
An increase in the number of users using the
application
The application fails to clean up after itself, leaving
no-longer-needed temporary files on disk
The application is broken, and the bug is causing it to use
more storage than it should
Your task is to determine which of the reasons from this list
apply to your situation. Being aware of the status of the
applications used in your data center should help you eliminate
several of these reasons, as should your awareness of your users'
processing habits. What remains to be done is often a bit of
detective work into where the storage has gone. This should narrow
down the field substantially.
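The detective work usually starts with finding out which directories hold the most data. The following Python sketch totals file sizes beneath each immediate subdirectory of a starting point; the function names are illustrative, and the totals are apparent file sizes rather than blocks actually allocated on disk.

```python
import os

def usage_by_subdirectory(root):
    """Return {subdirectory name: total bytes of the files beneath it}."""
    totals = {}
    for entry in os.scandir(root):
        if not entry.is_dir(follow_symlinks=False):
            continue  # only immediate subdirectories are summarized
        total = 0
        for dirpath, _dirnames, filenames in os.walk(entry.path):
            for name in filenames:
                try:
                    total += os.lstat(os.path.join(dirpath, name)).st_size
                except OSError:
                    pass  # skip files that disappear or cannot be examined
        totals[entry.name] = total
    return totals

def largest_first(totals):
    """Return (name, bytes) pairs sorted from largest to smallest."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Running largest_first(usage_by_subdirectory(path)) against an application's data directory, and then again one level down inside the largest entry, is one way to narrow the search.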
At this point you must then take the appropriate steps, be it
the addition of storage to support an increasingly-popular
application, contacting the application's developers to discuss its
file handling characteristics, or writing scripts to clean up after
the application.
Most organizations experience some level of growth over the long
term. Because of this, it is normal to expect storage utilization
to increase at a similar pace. In nearly all circumstances, ongoing
monitoring can reveal the average rate of storage utilization at
your organization; this rate can then be used to determine the time
at which additional storage should be procured before your free
space actually runs out.
If you are in the position of unexpectedly running out of free
space due to normal growth, you have not been doing your job.
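The projection described above is simple arithmetic once monitoring has produced an average growth rate. The Python sketch below assumes linear growth and a known procurement lead time; both are simplifications, and the function and parameter names are illustrative.

```python
def days_until_full(capacity_gb, used_gb, growth_gb_per_day):
    """Days until the storage fills, assuming a constant growth rate."""
    if growth_gb_per_day <= 0:
        return None  # usage is flat or shrinking; no exhaustion date
    return max(capacity_gb - used_gb, 0) / growth_gb_per_day

def days_until_order(capacity_gb, used_gb, growth_gb_per_day, lead_time_days):
    """Days from now by which additional storage should be ordered."""
    remaining = days_until_full(capacity_gb, used_gb, growth_gb_per_day)
    if remaining is None:
        return None
    return max(remaining - lead_time_days, 0)
```

For example, a 100GB file system holding 70GB and growing by 0.5GB per day fills in 60 days; with a 14-day procurement lead time, the order should be placed within 46 days.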
However, sometimes large additional demands on your systems'
storage can come up unexpectedly. Your organization may have merged
with another, necessitating rapid changes in the IT infrastructure
(and therefore, storage). A new high-priority project may have
literally sprung up overnight. Changes to an existing application
may have resulted in greatly increased storage needs.
No matter what the reason, there are times when you will be
taken by surprise. To plan for these instances, try to configure your
storage architecture for maximum flexibility. Keeping spare storage
on-hand (if possible) can alleviate the impact of such unplanned
events.
The first thing most people think of when they think about disk
quotas is using them to force users to keep their directories
clean. While there are sites where this may be the case, it also
helps to look at the problem of disk space usage from another
perspective. What about applications that, for one reason or another,
consume too much disk space? It is not unheard of for applications to
fail in ways that cause them to consume all available disk space. In
these cases, disk quotas can help limit the damage caused by such
errant applications, forcing them to stop before no
free space is left on the disk.
The hardest part of implementing and managing disk quotas revolves
around the limits themselves. What should they be? A simplistic
approach would be to divide the disk space by the number of users
and/or groups using it, and use the resulting number as the per-user
quota. For example, if the system has a 100GB disk drive and 20
users, each user should be given a disk quota of no more than 5GB.
That way, each user would be guaranteed 5GB (although the disk would
be 100% full at that point).
For those operating systems that support it, temporary quotas
could be set somewhat higher — say 7.5GB, with a permanent quota
remaining at 5GB. This would have the benefit of allowing users to
permanently consume no more than their percentage of the disk, but
still permitting some flexibility when a user reaches (and exceeds)
their limit. When using disk quotas in this manner, you are actually
over-committing the available disk space. The temporary quota is
7.5GB. If all 20 users exceeded their permanent quota at the same
time and attempted to approach their temporary quota, that 100GB disk
would actually have to be 150GB to allow everyone to reach their
temporary quota at the same time.
However, in practice not everyone exceeds their permanent quota at
the same time, making some amount of overcommitment a reasonable
approach. Of course, the selection of permanent and temporary quotas
is up to the system administrator, as each site and user community is
different.
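The arithmetic behind this example can be written out as a short Python sketch. The function name and the temporary_factor parameter (the ratio of temporary to permanent quota, 1.5 in the example above) are illustrative.

```python
def quota_plan(disk_gb, users, temporary_factor=1.5):
    """Divide a disk evenly among users and report the resulting overcommitment."""
    permanent = disk_gb / users               # even share per user
    temporary = permanent * temporary_factor  # higher, short-term limit
    worst_case = temporary * users            # everyone at the temporary limit
    return {
        "permanent_gb": permanent,
        "temporary_gb": temporary,
        "worst_case_gb": worst_case,
        "overcommit_ratio": worst_case / disk_gb,
    }
```

For a 100GB disk and 20 users, quota_plan(100, 20) gives a 5GB permanent quota, a 7.5GB temporary quota, and a worst case of 150GB, which is an overcommitment ratio of 1.5.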
Issues relating to file access typically revolve around one
scenario — a user is not able to access a file they feel they
should be able to access.
Often this is a case of user #1 wanting to give a copy of a file
to user #2. In most organizations, the ability for one user to
access another user's files is strictly curtailed, leading to this
problem.
There are three approaches that could conceivably be
taken:
User #1 makes the necessary changes to allow user #2 to
access the file wherever it currently exists.
A file exchange area is created for such purposes; user #1
places a copy of the file there, which can then be copied by
user #2.
User #1 uses email to give user #2 a copy of the
file.
There is a problem with the first approach — depending on
how access is granted, user #2 may have full access to all of user
#1's files. Worse, it might have been done in such a way as to
permit all users in your organization access to
user #1's files. Still worse, this change may not be reversed after
user #2 no longer requires access, leaving user #1's files
permanently accessible by others. Unfortunately, when users are in
charge of this type of situation, security is rarely their highest
priority.
The second approach eliminates the problem of making all of user
#1's files accessible to others. However, once the file is in the
file exchange area the file is readable (and depending on the
permissions, even writable) by all other users. This approach also
raises the possibility of the file exchange area becoming filled
with files, as users often forget to clean up after
themselves.
The third approach, while seemingly an awkward solution, may
actually be the preferable one in most cases. With the advent of
industry-standard email attachment protocols and more intelligent
email programs, sending all kinds of files via email is a mostly
foolproof operation, requiring no system administrator involvement.
Of course, there is the chance that a user will attempt to email a
1GB database file to all 150 people in the finance department, so
some amount of user education (and possibly limitations on email
attachment size) would be prudent. Still, none of these approaches
deals with the situation of two or more users needing ongoing access
to a single file. In these cases, other methods are
required.
When multiple users need to share a single copy of a file,
allowing access by making changes to file permissions is not the
best approach. It is far preferable to formalize the file's shared
status. There are several reasons for this:
Files shared out of a user's directory are vulnerable to
disappearing unexpectedly when the user either leaves the
organization or does nothing more unusual than rearranging their
files.
Maintaining shared access for more than one or two
additional users becomes difficult, leading to the longer-term
problem of unnecessary work required whenever the sharing users
change responsibilities.
Therefore, the preferred approach is to:
Have the original user relinquish direct ownership of the
file
Create a group that will own the file
Place the file in a shared directory that is owned by the
group
Make all users needing access to the file part of the
group
Of course, this approach would work equally well with multiple
files as it would with single files, and can be used to implement
shared storage for large, complex projects.
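On systems with UNIX-style permissions, the directory part of this approach might be sketched in Python as follows. This is an assumption-laden illustration: it presumes the group already exists, that the caller has the privileges needed to change group ownership, and that the operating system honors the setgid bit on directories (so that new files inherit the directory's group). The function name and the mode chosen are illustrative.

```python
import os
import shutil
import stat

# rwx for owner and group, nothing for others, plus the setgid bit
SHARED_DIR_MODE = 0o2770

def make_shared_directory(path, group=None):
    """Create `path`, optionally hand it to `group`, and apply the shared mode."""
    os.makedirs(path, exist_ok=True)
    if group is not None:
        # Changing group ownership usually requires administrative privileges.
        shutil.chown(path, group=group)
    os.chmod(path, SHARED_DIR_MODE)
    # Report the permission bits actually in effect after the change.
    return stat.S_IMODE(os.stat(path).st_mode)
```

Creating the group and adding users to it are left to the operating system's own account management tools.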
Because the need for additional disk space is never-ending, a
system administrator often needs to add disk space, while sometimes
also removing older, smaller drives. This section provides an
overview of the basic process of adding and removing storage.
Note
On many operating systems, mass storage devices are named
according to their physical connection to the system. Therefore,
adding or removing mass storage devices can result in unexpected
changes to device names. When adding or removing storage, always
make sure you review (and update, if necessary) all device name
references used by your operating system.
Before anything else can be done, the new disk drive has to be
in place and accessible. While there are many different hardware
configurations possible, the following sections go through the two
most common situations — adding an ATA or SCSI disk drive.
Even with other configurations, the basic steps outlined here
still apply.
Tip
No matter what storage hardware you use, you should always
consider the load a new disk drive adds to your computer's I/O
subsystem. In general, you should try to spread the disk I/O
load over all available channels/buses. From a performance
standpoint, this is far better than putting all disk drives on
one channel and leaving another one empty and idle.
ATA disk drives are mostly used in desktop and lower-end
server systems. Nearly all systems in these classes have
built-in ATA controllers with multiple ATA channels —
normally two or four.
Each channel can support two devices — one master, and
one slave. The two devices are connected to the channel with a
single cable. Therefore, the first step is to see which
channels have available space for an additional disk drive. One
of three situations is possible:
There is a channel with only one disk drive connected to
it
There is a channel with no disk drive connected to
it
There is no space available
The first situation is usually the easiest, as it is very
likely that the cable already in place has an unused connector
into which the new disk drive can be plugged. However, if the
cable in place only has two connectors (one for the channel and
one for the already-installed disk drive), then it is necessary
to replace the existing cable with a three-connector
model.
Before installing the new disk drive, make sure that the two
disk drives sharing the channel are appropriately configured
(one as master and one as slave).
The second situation is a bit more difficult, if only for
the reason that a cable must be procured so that it can connect
a disk drive to the channel. The new disk drive may be
configured as master or slave (although traditionally the first
disk drive on a channel is configured as
master).
In the third situation, there is no space left for an
additional disk drive. You must then make a decision. Do
you:
Acquire an ATA controller card, and install it
Replace one of the installed disk drives with the newer,
larger one
Adding a controller card entails checking hardware
compatibility, physical capacity, and software compatibility.
Basically, the card must be compatible with your computer's bus
slots, there must be an open slot for it, and it must be
supported by your operating system.
Replacing an installed disk drive presents a unique problem:
what to do with the data on the disk? There are a few possible
approaches:
Write the data to a backup device and restore it after
installing the new disk drive
Use your network to copy the data to another system with
sufficient free space, restoring the data after installing
the new disk drive
Use the space physically occupied by a third disk drive
by:
Temporarily removing the third disk drive
Temporarily installing the new disk drive in its
place
Copying the data to the new disk drive
Removing the old disk drive
Replacing it with the new disk drive
Reinstalling the temporarily-removed third disk
drive
Temporarily install the original disk drive and the new
disk drive in another computer, copy the data to the new
disk drive, and then install the new disk drive in the
original computer
As you can see, sometimes a bit of effort must be expended
to get the data (and the new hardware) where it needs to
go.
SCSI disk drives normally are used in higher-end
workstations and server systems. Unlike ATA-based systems, SCSI
systems may or may not have built-in SCSI controllers; some do,
while others use a separate SCSI controller card.
The capabilities of SCSI controllers (whether built-in or
not) also vary widely. A controller may supply a narrow or wide
SCSI bus. The bus speed may be normal, fast, ultra, ultra2, or
ultra160.
If these terms are unfamiliar to you (they were discussed
briefly in Section 5.3.2.2 SCSI),
you must determine the capabilities of your hardware
configuration and select an appropriate new disk drive. The
best resource for this information would be the documentation
for your system and/or SCSI adapter.
You must then determine how many SCSI buses are available on
your system, and which ones have available space for a new disk
drive. The number of devices supported by a SCSI bus varies
according to the bus width:
Narrow (8-bit) SCSI bus — 7 devices (plus
controller)
Wide (16-bit) SCSI bus — 15 devices (plus
controller)
The first step is to see which buses have available space
for an additional disk drive. One of three situations is
possible:
There is a bus with fewer than the maximum number of disk
drives connected to it
There is a bus with no disk drives connected to
it
There is no space available on any bus
The first situation is usually the easiest, as it is likely
that the cable in place has an unused connector into which the
new disk drive can be plugged. However, if the cable in place
does not have an unused connector, it is necessary to replace
the existing cable with one that has at least one more
connector.
The second situation is a bit more difficult, if only for
the reason that a cable must be procured so that it can connect
a disk drive to the bus.
If there is no space left for an additional disk drive, you
must make a decision. Do you:
Acquire and install a SCSI controller card
Replace one of the installed disk drives with the new,
larger one
Adding a controller card entails checking hardware
compatibility, physical capacity, and software compatibility.
Basically, the card must be compatible with your computer's bus
slots, there must be an open slot for it, and it must be
supported by your operating system.
Replacing an installed disk drive presents a unique problem:
what to do with the data on the disk? There are a few possible
approaches:
Write the data to a backup device, and restore it after
installing the new disk drive
Use your network to copy the data to another system with
sufficient free space, and restore after installing the new
disk drive
Use the space physically occupied by a third disk drive
by:
Temporarily removing the third disk drive
Temporarily installing the new disk drive in its
place
Copying the data to the new disk drive
Removing the old disk drive
Replacing it with the new disk drive
Reinstalling the temporarily-removed third disk
drive
Temporarily install the original disk drive and the new
disk drive in another computer, copy the data to the new
disk drive, and then install the new disk drive in the
original computer
Once you have an available connector in which to plug the
new disk drive, you must make sure that the drive's SCSI ID is
set appropriately. To do this, you must know what all of the
other devices on the bus (including the controller) are using
for their SCSI IDs. The easiest way to do this is to access the
SCSI controller's BIOS. This is normally done by pressing a
specific key sequence during the system's power-up sequence.
You can then view the SCSI controller's configuration, along
with the devices attached to all of its buses.
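Once the IDs already in use are known, choosing an ID for the new drive is a matter of elimination. The Python sketch below assumes the IDs in use (including the controller's) have been read from the controller's configuration screen; the function name is illustrative.

```python
def free_scsi_ids(bus_width_bits, ids_in_use):
    """Return the SCSI IDs still available on a narrow (8) or wide (16) bus."""
    if bus_width_bits not in (8, 16):
        raise ValueError("bus width must be 8 (narrow) or 16 (wide)")
    valid = range(bus_width_bits)  # narrow: IDs 0-7; wide: IDs 0-15
    in_use = set(ids_in_use)
    out_of_range = in_use - set(valid)
    if out_of_range:
        raise ValueError(f"IDs not valid on this bus: {sorted(out_of_range)}")
    return [scsi_id for scsi_id in valid if scsi_id not in in_use]
```

For example, on a narrow bus where the controller uses ID 7 and two drives use IDs 0 and 1, free_scsi_ids(8, [7, 0, 1]) returns [2, 3, 4, 5, 6].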
Next, you must consider proper bus termination. When adding
a new disk drive, the rule is actually quite straightforward
— if the new disk drive is the last (or only) device on
the bus, it must have termination enabled. Otherwise,
termination must be disabled.
At this point, you can move on to the next step in the
process — partitioning your new disk drive.
Once the disk drive has been installed, it is time to create
one or more partitions to make the space available to your
operating system. Although the tools vary depending on the
operating system, the basic steps are the same:
Select the new disk drive
View the disk drive's current partition table, to ensure
that the disk drive to be partitioned is, in fact, the correct
one
Delete any unwanted partitions that may already be present
on the new disk drive
Create the new partition(s), being sure to specify the
desired size and partition type
Save your changes and exit the partitioning program
Warning
When partitioning a new disk drive, it is
vital that you are sure the disk drive you
are about to partition is the correct one. Otherwise, you may
inadvertently partition a disk drive that is already in use,
resulting in lost data.
Also make sure you have decided on the best partition size.
Always give this matter serious thought, because changing it
later is much more difficult than taking a bit of time now to
think things through.
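One additional safeguard when confirming the correct disk drive is to check that none of its partitions is currently mounted. The Python sketch below assumes a mount table in the whitespace-separated format used by Linux in /proc/mounts (device first, mount point second) and Linux-style device names; other operating systems differ. An empty result does not prove the drive is unused, since swap space, RAID members, and similar uses do not appear as mounts.

```python
import re

def mounted_partitions(device, mounts_text):
    """Return (source, mount point) pairs for mounts that live on `device`."""
    # Match the device itself or its partitions, e.g. /dev/sda1 or
    # /dev/nvme0n1p1, but not a different device such as /dev/sdaa1.
    pattern = re.compile(re.escape(device) + r"p?\d*")
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and pattern.fullmatch(fields[0]):
            hits.append((fields[0], fields[1]))
    return hits
```

On a Linux system, the text could be read with open("/proc/mounts").read(); any non-empty result for the drive about to be partitioned is a strong sign that the wrong drive was selected.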
At this point, the new disk drive has one or more partitions
that have been created. However, before the space contained
within those partitions can be used, the partitions must first be
formatted. By formatting, you are selecting a specific file
system to be used within each partition. As such, this is a
pivotal time in the life of this disk drive; the choices you make
now cannot be changed later without going through a great deal of
work.
The actual process of formatting is done by running a utility
program; the steps involved in this vary according to the
operating system. Once formatting is complete, the disk drive is
properly configured for use.
Before continuing, it is always best to double-check your work
by accessing the partition(s) and making sure everything is in
order.
If your operating system requires any configuration changes to
use the new storage you have added, now is the time to make the
necessary changes.
At this point you can be relatively confident that the
operating system is configured properly to automatically make the
new storage accessible every time the system boots (although if
you can afford a quick reboot, it would not hurt to do so —
just to be sure).
The next section explores one of the most commonly-forgotten
steps in the process of adding new storage.
Assuming that the new storage is being used to hold data
worthy of being preserved, this is the time to make the necessary
changes to your backup procedures and ensure that the new storage
will, in fact, be backed up. The exact nature of what you must do
to make this happen depends on the way that backups are performed
on your system. However, here are some points to keep in mind
while making the necessary changes:
Consider what the optimal backup frequency should
be
Determine what backup style would be most appropriate
(full backups only, full with incrementals, full with
differentials, etc.)
Consider the impact of the additional storage on your
backup media usage, particularly as it starts to fill
up
Judge whether the additional backup could cause the
backups to take too long and start using time outside of your
allotted backup window
Make sure that these changes are communicated to the
people that need to know (other system administrators,
operations personnel, etc.)
Once all this is done, your new storage is ready for
use.
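The backup window question raised in the list above can be estimated with simple arithmetic. The Python sketch below assumes a sustained backup throughput that you have measured on your own hardware and full backups of all data; the function and parameter names are illustrative.

```python
def backup_hours(data_gb, throughput_mb_per_s):
    """Estimated hours to back up `data_gb` at a sustained throughput."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return data_gb * 1024 / throughput_mb_per_s / 3600  # GB -> MB -> s -> h

def fits_window(existing_gb, added_gb, throughput_mb_per_s, window_hours):
    """True if a full backup of old plus new data fits in the window."""
    total_hours = backup_hours(existing_gb + added_gb, throughput_mb_per_s)
    return total_hours <= window_hours
```

For instance, at 10MB per second, 200GB takes under six hours and fits an eight-hour window, but adding another 100GB pushes the estimate past eight and a half hours.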
Removing disk space from a system is straightforward, with most
of the steps being similar to the installation sequence (except, of
course, in reverse):
Move any data to be saved off the disk drive
Modify the backup schedule so that the disk drive is no
longer backed up
Update the system configuration
Erase the contents of the disk drive
Remove the disk drive
As you can see, compared to the installation process, there are
a few extra steps to take. These steps are discussed in the
following sections.
Should there be any data on the disk drive that must be saved,
the first thing to do is to determine where the data should go.
This decision depends mainly on what is going to be done with the
data. For example, if the data is no longer going to be actively
used, it should be archived, probably in the same manner as your
system backups. This means that now is the time to consider
appropriate retention periods for this final backup.
Tip
Keep in mind that, in addition to any data retention
guidelines your organization may have, there may also be legal
requirements for retaining data for a certain length of time.
Therefore, make sure you consult with the department that had
been responsible for the data while it was still in use; they
should know the appropriate retention period.
On the other hand, if the data is still being used, then the
data should reside on the system most appropriate for that usage.
Of course, if this is the case, perhaps it would be easiest to
move the data by reinstalling the disk drive on the new system.
If you do this, you should make a full backup of the data before
doing so — people have dropped disk drives full of valuable
data (losing everything) while doing nothing more hazardous than
walking across a data center.
No matter whether the disk drive has valuable data or not, it
is a good idea to always erase a disk drive's contents prior to
reassigning or relinquishing control of it. While the obvious
reason is to make sure that no sensitive information remains on
the disk drive, it is also a good time to check the disk drive's
health by performing a read-write test for bad blocks over the
entire drive.
Important
Many companies (and government agencies) have specific
methods of erasing data from disk drives and other data storage
media. You should always be sure you
understand and abide by these requirements; in many cases there
are legal ramifications if you fail to do so. The read-write test
mentioned above should in no way be considered the ultimate method
of wiping a disk drive.
In addition, organizations that work with classified data
may find that the final disposition of the disk drive may be
subject to certain legally-mandated procedures (such as physical
destruction of the drive). In these instances your
organization's security department should be able to offer
guidance in this matter.