Linux System Administrators Guide:
Prev	Chapter 7. System Monitoring	Next

7.1. System Resources

Being able to monitor the performance of your system is essential. If system resources become to low it can cause a lot of problems. System resources can be taken up by individual users, or by services your system may host such as email or web pages. The ability to know what is happening can help determine whether system upgrades are needed, or if some services need to be moved to another machine.

7.1.1. The top command.

The most common of these commands is top. The top will display a continually updating report of system resource usage.
# top 12:10:49 up 1 day, 3:47, 7 users, load average: 0.23, 0.19, 0.10 125 processes: 105 sleeping, 2 running, 18 zombie, 0 stopped CPU states: 5.1% user 1.1% system 0.0% nice 0.0% iowait 93.6% idle Mem: 512716k av, 506176k used, 6540k free, 0k shrd, 21888k buff Swap: 1044216k av, 161672k used, 882544k free 199388k cached PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND 2330 root 15 0 161M 70M 2132 S 4.9 14.0 1000m 0 X 2605 weeksa 15 0 8240 6340 3804 S 0.3 1.2 1:12 0 kdeinit 3413 weeksa 15 0 6668 5324 3216 R 0.3 1.0 0:20 0 kdeinit 18734 root 15 0 1192 1192 868 R 0.3 0.2 0:00 0 top 1619 root 15 0 776 608 504 S 0.1 0.1 0:53 0 dhclient 1 root 15 0 480 448 424 S 0.0 0.0 0:03 0 init 2 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 keventd 3 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kapmd 4 root 35 19 0 0 0 SWN 0.0 0.0 0:00 0 ksoftirqd_CPU0 9 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 bdflush 5 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kswapd 10 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kupdated 11 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 mdrecoveryd 15 root 15 0 0 0 0 SW 0.0 0.0 0:01 0 kjournald 81 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 khubd 1188 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kjournald 1675 root 15 0 604 572 520 S 0.0 0.1 0:00 0 syslogd 1679 root 15 0 428 376 372 S 0.0 0.0 0:00 0 klogd 1707 rpc 15 0 516 440 436 S 0.0 0.0 0:00 0 portmap 1776 root 25 0 476 428 424 S 0.0 0.0 0:00 0 apmd 1813 root 25 0 752 528 524 S 0.0 0.1 0:00 0 sshd 1828 root 25 0 704 548 544 S 0.0 0.1 0:00 0 xinetd 1847 ntp 15 0 2396 2396 2160 S 0.0 0.4 0:00 0 ntpd 1930 root 24 0 76 4 0 S 0.0 0.0 0:00 0 rpc.rquotad

The top portion of the report lists information such as the system time, uptime, CPU usage, physical ans swap memory usage, and number of processes. Below that is a list of the processes sorted by CPU utilization.

You can modify the output of top while is is running. If you hit an i, top will no longer display idle processes. Hit i again to see them again. Hitting M will sort by memory usage, S will sort by how long they processes have been running, and P will sort by CPU usage again.

In addition to viewing options, you can also modify processes from within the top command. You can use u to view processes owned by a specific user, k to kill processes, and r to renice them.

For more in-depth information about processes you can look in the /proc filesystem. In the /proc filesystem you will find a series of sub-directories with numeric names. These directories are associated with the processes ids of currently running processes. In each directory you will find a series of files containing information about the process.

YOU MUST TAKE EXTREME CAUTION TO NOT MODIFY THESE FILES, DOING SO MAY CAUSE SYSTEM PROBLEMS!

7.1.2. The iostat command.

The iostat will display the current CPU load average and disk I/O information. This is a great command to monitor your disk I/O usage.
# iostat Linux 2.4.20-24.9 (myhost) 12/23/2003 avg-cpu: %user %nice %sys %idle 62.09 0.32 2.97 34.62 Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn dev3-0 2.22 15.20 47.16 1546846 4799520
For 2.4 kernels the devices is names using the device's major and minor number. In this case the device listed is /dev/hda. To have iostat print this out for you, use the -x.
# iostat -x Linux 2.4.20-24.9 (myhost) 12/23/2003 avg-cpu: %user %nice %sys %idle 62.01 0.32 2.97 34.71 Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util /dev/hdc 0.00 0.00 .00 0.00 0.00 0.00 0.00 0.00 0.00 2.35 0.00 0.00 14.71 /dev/hda 1.13 4.50 .81 1.39 15.18 47.14 7.59 23.57 28.24 1.99 63.76 70.48 15.56 /dev/hda1 1.08 3.98 .73 1.27 14.49 42.05 7.25 21.02 28.22 0.44 21.82 4.97 1.00 /dev/hda2 0.00 0.51 .07 0.12 0.55 5.07 0.27 2.54 30.35 0.97 52.67 61.73 2.99 /dev/hda3 0.05 0.01 .02 0.00 0.14 0.02 0.07 0.01 8.51 0.00 12.55 2.95 0.01

The iostat man page contains a detailed explanation of what each of these columns mean.

7.1.3. The ps command

The ps will provide you a list of processes currently running. There is a wide variety of options that this command gives you.

A common use would be to list all processes currently running. To do this you would use the ps -ef command. (Screen output from this command is too large to include, the following is only a partial output.)
UID PID PPID C STIME TTY TIME CMD root 1 0 0 Dec22 ? 00:00:03 init root 2 1 0 Dec22 ? 00:00:00 [keventd] root 3 1 0 Dec22 ? 00:00:00 [kapmd] root 4 1 0 Dec22 ? 00:00:00 [ksoftirqd_CPU0] root 9 1 0 Dec22 ? 00:00:00 [bdflush] root 5 1 0 Dec22 ? 00:00:00 [kswapd] root 6 1 0 Dec22 ? 00:00:00 [kscand/DMA] root 7 1 0 Dec22 ? 00:01:28 [kscand/Normal] root 8 1 0 Dec22 ? 00:00:00 [kscand/HighMem] root 10 1 0 Dec22 ? 00:00:00 [kupdated] root 11 1 0 Dec22 ? 00:00:00 [mdrecoveryd] root 15 1 0 Dec22 ? 00:00:01 [kjournald] root 81 1 0 Dec22 ? 00:00:00 [khubd] root 1188 1 0 Dec22 ? 00:00:00 [kjournald] root 1675 1 0 Dec22 ? 00:00:00 syslogd -m 0 root 1679 1 0 Dec22 ? 00:00:00 klogd -x rpc 1707 1 0 Dec22 ? 00:00:00 portmap root 1813 1 0 Dec22 ? 00:00:00 /usr/sbin/sshd ntp 1847 1 0 Dec22 ? 00:00:00 ntpd -U ntp root 1930 1 0 Dec22 ? 00:00:00 rpc.rquotad root 1934 1 0 Dec22 ? 00:00:00 [nfsd] root 1942 1 0 Dec22 ? 00:00:00 [lockd] root 1943 1 0 Dec22 ? 00:00:00 [rpciod] root 1949 1 0 Dec22 ? 00:00:00 rpc.mountd root 1961 1 0 Dec22 ? 00:00:00 /usr/sbin/vsftpd /etc/vsftpd/vsftpd.conf root 2057 1 0 Dec22 ? 00:00:00 /usr/bin/spamd -d -c -a root 2066 1 0 Dec22 ? 00:00:00 gpm -t ps/2 -m /dev/psaux bin 2076 1 0 Dec22 ? 00:00:00 /usr/sbin/cannaserver -syslog -u bin root 2087 1 0 Dec22 ? 00:00:00 crond daemon 2195 1 0 Dec22 ? 00:00:00 /usr/sbin/atd root 2215 1 0 Dec22 ? 00:00:11 /usr/sbin/rcd weeksa 3414 3413 0 Dec22 pts/1 00:00:00 /bin/bash weeksa 4342 3413 0 Dec22 pts/2 00:00:00 /bin/bash weeksa 19121 18668 0 12:58 pts/2 00:00:00 ps -ef

The first column shows who owns the process. The second column is the process ID. The Third column is the parent process ID. This is the process that generated, or started, the process. The forth column is the CPU usage (in percent). The fifth column is the start time, of date if the process has been running long enough. The sixth column is the tty associated with the process, if applicable. The seventh column is the cumulitive CPU usage (total amount of CPU time is has used while running). The eighth column is the command itself.

With this information you can see exacly what is running on your system and kill run-away processes, or those that are causing problems.

7.1.4. The vmstat command

The vmstat command will provide a report showing statistics for system processes, memory, swap, I/O, and the CPU. These statistics are generated using data from the last time the command was run to the present. In the case of the command never being run, the data will be from the last reboot until the present.

# vmstat procs memory swap io system cpu r b w swpd free buff cache si so bi bo in cs us sy id 0 0 0 181604 17000 26296 201120 0 2 8 24 149 9 61 3 36

The following was taken from the vmstat man page.

FIELD DESCRIPTIONS
Procs
    r: The number of processes waiting for run time.
    b: The number of processes in uninterruptable sleep.
    w: The number of processes swapped out but otherwise runnable.  This
       field is calculated, but Linux never desperation swaps.

Memory
    swpd: the amount of virtual memory used (kB).
    free: the amount of idle memory (kB).
    buff: the amount of memory used as buffers (kB).

Swap
    si: Amount of memory swapped in from disk (kB/s).
    so: Amount of memory swapped to disk (kB/s).

IO
    bi: Blocks sent to a block device (blocks/s).
    bo: Blocks received from a block device (blocks/s).

System
    in: The number of interrupts per second, including the clock.
    cs: The number of context switches per second.

CPU
    These are percentages of total CPU time.
    us: user time
    sy: system time
    id: idle time

7.1.5. The lsof command

The lsof command will print out a list of every file that is in use. Since Linux considers everythihng a file, this list can be very long. However, this command can be useful in diagnosing problems. An example of this is if you wish to unmount a filesystem, but you are being told that it is in use. You could use this command and grep for the name of the filesystem to see who is using it.

Or suppose you want to see all files in use by a particular process. To do this you would use lsof -p -processid-.

7.1.6. Finding More Utilities

To learn more about what command line tools are available, Chris Karakas has wrote a reference guide titled GNU/Linux Command-Line Tools Summary. It's a good resource for learning what tools are out there and how to do a number of tasks.

Prev	Home	Next
System Monitoring	Up	Filesystem Usage