Monitoring Network Performance
Table 30-1 describes the commands that are available for monitoring network performance.
Table 30-1 Network Monitoring Commands
Command | Description
ping | Look at the response of hosts on the network.
spray | Test the reliability of your packet sizes. This command can tell you whether the network is delaying packets or dropping packets.
snoop | Capture packets from the network and trace the calls from each client to each server.
netstat | Display network status, including state of the interfaces that are used for TCP/IP traffic, the IP routing table, and the per-protocol statistics for UDP, TCP, ICMP, and IGMP.
nfsstat | Display a summary of server and client statistics that can be used to identify NFS problems.
How to Check the Response of Hosts on the Network
Check the response of hosts on the network with the ping command.
$ ping hostname
If you suspect a physical problem, you can use ping to find
the response time of several hosts on the network. If the response from
one host is not what you would expect, you can investigate that host.
Physical problems could be caused by the following:
For more information about this command, see ping(1M).
Example 30-1 Checking the Response of Hosts on the Network
The simplest version of ping sends a single packet to a host on
the network. If ping receives the correct response, the command prints the message
host is alive.
$ ping elvis
elvis is alive
With the -s option, ping sends one datagram per second to a host.
The command then prints each response and the time that was required for
the round trip. An example follows.
$ ping -s pluto
64 bytes from pluto (123.456.78.90): icmp_seq=0. time=3.82 ms
64 bytes from pluto (123.456.78.90): icmp_seq=1. time=0.947 ms
64 bytes from pluto (123.456.78.90): icmp_seq=2. time=0.855 ms
^C
----pluto PING Statistics----
3 packets transmitted, 3 packets received, 0% packet loss
round-trip (ms) min/avg/max/stddev = 0.855/1.87/3.82/1.7
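The summary line can be reproduced from the individual replies. The following is a minimal Python sketch, not part of ping itself, that extracts the round-trip times from sample lines modeled on the example and recomputes the statistics. The function name round_trip_stats and the use of the sample standard deviation are assumptions made for illustration; the sample standard deviation happens to match the 1.7 shown in the example.

```python
import re
import statistics

# Sample reply lines modeled on the ping -s example (only the time= field is used).
SAMPLE = """\
64 bytes from pluto (123.456.78.90): icmp_seq=0. time=3.82 ms
64 bytes from pluto (123.456.78.90): icmp_seq=1. time=0.947 ms
64 bytes from pluto (123.456.78.90): icmp_seq=2. time=0.855 ms
"""


def round_trip_stats(ping_output):
    """Return min/avg/max/stddev (in ms) of the time= values in ping -s output."""
    times = [float(t) for t in re.findall(r"time=([\d.]+) ms", ping_output)]
    return {
        "min": min(times),
        "avg": statistics.mean(times),
        "max": max(times),
        # Sample standard deviation (n - 1 in the denominator); an assumption.
        "stddev": statistics.stdev(times),
    }


stats = round_trip_stats(SAMPLE)
print("min/avg/max/stddev = {min:.3g}/{avg:.3g}/{max:.3g}/{stddev:.2g}".format(**stats))
```

A host whose average or maximum is far above that of its neighbors is the one to investigate first.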
How to Send Packets to Hosts on the Network
Test the reliability of your packet sizes with the spray command.
$ spray [ -c count -d interval -l packet-size] hostname
- -c count
Number of packets to send.
- -d interval
Number of microseconds to pause between sending packets. If you do not use a delay, you might deplete the buffers.
- -l packet-size
Size of each packet, in bytes.
- hostname
System to which the packets are sent.
For more information about this command, see spray(1M).
Example 30-2 Sending Packets to Hosts on the Network
The following example sends 100 packets to a host (-c 100), with a
packet size of 2048 bytes (-l 2048). The packets are sent with a delay
time of 20 microseconds between each burst (-d 20).
$ spray -c 100 -d 20 -l 2048 pluto
sending 100 packets of length 2048 to pluto ...
no packets dropped by pluto
279 packets/sec, 573043 bytes/sec
How to Capture Packets From the Network
To capture packets from the network and trace the calls from each client
to each server, use snoop. This command provides accurate timestamps that enable some
network performance problems to be isolated quickly. For more information, see snoop(1M).
# snoop
Dropped packets could be caused by insufficient buffer space or an overloaded CPU.
How to Check the Network Status
To display network status information, such as statistics about the state of network
interfaces, routing tables, and various protocols, use the netstat command.
$ netstat [-i] [-r] [-s]
- -i
Displays the state of the TCP/IP interfaces
- -r
Displays the IP routing table
- -s
Displays statistics for the UDP, TCP, ICMP, and IGMP protocols
For more information, see netstat(1M).
Examples–Checking the Network Status
The following example shows output from the netstat -i command, which displays the
state of the interfaces that are used for TCP/IP traffic.
$ netstat -i
Name Mtu Net/Dest Address Ipkts Ierrs Opkts Oerrs Collis Queue
lo0 8232 software localhost 1280 0 1280 0 0 0
eri0 1500 loopback venus 1628480 0 347070 16 39354 0
This display shows the number of packets that a machine has transmitted and
has received on each interface. A machine with active network traffic should show
both Ipkts and Opkts continually increasing.
Calculate the network collision rate by dividing the collision count (Collis)
by the number of output packets (Opkts). In the previous example, the collision rate
for eri0 is 39354/347070, or about 11 percent. A network-wide collision rate that is greater than 5 to
10 percent can indicate a problem.
Calculate the error rate for the input packets by dividing the number of
input errors by the total number of input packets (Ierrs/Ipkts). The error rate
for the output packets is the number of output errors divided by the
total number of output packets (Oerrs/Opkts). If the input error rate is high, at
over 0.25 percent, the host might be dropping packets.
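The three ratios described above can be computed directly from one line of netstat -i output. The following Python sketch assumes the ten-column layout shown in the example; the function name interface_rates and the dictionary keys are illustrative choices, not part of netstat.

```python
# Column names as printed in the netstat -i header of the example.
HEADER = "Name Mtu Net/Dest Address Ipkts Ierrs Opkts Oerrs Collis Queue".split()


def percent(part, whole):
    """Return part/whole as a percentage, or 0.0 when whole is zero."""
    return 100.0 * part / whole if whole else 0.0


def interface_rates(line):
    """Compute collision and error rates (in percent) for one interface line."""
    fields = dict(zip(HEADER, line.split()))
    ipkts, ierrs = int(fields["Ipkts"]), int(fields["Ierrs"])
    opkts, oerrs = int(fields["Opkts"]), int(fields["Oerrs"])
    collis = int(fields["Collis"])
    return {
        "name": fields["Name"],
        "collision": percent(collis, opkts),    # Collis/Opkts
        "input_error": percent(ierrs, ipkts),   # Ierrs/Ipkts
        "output_error": percent(oerrs, opkts),  # Oerrs/Opkts
    }


rates = interface_rates("eri0 1500 loopback venus 1628480 0 347070 16 39354 0")
print("{name}: collisions {collision:.1f}%, input errors {input_error:.4f}%, "
      "output errors {output_error:.4f}%".format(**rates))
```

For the eri0 line in the example, the collision rate is above the 5 to 10 percent guideline, while both error rates are well below 0.25 percent.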
The following example shows output from the netstat -s command, which displays
the per-protocol statistics for the UDP, TCP, ICMP, and IGMP protocols.
$ netstat -s
UDP
udpInDatagrams =196543 udpInErrors = 0
udpOutDatagrams =187820
TCP
tcpRtoAlgorithm = 4 tcpRtoMin = 200
tcpRtoMax = 60000 tcpMaxConn = -1
tcpActiveOpens = 26952 tcpPassiveOpens = 420
tcpAttemptFails = 1133 tcpEstabResets = 9
tcpCurrEstab = 31 tcpOutSegs =3957636
tcpOutDataSegs =2731494 tcpOutDataBytes =1865269594
tcpRetransSegs = 36186 tcpRetransBytes =3762520
tcpOutAck =1225849 tcpOutAckDelayed =165044
tcpOutUrg = 7 tcpOutWinUpdate = 315
tcpOutWinProbe = 0 tcpOutControl = 56588
tcpOutRsts = 803 tcpOutFastRetrans = 741
tcpInSegs =4587678
tcpInAckSegs =2087448 tcpInAckBytes =1865292802
tcpInDupAck =109461 tcpInAckUnsent = 0
tcpInInorderSegs =3877639 tcpInInorderBytes =-598404107
tcpInUnorderSegs = 14756 tcpInUnorderBytes =17985602
tcpInDupSegs = 34 tcpInDupBytes = 32759
tcpInPartDupSegs = 212 tcpInPartDupBytes =134800
tcpInPastWinSegs = 0 tcpInPastWinBytes = 0
tcpInWinProbe = 456 tcpInWinUpdate = 0
tcpInClosed = 99 tcpRttNoUpdate = 6862
tcpRttUpdate =435097 tcpTimRetrans = 15065
tcpTimRetransDrop = 67 tcpTimKeepalive = 763
tcpTimKeepaliveProbe= 1 tcpTimKeepaliveDrop = 0
IP
ipForwarding = 2 ipDefaultTTL = 255
ipInReceives =11757234 ipInHdrErrors = 0
ipInAddrErrors = 0 ipInCksumErrs = 0
ipForwDatagrams = 0 ipForwProhibits = 0
ipInUnknownProtos = 0 ipInDiscards = 0
ipInDelivers =4784901 ipOutRequests =4195180
ipOutDiscards = 0 ipOutNoRoutes = 0
ipReasmTimeout = 60 ipReasmReqds = 8723
ipReasmOKs = 7565 ipReasmFails = 1158
ipReasmDuplicates = 7 ipReasmPartDups = 0
ipFragOKs = 19938 ipFragFails = 0
ipFragCreates =116953 ipRoutingDiscards = 0
tcpInErrs = 0 udpNoPorts =6426577
udpInCksumErrs = 0 udpInOverflows = 473
rawipInOverflows = 0
ICMP
icmpInMsgs =490338 icmpInErrors = 0
icmpInCksumErrs = 0 icmpInUnknowns = 0
icmpInDestUnreachs = 618 icmpInTimeExcds = 314
icmpInParmProbs = 0 icmpInSrcQuenchs = 0
icmpInRedirects = 313 icmpInBadRedirects = 5
icmpInEchos = 477 icmpInEchoReps = 20
icmpInTimestamps = 0 icmpInTimestampReps = 0
icmpInAddrMasks = 0 icmpInAddrMaskReps = 0
icmpInFragNeeded = 0 icmpOutMsgs = 827
icmpOutDrops = 103 icmpOutErrors = 0
icmpOutDestUnreachs = 94 icmpOutTimeExcds = 256
icmpOutParmProbs = 0 icmpOutSrcQuenchs = 0
icmpOutRedirects = 0 icmpOutEchos = 0
icmpOutEchoReps = 477 icmpOutTimestamps = 0
icmpOutTimestampReps= 0 icmpOutAddrMasks = 0
icmpOutAddrMaskReps = 0 icmpOutFragNeeded = 0
icmpInOverflows = 0
IGMP:
0 messages received
0 messages received with too few bytes
0 messages received with bad checksum
0 membership queries received
0 membership queries received with invalid field(s)
0 membership reports received
0 membership reports received with invalid field(s)
0 membership reports received for groups to which we belong
0 membership reports sent
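Because most of the netstat -s report consists of name = value pairs, the counters are easy to collect for comparison between two runs. The following Python sketch is one possible way to do that and is not a feature of netstat. The function name parse_counters is hypothetical, and the sample text is a small excerpt of the output above. The pattern accepts a leading minus sign because the sample contains a negative value.

```python
import re

# One counter looks like "name = value"; the spacing around "=" varies.
COUNTER = re.compile(r"(\w+)\s*=\s*(-?\d+)")


def parse_counters(text):
    """Return a dict that maps each counter name to its integer value."""
    return {name: int(value) for name, value in COUNTER.findall(text)}


# Excerpt of the netstat -s output shown in the example.
SAMPLE = """\
udpInDatagrams =196543 udpInErrors = 0
udpOutDatagrams =187820
tcpInInorderSegs =3877639 tcpInInorderBytes =-598404107
tcpTimKeepaliveProbe= 1 tcpTimKeepaliveDrop = 0
"""

counters = parse_counters(SAMPLE)
for name, value in sorted(counters.items()):
    print(name, value)
```

The IGMP portion of the report uses a different layout (a count followed by a description), so this pattern does not pick up those lines.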
The following example shows output from the netstat -r command, which
displays the IP routing table.
$ netstat -r
Routing Table:
Destination Gateway Flags Ref Use Interface
------------------ -------------------- ----- ----- ------ ---------
localhost localhost UH 0 2817 lo0
earth-bb pluto U 3 14293 eri0
224.0.0.0 pluto U 3 0 eri0
default mars-gate UG 0 14142
The fields in the netstat -r report are described in Table 30-2.
Table 30-2 Output From the netstat -r Command
Field Name | Flag | Description
Flags | U | The route is up.
Flags | G | The route is through a gateway.
Flags | H | The route is to a host.
Flags | D | The route was dynamically created by using a redirect.
Ref | | Shows the current number of routes that share the same link layer.
Use | | Indicates the number of packets that were sent out.
Interface | | Lists the network interface that is used for the route.
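The Flags column combines several single-letter flags, such as UG or UH in the example. As an illustration only, the following Python sketch expands a flag string by using the four meanings listed in Table 30-2. The names FLAG_MEANINGS and describe_flags are hypothetical, and any flag that is not in the table is reported as unknown rather than guessed.

```python
# Flag meanings taken from Table 30-2.
FLAG_MEANINGS = {
    "U": "The route is up.",
    "G": "The route is through a gateway.",
    "H": "The route is to a host.",
    "D": "The route was dynamically created by using a redirect.",
}


def describe_flags(flags):
    """Return one description per letter in a netstat -r Flags value."""
    return [FLAG_MEANINGS.get(flag, "Unknown flag: " + flag) for flag in flags]


for text in describe_flags("UG"):
    print(text)
```

For the default route in the example, UG expands to a route that is up and that goes through a gateway.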
How to Display NFS Server and Client Statistics
The NFS distributed file service uses a remote procedure call (RPC) facility that
translates local commands into requests for the remote host. The remote procedure calls
are synchronous. The client application is blocked or suspended until the server has completed
the call and has returned the results. One of the major factors that
affects NFS performance is the retransmission rate.
If the file server cannot respond to a client's request, the client retransmits
the request a specified number of times before the client quits. Each retransmission
imposes system overhead and increases network traffic. Excessive retransmissions can cause network performance problems.
If the retransmission rate is high, you could look for the following:
- Overloaded servers that complete requests too slowly
- An Ethernet interface that is dropping packets
- Network congestion, which slows the packet transmission
Table 30-3 describes the nfsstat options to display client and server statistics.
Table 30-3 Commands for Displaying Client/Server Statistics
Command | Display
nfsstat -c | Client statistics
nfsstat -s | Server statistics
nfsstat -m | Network statistics for each file system
Use nfsstat -c to show client statistics, and nfsstat -s to show server statistics. Use
nfsstat -m to display network statistics for each file system. For more information, see
nfsstat(1M).
Examples–Displaying NFS Server and Client Statistics
The following example displays RPC and NFS data for the client pluto.
$ nfsstat -c
Client rpc:
Connection oriented:
calls badcalls badxids timeouts newcreds badverfs timers
1595799 1511 59 297 0 0 0
cantconn nomem interrupts
1198 0 7
Connectionless:
calls badcalls retrans badxids timeouts newcreds badverfs
80785 3135 25029 193 9543 0 0
timers nomem cantsend
17399 0 0
Client nfs:
calls badcalls clgets cltoomany
1640097 3112 1640097 0
Version 2: (46366 calls)
null getattr setattr root lookup readlink read
0 0% 6589 14% 2202 4% 0 0% 11506 24% 0 0% 7654 16%
wrcache write create remove rename link symlink
0 0% 13297 28% 1081 2% 0 0% 0 0% 0 0% 0 0%
mkdir rmdir readdir statfs
24 0% 0 0% 906 1% 3107 6%
Version 3: (1585571 calls)
null getattr setattr lookup access readlink read
0 0% 508406 32% 10209 0% 263441 16% 400845 25% 3065 0% 117959 7%
write create mkdir symlink mknod remove rmdir
69201 4% 7615 0% 42 0% 16 0% 0 0% 7875 0% 51 0%
rename link readdir readdir+ fsstat fsinfo pathconf
929 0% 597 0% 3986 0% 185145 11% 942 0% 300 0% 583 0%
commit
4364 0%
Client nfs_acl:
Version 2: (3105 calls)
null getacl setacl getattr access
0 0% 0 0% 0 0% 3105 100% 0 0%
Version 3: (5055 calls)
null getacl setacl
0 0% 5055 100% 0 0%
The output of the nfsstat -c command is described in Table 30-4.
Table 30-4 Output From the nfsstat -c Command
Field | Description
calls | The total number of calls that were sent.
badcalls | The total number of calls that were rejected by RPC.
retrans | The total number of retransmissions. For this client, the number of retransmissions is less than 1 percent, or approximately 10 timeouts out of 6888 calls. These retransmissions might be caused by temporary failures. Higher rates might indicate a problem.
badxid | The number of times that a duplicate acknowledgment was received for a single NFS request.
timeout | The number of calls that timed out.
wait | The number of times a call had to wait because no client handle was available.
newcred | The number of times the authentication information had to be refreshed.
timers | The number of times the time-out value was greater than or equal to the specified time-out value for a call.
readlink | The number of times a read was made to a symbolic link. If this number is high, at over 10 percent, then there could be too many symbolic links.
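The two percentages mentioned in Table 30-4 are simple ratios. The following Python sketch shows the arithmetic. It uses the figures quoted in the retrans description (10 timeouts out of 6888 calls) and the Version 3 readlink count from the sample output (3065 of 1585571 calls). The helper name percent is an illustrative choice.

```python
def percent(part, total):
    """Return part/total as a percentage, or 0.0 when total is zero."""
    if total == 0:
        return 0.0
    return 100.0 * part / total


# Figures quoted in the retrans description of Table 30-4.
retrans_rate = percent(10, 6888)

# Version 3 readlink calls and total Version 3 calls from the sample output.
readlink_share = percent(3065, 1585571)

print("retransmission rate: {:.2f}%".format(retrans_rate))
print("readlink share:      {:.2f}%".format(readlink_share))
```

Both values are far below the thresholds named in the table: 1 percent for retransmissions and 10 percent for readlink calls.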
The following example shows output from the nfsstat -m command.
pluto$ nfsstat -m
/usr/man from pluto:/export/svr4/man
Flags: vers=2,proto=udp,auth=unix,hard,intr,dynamic,
rsize=8192, wsize=8192,retrans=5
Lookups: srtt=13 (32ms), dev=10 (50ms), cur=6 (120ms)
All: srtt=13 (32ms), dev=10 (50ms), cur=6 (120ms)
This output of the nfsstat -m command, which is displayed in milliseconds, is described
in Table 30-5.
Table 30-5 Output From the nfsstat -m Command
Field | Description
srtt | The smoothed average of the round-trip times
dev | The average deviations
cur | The current “expected” response time
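Each timing line in the nfsstat -m output pairs a raw value with a time in milliseconds. The following Python sketch, which assumes the line layout shown in the example, splits one such line into its label and its srtt, dev, and cur fields. The name parse_timers and the raw and ms keys are hypothetical.

```python
import re

# Matches fields such as "srtt=13 (32ms)".
TIMER = re.compile(r"(\w+)=(\d+) \((\d+)ms\)")


def parse_timers(line):
    """Split a timing line into its label and a dict of raw/ms values."""
    label, _, rest = line.partition(":")
    timers = {
        name: {"raw": int(raw), "ms": int(ms)}
        for name, raw, ms in TIMER.findall(rest)
    }
    return label.strip(), timers


label, timers = parse_timers(" Lookups: srtt=13 (32ms), dev=10 (50ms), cur=6 (120ms)")
print(label, timers)
```

Collecting these values over time for each mounted file system makes it easier to see which server is responding slowly.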
If you suspect that the hardware components of your network are creating problems,
you need to look closely at the cabling and connectors.