Commands for Troubleshooting NFS Problems
These commands can be useful when troubleshooting NFS problems.
nfsstat Command
You can use this command to gather statistical information about NFS and RPC
connections. The syntax of the command is as follows:
nfsstat [ -cmnrsz ]
- -c
Displays client-side information
- -m
Displays statistics for each NFS-mounted file system
- -n
Specifies that NFS information is to be displayed on both the client side and the server side
- -r
Displays RPC statistics
- -s
Displays the server-side information
- -z
Specifies that the statistics should be set to zero
If no options are supplied on the command line, the -cnrs options are
used.
Gathering server-side statistics can be important for debugging problems when new software or
new hardware is added to the computing environment. Running this command at least
once a week, and storing the numbers, provides a useful history of
previous performance.
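The weekly collection that is described above could be scripted roughly as follows. This is a minimal sketch, not a supported tool: the function name snapshot_nfsstat, the NFSSTAT_CMD override variable, and the file naming scheme are assumptions made for this example.

```shell
# Sketch: save one dated snapshot of the server-side NFS statistics.
# snapshot_nfsstat and NFSSTAT_CMD are hypothetical names for this example.
snapshot_nfsstat() {
    dir=$1                                   # directory that holds the history
    out="$dir/nfsstat-$(date +%Y%m%d).txt"   # one file per day
    mkdir -p "$dir" || return 1
    # NFSSTAT_CMD lets you substitute another command when testing.
    ${NFSSTAT_CMD:-nfsstat -s} > "$out" || return 1
    printf '%s\n' "$out"                     # report where the snapshot went
}
```

Calling the function from a weekly scheduled job, with a directory of your choice as the argument, would build up the history.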
Refer to the following example:
# nfsstat -s
Server rpc:
Connection oriented:
calls badcalls nullrecv badlen xdrcall dupchecks dupreqs
719949194 0 0 0 0 58478624 33
Connectionless:
calls badcalls nullrecv badlen xdrcall dupchecks dupreqs
73753609 0 0 0 0 987278 7254
Server nfs:
calls badcalls
787783794 3516
Version 2: (746607 calls)
null getattr setattr root lookup readlink read
883 0% 60 0% 45 0% 0 0% 177446 23% 1489 0% 537366 71%
wrcache write create remove rename link symlink
0 0% 1105 0% 47 0% 59 0% 28 0% 10 0% 9 0%
mkdir rmdir readdir statfs
26 0% 0 0% 27926 3% 108 0%
Version 3: (728863853 calls)
null getattr setattr lookup access
1365467 0% 496667075 68% 8864191 1% 66510206 9% 19131659 2%
readlink read write create mkdir
414705 0% 80123469 10% 18740690 2% 4135195 0% 327059 0%
symlink mknod remove rmdir rename
101415 0% 9605 0% 6533288 0% 111810 0% 366267 0%
link readdir readdirplus fsstat fsinfo
2572965 0% 519346 0% 2726631 0% 13320640 1% 60161 0%
pathconf commit
13181 0% 6248828 0%
Version 4: (54871870 calls)
null compound
266963 0% 54604907 99%
Version 4: (167573814 operations)
reserved access close commit
0 0% 2663957 1% 2692328 1% 1166001 0%
create delegpurge delegreturn getattr
167423 0% 0 0% 1802019 1% 26405254 15%
getfh link lock lockt
11534581 6% 113212 0% 207723 0% 265 0%
locku lookup lookupp nverify
230430 0% 11059722 6% 423514 0% 21386866 12%
open openattr open_confirm open_downgrade
2835459 1% 4138 0% 18959 0% 3106 0%
putfh putpubfh putrootfh read
52606920 31% 0 0% 35776 0% 4325432 2%
readdir readlink remove rename
606651 0% 38043 0% 560797 0% 248990 0%
renew restorefh savefh secinfo
2330092 1% 8711358 5% 11639329 6% 19384 0%
setattr setclientid setclientid_confirm verify
453126 0% 16349 0% 16356 0% 2484 0%
write release_lockowner illegal
3247770 1% 0 0% 0 0%
Server nfs_acl:
Version 2: (694979 calls)
null getacl setacl getattr access getxattrdir
0 0% 42358 6% 0 0% 584553 84% 68068 9% 0 0%
Version 3: (2465011 calls)
null getacl setacl getxattrdir
0 0% 1293312 52% 1131 0% 1170568 47%
The previous listing is an example of NFS server statistics. The lines under
Server rpc relate to RPC and the remaining lines report NFS activities. In both
sets of statistics, knowing the typical calls and badcalls values, and
the number of calls per week, can help identify a problem. The badcalls
value reports the number of bad messages from a client. A rising value can
indicate network hardware problems.
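To track calls and badcalls over time, the two totals can be pulled out of saved output. The following sketch assumes that the output has the same layout as the previous listing. The function name nfs_call_totals is hypothetical.

```shell
# Sketch: print the "calls" and "badcalls" totals from the "Server nfs:" block
# of nfsstat -s output that is read on standard input.
# nfs_call_totals is a hypothetical name for this example.
nfs_call_totals() {
    awk '
        /^Server nfs:/          { in_nfs = 1; next }    # start of the NFS block
        in_nfs && $1 == "calls" { header = 1; next }    # column header line
        in_nfs && header        { print $1, $2; exit }  # the values line
    '
}
```

For example, nfsstat -s | nfs_call_totals prints the two numbers on one line, which can be appended to a log for week-to-week comparison.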
Some of the operations generate write activity on the disks. A sudden increase
in these statistics could indicate trouble and should be investigated. For NFS version
2 statistics, the operations to note are setattr, write, create, remove, rename,
link, symlink, mkdir, and rmdir. For NFS version 3 and version 4 statistics, the
value to watch is commit. If the commit level is high on one
NFS server, compared to another almost identical server, check that the NFS clients
have enough memory. The number of commit operations on the server grows when
clients do not have available resources.
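Comparing the commit count between two servers, or between two saved snapshots, requires extracting a single counter. The following sketch assumes the layout of the previous listing, where each line of operation names is followed by a line of count and percentage pairs. The function name nfs_op_count is hypothetical.

```shell
# Sketch: print the count for one operation in one NFS version, reading
# nfsstat -s output on standard input. nfs_op_count is a hypothetical name.
# Usage: nfs_op_count VERSION OPERATION
nfs_op_count() {
    awk -v ver="$1" -v op="$2" '
        # Track whether we are inside "Server nfs:" (not rpc or nfs_acl).
        /^Server /  { in_nfs = ($0 ~ /^Server nfs:/); in_ver = 0; next }
        # Track whether we are inside the requested "Version N:" block.
        /^Version / { in_ver = (in_nfs && $2 == (ver ":")); col = 0; next }
        # Line after a matching header: counts sit in the odd-numbered fields.
        in_ver && col { print $(2 * col - 1); exit }
        # Header line: remember the column that names the operation.
        in_ver { for (i = 1; i <= NF; i++) if ($i == op) col = i }
    '
}
```

For example, nfsstat -s | nfs_op_count 3 commit prints the version 3 commit count. Nothing is printed if the operation is not found.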
pstack Command
This command displays a stack trace for each process. The pstack command must
be run by the owner of the process or by root. You can
use pstack to determine where a process is hung. The only argument that
you need to supply is the PID of the process that you
want to check. See the proc(1) man page.
The following example checks the nfsd process that is running.
# /usr/bin/pgrep nfsd
243
# /usr/bin/pstack 243
243: /usr/lib/nfs/nfsd -a 16
ef675c04 poll (24d50, 2, ffffffff)
000115dc ???????? (24000, 132c4, 276d8, 1329c, 276d8, 0)
00011390 main (3, efffff14, 0, 0, ffffffff, 400) + 3c8
00010fb0 _start (0, 0, 0, 0, 0, 0) + 5c
The example shows that the process is waiting for a new connection
request, which is a normal response. If the stack shows that the
process is still in poll after a request is made, the process might
be hung. Follow the instructions in How to Restart NFS Services to fix this problem. Review the instructions
in NFS Troubleshooting Procedures to fully verify that your problem is a hung program.
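When you repeat this check, it can help to reduce the trace to the name of the topmost function. The following sketch assumes single-threaded output in the format of the previous example, where each frame line starts with a hexadecimal address. The function name top_frame is hypothetical.

```shell
# Sketch: print the function name of the first stack frame in pstack output
# that is read on standard input. top_frame is a hypothetical name.
top_frame() {
    # Frame lines start with a hexadecimal address. The "PID: command" line
    # does not match because of the colon.
    awk '$1 ~ /^[0-9a-f]+$/ { print $2; exit }'
}
```

For example, /usr/bin/pstack 243 | top_frame would print poll for the trace shown above.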
rpcinfo Command
This command generates information about the RPC service that is running on a
system. You can also use this command to change the RPC service.
Many options are available with this command. See the rpcinfo(1M) man page. The
following is a shortened synopsis for some of the options that you can
use with the command.
rpcinfo [ -m | -s ] [ hostname ]
rpcinfo -T transport hostname [ progname ]
rpcinfo [ -t | -u ] [ hostname ] [ progname ]
- -m
Displays a table of statistics of the rpcbind operations
- -s
Displays a concise list of all registered RPC programs
- -T
Displays information about services that use specific transports or protocols
- -t
Probes the RPC programs that use TCP
- -u
Probes the RPC programs that use UDP
- transport
Selects the transport or protocol for the services
- hostname
Selects the host name of the server that you need information from
- progname
Selects the RPC program to gather information about
If no value is given for hostname, the local host name is used.
You can substitute the RPC program number for progname, but many users
can remember the name and not the number. You can use the -p
option in place of the -s option on those systems that do not
run the NFS version 3 software.
The data that is generated by this command can include the following:
- The RPC program number
- The version number for a specific program
- The transport protocol that is being used
- The name of the RPC service
- The owner of the RPC service
The following example gathers information about the RPC services that are running on
a server. The text that is generated by the command is filtered by
the sort command to make the output more readable. Several lines that list
RPC services have been deleted from the example.
% rpcinfo -s bee | sort -n
program version(s) netid(s) service owner
100000 2,3,4 udp6,tcp6,udp,tcp,ticlts,ticotsord,ticots rpcbind superuser
100001 4,3,2 ticlts,udp,udp6 rstatd superuser
100002 3,2 ticots,ticotsord,tcp,tcp6,ticlts,udp,udp6 rusersd superuser
100003 3,2 tcp,udp,tcp6,udp6 nfs superuser
100005 3,2,1 ticots,ticotsord,tcp,tcp6,ticlts,udp,udp6 mountd superuser
100007 1,2,3 ticots,ticotsord,ticlts,tcp,udp,tcp6,udp6 ypbind superuser
100008 1 ticlts,udp,udp6 walld superuser
100011 1 ticlts,udp,udp6 rquotad superuser
100012 1 ticlts,udp,udp6 sprayd superuser
100021 4,3,2,1 tcp,udp,tcp6,udp6 nlockmgr superuser
100024 1 ticots,ticotsord,ticlts,tcp,udp,tcp6,udp6 status superuser
100029 3,2,1 ticots,ticotsord,ticlts keyserv superuser
100068 5 tcp,udp cmsd superuser
100083 1 tcp,tcp6 ttdbserverd superuser
100099 3 ticotsord autofs superuser
100133 1 ticots,ticotsord,ticlts,tcp,udp,tcp6,udp6 - superuser
100134 1 ticotsord tokenring superuser
100155 1 ticots,ticotsord,tcp,tcp6 smserverd superuser
100221 1 tcp,tcp6 - superuser
100227 3,2 tcp,udp,tcp6,udp6 nfs_acl superuser
100229 1 tcp,tcp6 metad superuser
100230 1 tcp,tcp6 metamhd superuser
100231 1 ticots,ticotsord,ticlts - superuser
100234 1 ticotsord gssd superuser
100235 1 tcp,tcp6 - superuser
100242 1 tcp,tcp6 metamedd superuser
100249 1 ticots,ticotsord,ticlts,tcp,udp,tcp6,udp6 - superuser
300326 4 tcp,tcp6 - superuser
300598 1 ticots,ticotsord,ticlts,tcp,udp,tcp6,udp6 - superuser
390113 1 tcp - unknown
805306368 1 ticots,ticotsord,ticlts,tcp,udp,tcp6,udp6 - superuser
1289637086 1,5 tcp - 26069
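A listing such as the previous one can be checked for the services that NFS depends on. The following sketch assumes the five-column layout shown above, with the service name in the fourth column. The function name missing_rpc_services is hypothetical, and the set of services to check is your choice.

```shell
# Sketch: read rpcinfo -s output on standard input and print each service
# named on the command line that is not registered.
# missing_rpc_services is a hypothetical name for this example.
missing_rpc_services() {
    listing=$(cat)                      # keep the listing for repeated scans
    for svc in "$@"; do
        # awk exits 0 only if some line has the service in column 4.
        printf '%s\n' "$listing" |
            awk -v s="$svc" '$4 == s { found = 1 } END { exit !found }' ||
            printf '%s\n' "$svc"
    done
}
```

For example, rpcinfo -s bee | missing_rpc_services nfs mountd nlockmgr status prints nothing when all four services are registered.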
The following two examples show how to gather information about a particular RPC
service by selecting a particular transport on a server. The first example checks
the mountd service that is running over TCP. The second example checks the
NFS service that is running over UDP.
% rpcinfo -t bee mountd
program 100005 version 1 ready and waiting
program 100005 version 2 ready and waiting
program 100005 version 3 ready and waiting
% rpcinfo -u bee nfs
program 100003 version 2 ready and waiting
program 100003 version 3 ready and waiting
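The probe output can also be tested for a specific version, for instance before you mount with a particular NFS version. The following sketch assumes the "ready and waiting" line format shown above. The function name rpc_version_ready is hypothetical.

```shell
# Sketch: succeed (exit status 0) if the probe output on standard input
# reports the given version as ready. rpc_version_ready is a hypothetical name.
# Usage: rpcinfo -u bee nfs | rpc_version_ready 3
rpc_version_ready() {
    awk -v v="$1" '
        $3 == "version" && $4 == v && /ready and waiting/ { ok = 1 }
        END { exit !ok }
    '
}
```

The exit status can be used directly in an if statement or with the && and || operators.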
snoop Command
This command is often used to watch for packets on the network.
The snoop command must be run as root. The use of this command
is a good way to ensure that the network hardware is functioning on
both the client and the server. Many options are available. See the
snoop(1M) man page. A shortened synopsis of the command follows:
snoop [ -d device ] [ -o filename ] [ host hostname ]
- -d device
Specifies the local network interface
- -o filename
Stores all the captured packets into the named file
- hostname
Displays packets going to and from a specific host only
The -d device option is useful on those servers that have multiple network
interfaces. The host expression is only one of many filter expressions that
you can use. Combining filter expressions with grep can often generate data that is
specific enough to be useful.
When troubleshooting, make sure that packets are going to and from the proper
host. Also, look for error messages. Saving the packets to a file
can simplify the review of the data.
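Saving packets and reviewing them later might look like the following sketch. The function names capture_host and review_capture and the SNOOP override variable are hypothetical. The sketch also assumes the -i filename option for reading a saved capture, which is not covered in this section. Check the snoop(1M) man page on your system before relying on it.

```shell
# Sketch: wrap two snoop invocations. capture_host, review_capture, and the
# SNOOP override variable are hypothetical names used only in this example.
capture_host() {
    # $1 = network interface, $2 = capture file, $3 = host to watch
    # Runs until interrupted; packets are written to the capture file.
    ${SNOOP:-snoop} -d "$1" -o "$2" host "$3"
}

review_capture() {
    # $1 = capture file, $2 = text to search for in the decoded packets
    # Assumes that -i reads a file that was written with -o.
    ${SNOOP:-snoop} -i "$1" | grep -i -- "$2"
}
```

For example, as root you might run capture_host with your interface name, a file name, and the server name bee, interrupt the capture after reproducing the problem, and then search the file with review_capture.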
truss Command
You can use this command to check if a process is hung.
The truss command must be run by the owner of the process or
by root. You can use many options with this command. See the truss(1)
man page. A shortened syntax of the command follows.
truss [ -t syscall ] -p pid
- -t syscall
Selects system calls to trace
- -p pid
Indicates the PID of the process to be traced
The syscall can be a comma-separated list of system calls to be traced.
Also, starting syscall with an exclamation point (!) excludes the listed system calls
from the trace.
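The two forms of the -t option can be illustrated with a small wrapper. This is a sketch only. The function name trace_calls and the TRUSS override variable are hypothetical, and the PID 243 in the usage note is taken from the examples in this section.

```shell
# Sketch: trace a process with a system call filter.
# trace_calls and the TRUSS override variable are hypothetical names.
trace_calls() {
    # $1 = PID of the process to trace
    # $2 = comma-separated system call list; a leading ! excludes the list
    ${TRUSS:-truss} -t "$2" -p "$1"
}
```

For example, trace_calls 243 poll,read traces only poll and read, and trace_calls 243 '!poll' traces everything except poll. The single quotes keep the shell from interpreting the exclamation point.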
This example shows that the process is waiting for another connection request from
a new client.
# /usr/bin/truss -p 243
poll(0x00024D50, 2, -1) (sleeping...)
The previous example shows a normal response. If the response does not change
after a new connection request has been made, the process could be hung.
Follow the instructions in How to Restart NFS Services to fix the hung program. Review the
instructions in NFS Troubleshooting Procedures to fully verify that your problem is a hung program.
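To compare the state of the process before and after a request, the last line of the trace can be reduced to the name of the system call in which the process is sleeping. The following sketch assumes the "(sleeping...)" line format shown in the previous example. The function name sleeping_call is hypothetical.

```shell
# Sketch: read truss output on standard input. If the last line shows a
# sleeping system call, print the name of that call; otherwise print nothing.
# sleeping_call is a hypothetical name for this example.
sleeping_call() {
    awk '
        { last = $0 }                          # remember the most recent line
        END {
            if (last ~ /\(sleeping\.\.\.\)/) {
                sub(/\(.*/, "", last)          # drop the argument list onward
                print last
            }
        }
    '
}
```

If the function prints the same system call name before and after a new connection request, that result is consistent with the hung-process symptom described above.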