NFS Troubleshooting Procedures
To determine where the NFS service has failed, you need to follow
several procedures to isolate the failure. Check for the following items:
Can the client reach the server?
Can the client contact the NFS services on the server?
Are the NFS services running on the server?
In the process of checking these items, you might notice that other portions
of the network are not functioning. For example, the name service or the
physical network hardware might not be functioning. The System Administration Guide: Naming and Directory Services (DNS, NIS, and LDAP) contains debugging procedures for
several name services. Also, during the process you might see that the problem
is not at the client end. For example, you might get at least one trouble call
from every subnet in your work area. In this situation, assume that the problem
is the server or the network hardware near the server, and start the debugging
process at the server, not at the client.
How to Check Connectivity on an NFS Client
- Check that the NFS server is reachable from the client. On the client,
type the following command.
% /usr/sbin/ping bee
bee is alive
If the command reports that the server is alive, remotely check the NFS
server. See How to Check the NFS Server Remotely.
- If the server is not reachable from the client, ensure that the local
name service is running.
For NIS+ clients, type the following:
% /usr/lib/nis/nisping -u
Last updates for directory eng.acme.com. :
Master server is eng-master.acme.com.
Last update occurred at Mon Jun 5 11:16:10 1995
Replica server is eng1-replica-58.acme.com.
Last Update seen was Mon Jun 5 11:16:10 1995
- If the name service is running, ensure that the client has received the
correct host information by typing the following:
% /usr/bin/getent hosts bee
129.144.83.117 bee.eng.acme.com
- If the host information is correct, but the server is not reachable from
the client, run the ping command from another client.
If the command run from a second client fails, see How to Verify the NFS Service on the Server.
- If the server is reachable from the second client, use ping to check
connectivity of the first client to other systems on the local net.
If this command fails, check the networking software configuration on the client, for
example, /etc/netmasks and /etc/nsswitch.conf.
- (Optional) Check the output of the rpcinfo command.
If the rpcinfo command does not display program 100003 version 4 ready and waiting, then NFS version 4 is not
enabled on the server. See Table 5-3 for information about enabling NFS version 4.
- If the software is correct, check the networking hardware.
Try to move the client onto a second net drop.
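The first decisions in this procedure can be sketched as a small shell function. This is only an illustration under stated assumptions: the function name, messages, and return codes are hypothetical, and it relies on ping exiting with a nonzero status when the host does not answer, as the Solaris ping does when given only a host name. On systems where ping runs until interrupted, a count option would be needed.

```shell
# Hypothetical helper: triage connectivity from an NFS client.
# "ping" and "getent" are looked up through PATH.
check_nfs_connectivity() {
    server=$1
    if ping "$server" >/dev/null 2>&1; then
        echo "reachable: check the NFS server remotely"
        return 0
    fi
    # The server did not answer. See whether the name service resolves it.
    if getent hosts "$server" >/dev/null 2>&1; then
        echo "unreachable: run ping from a second client"
        return 2
    fi
    echo "unresolved: check the name service"
    return 1
}
```

For example, `check_nfs_connectivity bee` prints which branch of the procedure to follow next. The sketch does not replace the later steps, such as checking /etc/netmasks or the network hardware.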
How to Check the NFS Server Remotely
Note that support for both the UDP and the MOUNT protocols is
not necessary if you are using an NFS version 4 server.
- Check that the NFS services have started on the NFS server by typing
the following command:
% rpcinfo -s bee|egrep 'nfs|mountd'
100003 3,2 tcp,udp,tcp6,udp6 nfs superuser
100005 3,2,1 ticots,ticotsord,tcp,tcp6,ticlts,udp,udp6 mountd superuser
If the daemons have not been started, see How to Restart NFS Services.
- Check that the server's nfsd processes are responding.
On the client, type the following command to test the UDP NFS connections
from the server.
% /usr/bin/rpcinfo -u bee nfs
program 100003 version 2 ready and waiting
program 100003 version 3 ready and waiting
Note - NFS version 4 does not support UDP.
If the server is running, it prints a list of program and
version numbers. Using the -t option tests the TCP connection. If this command fails,
proceed to How to Verify the NFS Service on the Server.
- Check that the server's mountd is responding by typing the following command.
% /usr/bin/rpcinfo -u bee mountd
program 100005 version 1 ready and waiting
program 100005 version 2 ready and waiting
program 100005 version 3 ready and waiting
If the server is running, it prints a list of program and
version numbers that are associated with the UDP protocol. Using the -t option tests
the TCP connection. If either attempt fails, proceed to How to Verify the NFS Service on the Server.
- Check the local autofs service if it is being used:
% cd /net/wasp
Choose a /net or /home mount point that you know should work properly.
If this command fails, then as root on the client, type the
following to restart the autofs service:
# svcadm restart system/filesystem/autofs
- Verify that the file system is shared as expected on the server.
% /usr/sbin/showmount -e bee
/usr/src eng
/export/share/man (everyone)
Check the entry on the server and the local mount entry for
errors. Also, check the namespace. In this instance, if the first client is
not in the eng netgroup, that client cannot mount the /usr/src file system.
Check all entries that include mounting information in all the local files. The
list includes /etc/vfstab and all the /etc/auto_* files.
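The "ready and waiting" lines that rpcinfo prints in the steps above can also be checked mechanically. The following is a minimal sketch, not part of the documented procedure: the function name is hypothetical, and it assumes an awk that accepts the -v option (nawk on older Solaris releases).

```shell
# Hypothetical helper: read "rpcinfo -u" or "rpcinfo -t" output on
# standard input and succeed only if the given program number reports
# the given version as "ready and waiting".
rpc_version_ready() {
    prog=$1
    vers=$2
    awk -v p="$prog" -v v="$vers" '
        $1 == "program" && $2 == p && $3 == "version" && $4 == v &&
        /ready and waiting/ { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}
```

A possible use is `rpcinfo -t bee nfs | rpc_version_ready 100003 4`, which tests for NFS version 4 over TCP, because NFS version 4 does not support UDP.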
How to Verify the NFS Service on the Server
- Become superuser or assume an equivalent role.
Roles contain authorizations and privileged commands. For more information about roles, see Configuring RBAC (Task Map) in System Administration Guide: Security Services.
To configure a role with the Primary Administrator profile, see Chapter 2, Working With the Solaris Management Console (Tasks), in System Administration Guide: Basic Administration.
- Check that the server can reach the clients.
# ping lilac
lilac is alive
- If the client is not reachable from the server, ensure that the local
name service is running. For NIS+ clients, type the following:
% /usr/lib/nis/nisping -u
Last updates for directory eng.acme.com. :
Master server is eng-master.acme.com.
Last update occurred at Mon Jun 5 11:16:10 1995
Replica server is eng1-replica-58.acme.com.
Last Update seen was Mon Jun 5 11:16:10 1995
- If the name service is running, check the networking software configuration on the
server, for example, /etc/netmasks and /etc/nsswitch.conf.
- Type the following command to check whether the rpcbind daemon is running.
# /usr/bin/rpcinfo -u localhost rpcbind
program 100000 version 1 ready and waiting
program 100000 version 2 ready and waiting
program 100000 version 3 ready and waiting
If the server is running, it prints a list of program and
version numbers that are associated with the UDP protocol. If rpcbind seems to be
hung, either reboot the server or follow the steps in How to Warm-Start rpcbind.
- Type the following command to check whether the nfsd daemon is running.
# rpcinfo -u localhost nfs
program 100003 version 2 ready and waiting
program 100003 version 3 ready and waiting
# ps -ef | grep nfsd
root 232 1 0 Apr 07 ? 0:01 /usr/lib/nfs/nfsd -a 16
root 3127 2462 1 09:32:57 pts/3 0:00 grep nfsd
Note - NFS version 4 does not support UDP.
If the server is running, it prints a list of program and
version numbers that are associated with the UDP protocol. Also use the -t
option with rpcinfo to check the TCP connection. If these commands fail, restart the
NFS service. See How to Restart NFS Services.
- Type the following command to check whether the mountd daemon is running.
# /usr/bin/rpcinfo -u localhost mountd
program 100005 version 1 ready and waiting
program 100005 version 2 ready and waiting
program 100005 version 3 ready and waiting
# ps -ef | grep mountd
root 145 1 0 Apr 07 ? 21:57 /usr/lib/autofs/automountd
root 234 1 0 Apr 07 ? 0:04 /usr/lib/nfs/mountd
root 3084 2462 1 09:30:20 pts/3 0:00 grep mountd
If the server is running, it prints a list of program and
version numbers that are associated with the UDP protocol. Also use the -t
option with rpcinfo to check the TCP connection. If these commands fail, restart the
NFS service. See How to Restart NFS Services.
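In the ps output above, grep mountd also matches the grep process itself and automountd, so the listing has to be read carefully. A minimal sketch of a stricter check follows. The function name is hypothetical, the daemon paths are the ones shown in the sample output, and an awk that accepts -v is assumed.

```shell
# Hypothetical helper: read "ps -ef" output on standard input and succeed
# only if some field is exactly the given daemon path. Whole-field
# matching skips "grep mountd" and /usr/lib/autofs/automountd.
daemon_in_ps() {
    daemon_path=$1
    awk -v want="$daemon_path" '
        { for (i = 1; i <= NF; i++) if ($i == want) found = 1 }
        END { exit (found ? 0 : 1) }
    '
}
```

For example, `ps -ef | daemon_in_ps /usr/lib/nfs/mountd` succeeds only when mountd itself is listed. This confirms only that the process exists. The rpcinfo checks are still needed to confirm that the daemon is responding.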
How to Restart NFS Services
- Become superuser or assume an equivalent role.
Roles contain authorizations and privileged commands. For more information about roles, see Configuring RBAC (Task Map) in System Administration Guide: Security Services.
To configure a role with the Primary Administrator profile, see Chapter 2, Working With the Solaris Management Console (Tasks), in System Administration Guide: Basic Administration.
- Restart the NFS service on the server.
Type the following command.
# svcadm restart network/nfs/server
How to Warm-Start rpcbind
If the NFS server cannot be rebooted because of work in progress, you
can restart rpcbind without having to restart all of the services that use
RPC. Just complete a warm start by following these steps.
- Become superuser or assume an equivalent role.
Roles contain authorizations and privileged commands. For more information about roles, see Configuring RBAC (Task Map) in System Administration Guide: Security Services.
To configure a role with the Primary Administrator profile, see Chapter 2, Working With the Solaris Management Console (Tasks), in System Administration Guide: Basic Administration.
- Determine the PID for rpcbind.
Run ps to get the PID, which is the value in the second
column.
# ps -ef |grep rpcbind
root 115 1 0 May 31 ? 0:14 /usr/sbin/rpcbind
root 13000 6944 0 11:11:15 pts/3 0:00 grep rpcbind
- Send a SIGTERM signal to the rpcbind process.
In this example, term is the signal that is to be sent and
115 is the PID for the program (see the kill(1) man page).
This command causes rpcbind to create a list of the currently registered
services in /tmp/portmap.file and /tmp/rpcbind.file.
# kill -s term 115
Note - If you do not kill the rpcbind process with the -s term option,
you cannot complete a warm start of rpcbind. You must reboot the server
to restore service.
- Restart rpcbind.
Warm-start rpcbind so that the files that were created by the
kill command are consulted. A warm start also ensures that the process resumes
without requiring a restart of all the RPC services. See the rpcbind(1M)
man page.
# /usr/sbin/rpcbind -w
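The warm-start steps depend on picking the correct PID from the ps listing. The following dry-run sketch only prints the two commands from this procedure for review and does not send any signal. The function name is hypothetical, and the /usr/sbin/rpcbind path is taken from the sample output.

```shell
# Hypothetical helper: read "ps -ef" output on standard input, find the
# PID (second column) of /usr/sbin/rpcbind, and print the warm-start
# commands without running them.
warm_start_commands() {
    pid=$(awk '{
        for (i = 3; i <= NF; i++)
            if ($i == "/usr/sbin/rpcbind") { print $2; exit }
    }')
    if [ -z "$pid" ]; then
        echo "rpcbind process not found" >&2
        return 1
    fi
    echo "kill -s term $pid"
    echo "/usr/sbin/rpcbind -w"
}
```

Running `ps -ef | warm_start_commands` as superuser lets you review the commands before you type them. Printing rather than executing is a deliberate choice here, because sending the wrong signal to rpcbind means that the server must be rebooted.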
Identifying Which Host Is Providing NFS File Service
Run the nfsstat command with the -m option to gather current NFS information.
The name of the current server is printed after “currserver=”.
% nfsstat -m
/usr/local from bee,wasp:/export/share/local
Flags: vers=3,proto=tcp,sec=sys,hard,intr,llock,link,symlink,
acl,rsize=32768,wsize=32768,retrans=5
Failover: noresponse=0, failover=0, remap=0, currserver=bee
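When many file systems are mounted, the currserver value can be extracted rather than read by eye. This is a minimal sketch that assumes the output layout shown above. The function name is hypothetical.

```shell
# Hypothetical helper: read "nfsstat -m" output on standard input and
# print the value that follows "currserver=" on each Failover line.
current_nfs_server() {
    sed -n 's/.*currserver=\([^ ,]*\).*/\1/p'
}
```

For example, `nfsstat -m | current_nfs_server` prints one server name for each mount that reports failover information.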
How to Verify Options Used With the mount Command
In the Solaris 2.6 release and in any versions of the mount
command that were patched after the 2.6 release, no warning is issued for
invalid options. The following procedure helps determine whether the options that were supplied either
on the command line or through /etc/vfstab were valid. For this example, assume that the following command has been run:
# mount -F nfs -o ro,vers=2 bee:/export/share/local /mnt
- Verify the options by running the following command.
% nfsstat -m
/mnt from bee:/export/share/local
Flags: vers=2,proto=tcp,sec=sys,hard,intr,dynamic,acl,rsize=8192,wsize=8192,
retrans=5
The file system from bee has been mounted with the protocol version set
to 2. Unfortunately, the nfsstat command does not display information about all
of the options. However, using the nfsstat command is the most accurate way to
verify the options.
- Check the entry in /etc/mnttab.
The mount command does not allow invalid options to be added to the
mount table. Therefore, verify that the options that are listed in the file
match those options that are listed on the command line. In this way,
you can check those options that are not reported by the nfsstat
command.
# grep bee /etc/mnttab
bee:/export/share/local /mnt nfs ro,vers=2,dev=2b0005e 859934818
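The comparison against /etc/mnttab can be sketched as a check for a single option on a given mount point. The function name is hypothetical, and the sketch assumes the field order shown above (resource, mount point, file system type, options, time) and an awk that accepts -v.

```shell
# Hypothetical helper: read mnttab-format lines on standard input and
# succeed only if the entry for the given mount point lists the option.
mnttab_has_option() {
    mnt=$1
    opt=$2
    awk -v mnt="$mnt" -v opt="$opt" '
        $2 == mnt {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) if (opts[i] == opt) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

For example, `mnttab_has_option /mnt vers=2 < /etc/mnttab` succeeds for the entry shown above, while a test for vers=3 fails.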