DIAGNOSTIC STEPS:
To enable debug logging for the VCS NFS agent (messages are written to /var/VRTSvcs/log/NFS_A.log):
# haconf -makerw
# hatype -modify NFS LogDbg DBG_4 DBG_AGDEBUG DBG_AGINFO
# haconf -dump -makero
To disable debug logging for the VCS NFS agent:
# haconf -makerw
# hatype -modify NFS LogDbg -delete -keys
# haconf -dump -makero
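
The enable and disable sequences above can be wrapped in one helper so that the configuration is always returned to read-only. This is a minimal sketch, not a supported tool: the function name nfs_agent_debug is hypothetical, and the final hatype -display call is only there to show the resulting LogDbg value.

```shell
# Hypothetical helper: turn NFS agent debug logging on or off.
nfs_agent_debug() {
    # Pick the LogDbg arguments used in the steps above.
    case "$1" in
        on)  set -- DBG_4 DBG_AGDEBUG DBG_AGINFO ;;
        off) set -- -delete -keys ;;
        *)   echo "usage: nfs_agent_debug on|off" >&2; return 2 ;;
    esac

    haconf -makerw || return 1
    hatype -modify NFS LogDbg "$@"
    rc=$?
    # Save and close the configuration even if the modify step failed.
    haconf -dump -makero || return 1
    [ "$rc" -eq 0 ] || return "$rc"

    # Show the value that is now in effect.
    hatype -display NFS -attribute LogDbg
}
```

Run it as root on a cluster node, for example: nfs_agent_debug on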
What is status 5?
From /usr/include/rpc/clnt.h:
RPC_TIMEDOUT=5,    /* call timed out */
status=5 means the RPC call timed out.
From the VCS perspective:
- We open a handle to the local host.
- The call is NULLPROC, the null procedure.
- We check the return status of the RPC call to the null procedure.
- If the RPC return status is not successful and the NFS agent debug level is 4, we print the following debug message to NFS_A.log:
VCSAG_LOGDBG_MSG(VCS_DBG4, VCS_DEFAULT_FLAGS, "RPC call failed with status=%d for program=%d, protocol=%s,version=%d", status, program, protocol, version);
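
Once debug logging is enabled, the number of timed-out probes can be counted by searching the agent log for this message. The sketch below assumes the message appears in the log exactly as the format string above renders it; the function name count_rpc_timeouts is hypothetical.

```shell
# Hypothetical helper: count status=5 (RPC_TIMEDOUT) messages in the NFS agent log.
count_rpc_timeouts() {
    log_file="${1:-/var/VRTSvcs/log/NFS_A.log}"
    # The trailing " for" keeps status=5 from also matching other codes such as status=50.
    # grep -c prints the count; it exits non-zero when the count is 0.
    grep -c 'RPC call failed with status=5 for' "$log_file"
}
```

Example: count_rpc_timeouts /var/VRTSvcs/log/NFS_A.log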
The null procedure does no processing; it exists for diagnostic purposes.
The time-out of the RPC call to NULLPROC suggests that the RPC server is occasionally too busy to service the request, which results in seemingly random resource faults.
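
A similar null-procedure call can be made by hand with rpcinfo, which calls procedure 0 of the named program. The sketch below repeats that call and counts failures. It illustrates the idea and does not reproduce the agent's code: the function name probe_null_proc, the default program (nfs), version (3), UDP transport, attempt count, and interval are all assumptions, so adjust them to match the program, protocol, and version shown in the debug message.

```shell
# Hypothetical helper: call NULLPROC on the local host repeatedly and count failures.
probe_null_proc() {
    program="${1:-nfs}"     # RPC program name or number (assumed default)
    version="${2:-3}"       # RPC program version (assumed default)
    attempts="${3:-10}"     # number of calls to make
    interval="${4:-1}"      # seconds to wait between calls
    failures=0
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # rpcinfo -u calls procedure 0 of the program over UDP.
        if ! rpcinfo -u localhost "$program" "$version" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    echo "failures=$failures attempts=$attempts"
}
```

Example: probe_null_proc nfs 3 20 5
Use rpcinfo -t in place of rpcinfo -u to make the call over TCP.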
WORKAROUND:
Increase the ToleranceLimit for the NFS resource type to 2.
This allows the monitor entrypoint to return OFFLINE two times before the resource is declared FAULTED.
NOTE: Increasing the ToleranceLimit is a workaround for VCS.
# hatype -modify NFS ToleranceLimit 2
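
The command above can be combined with the same read-write and save steps used for debug logging, followed by a check of the new value. This is a minimal sketch that assumes the configuration must be writable for the change, as in the LogDbg steps; the function name set_nfs_tolerance_limit is hypothetical.

```shell
# Hypothetical helper: set ToleranceLimit for the NFS resource type and show the result.
set_nfs_tolerance_limit() {
    limit="${1:-2}"
    haconf -makerw || return 1
    hatype -modify NFS ToleranceLimit "$limit"
    rc=$?
    # Save and close the configuration even if the modify step failed.
    haconf -dump -makero || return 1
    [ "$rc" -eq 0 ] || return "$rc"
    # Print the value now in effect.
    hatype -value NFS ToleranceLimit
}
```

Example: set_nfs_tolerance_limit 2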