Umount stops responding, or "hangs," because of a hold on a VxFS vnode taken by an NFSv4 thread

Article ID: 100033642

Description

Error Message


In the engine_A.log file:
 
2017/01/31 09:43:26 VCS WARNING V-16-2-13011 (sysA) Resource(nfs_1-fs): offline procedure did not complete within the expected time.
2017/01/31 09:43:26 VCS ERROR V-16-2-13063 (sysA) Agent is calling clean for resource(nfs_1-fs) because offline did not complete within the expected time.
2017/01/31 09:44:27 VCS ERROR V-16-2-13006 (sysA) Resource(nfs_1-fs): clean procedure did not complete within the expected time.


In the Mount_A.log file:

2017/01/31 09:38:26 VCS NOTICE V-16-10031-5512 Mount:nfs_1-fs:offline:Trying force umount with signal 15...
2017/01/31 09:38:26 VCS NOTICE V-16-10031-5512 Mount:nfs_1-fs:offline:Trying force umount with signal 9...
2017/01/31 09:43:25 VCS WARNING V-16-2-13139 Thread(4109335408) Canceling thread (4110416752)
2017/01/31 09:43:26 VCS INFO V-16-2-13845 Thread(4110416752) Resource(nfs_1-fs): Output of the timed out operation (offline)
==============================================
UX:vxfs umount.vxfs: ERROR: V-3-26299: cannot umount /exports/nfs1: Device or resource busy
UX:vxfs umount.vxfs: ERROR: V-3-26299: cannot umount /exports/nfs1: Device or resource busy


The messages file shows "nfsd" threads blocked for more than 120 seconds, all with the same stack:
 
Jan 31 09:42:17 sysA kernel: INFO: task nfsd:30141 blocked for more than 120 seconds.
Jan 31 09:42:17 sysA kernel: INFO: task nfsd:30145 blocked for more than 120 seconds.
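
To check quickly whether a system shows this symptom, the hung-task lines above can be filtered out of the messages file. The following is a minimal sketch, not a supported tool: the function name find_blocked_tasks is hypothetical, and the pattern assumes the kernel messages have exactly the format shown in the excerpt above.

```python
import re

# Matches kernel hung-task reports in the format shown in the excerpt above, e.g.
# "Jan 31 09:42:17 sysA kernel: INFO: task nfsd:30141 blocked for more than 120 seconds."
BLOCKED_TASK = re.compile(
    r"kernel: INFO: task (?P<name>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_blocked_tasks(lines, task_name="nfsd"):
    """Return (pid, seconds) pairs for hung-task reports that match task_name.

    Hypothetical helper for illustration; `lines` is any iterable of log lines.
    """
    hits = []
    for line in lines:
        match = BLOCKED_TASK.search(line)
        if match and match.group("name") == task_name:
            hits.append((int(match.group("pid")), int(match.group("secs"))))
    return hits


# The two lines from the excerpt above.
sample = [
    "Jan 31 09:42:17 sysA kernel: INFO: task nfsd:30141 blocked for more than 120 seconds.",
    "Jan 31 09:42:17 sysA kernel: INFO: task nfsd:30145 blocked for more than 120 seconds.",
]
print(find_blocked_tasks(sample))
```

To scan a real log, pass an open file object instead of the sample list, for example the system messages file (commonly /var/log/messages on RHEL, which is an assumption about the local syslog configuration).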

 

Cause

Sometimes, when an application is running over NFSv4, the NFSv4 thread increments the vnode hold count without performing any active operation, such as a read or a write. When the file system is unmounted, umount waits until the vnode hold count reaches zero. That never happens, because the hold was taken by the NFSv4 thread and is not handled in the VxFS code.

This hang can happen on both local and CFS mounts.

NFSv4 over VxFS file systems is not supported in releases prior to InfoScale 7.3.

Resolution

Workaround:

If NFSv4 is not specifically required, use NFSv3 instead.
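
Before switching to NFSv3, it can help to see which client mounts currently use NFSv4. The sketch below is illustrative only: the function name nfsv4_mounts is hypothetical, the sample paths /exports/nfs2, /mnt/nfs1 and /mnt/nfs2 are made up, and it assumes the Linux client reports NFSv4 mounts in /proc/mounts either with the file system type nfs4 or with a vers=4 (or nfsvers=4) option.

```python
def nfsv4_mounts(mounts_text):
    """Return (source, mount_point) pairs for NFSv4 mounts.

    Hypothetical helper; `mounts_text` is text in /proc/mounts format:
    source, mount point, fs type, comma-separated options, two numbers.
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        source, mount_point, fstype, options = fields[:4]
        opts = options.split(",")
        # Assumption: NFSv4 shows up as type "nfs4", or as type "nfs"
        # with a version option that starts with 4 (4, 4.0, 4.1, ...).
        is_v4 = fstype == "nfs4" or (
            fstype == "nfs"
            and any(o.startswith(("vers=4", "nfsvers=4")) for o in opts)
        )
        if is_v4:
            found.append((source, mount_point))
    return found


# Made-up sample in /proc/mounts format (not taken from a real system).
sample = """\
sysA:/exports/nfs1 /mnt/nfs1 nfs4 rw,relatime,vers=4.0,proto=tcp 0 0
sysA:/exports/nfs2 /mnt/nfs2 nfs rw,relatime,vers=3,proto=tcp 0 0
/dev/sda1 / ext4 rw,relatime 0 0
"""
print(nfsv4_mounts(sample))
```

On a client, the same function could be given the contents of /proc/mounts. Any mount it reports would then need to be remounted with an NFSv3 version option (for example vers=3 in the mount options) for the workaround to take effect.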


Solution:

1. The latest version of the Linux kernel no longer triggers this issue through NFSv4 exports, because parts of the NFS server were redesigned. That change has not been backported to RHEL6. It is, however, available in the latest version of RHEL7 (RHEL 7.3 at the time of writing).

2. Upgrade to InfoScale 7.3, which adds support for NFSv4. With this support, Veritas File System (VxFS) allows user access through NFSv4. Except for tmplog, VxFS complies with the persistence requirements of the NFSv4 standard.
 

Issue/Introduction

The umount command stops responding, or "hangs," because of a hold on a VxFS (Veritas File System) vnode that is taken by an NFSv4 (Network File System version 4) thread.