CFSMount resource going offline unexpectedly following LUN addition and vxresize


Article ID: 100023689


Description

Error Message

WARNING: msgcnt 24 mesg 061: V-2-61: vx_resize - /dev/vx/dsk/tstdg/datavol1 file system remount failed

NOTICE: msgcnt 23 IOERROR - Type II (primary disable) detected.

WARNING: msgcnt 21 mesg 031: V-2-31: vx_disable - /dev/vx/dsk/tstdg/datavol1 file system disabled

..

NOTICE: msgcnt 32 IOERROR - type IV (all disable) detected

 

Cause

With CVM Volume Asymmetry enabled and IO shipping for application IO disabled, new LUNs can be added to the shared DG even when they are not presented to all CVM nodes. Application IO to such missing disks then fails on the nodes that cannot see them, which can lead to the Cluster FS being disabled unexpectedly.

 

Resolution

Enable application IO shipping for the shared DG by setting the tunable "ioship" to ON.

# vxdg -g testdg set ioship=on

Note: IO shipping uses network bandwidth. Application IO shipping can consume a significant amount of that bandwidth, which can cause observable performance degradation, so customers need to evaluate this before deciding to use application IO shipping.
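
As a minimal sketch of how the current setting might be checked before changing it, the helper below reads `vxdg list <dg>` output on standard input and prints the ioship value. It assumes the "ioship: <value>" line format shown in this article; the function name `ioship_state` is hypothetical and not part of any Veritas tool.

```shell
#!/bin/sh
# Hypothetical helper: print the ioship value found in `vxdg list <dg>`
# output supplied on stdin. Assumes a line of the form "ioship: off".
# Exits non-zero if no ioship line is present.
ioship_state() {
    awk '$1 == "ioship:" { print $2; found = 1; exit }
         END { if (!found) exit 1 }'
}

# Possible use on a CVM node (not executed here):
#   state=$(vxdg list testdg | ioship_state)
#   [ "$state" = "on" ] || vxdg -g testdg set ioship=on
```

Reading from standard input keeps the helper independent of the cluster, so it can be tried against saved command output first.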

Workaround

Setting the storage_connectivity tunable to the pre-SFCFS 6.1 value of resilient could prevent the disk addition in CVM if any secondary node cannot see the shared disks.

Instead, it is recommended to check all the CVM nodes for LUN connectivity before performing any changes to the DG configuration when the storage_connectivity tunable is set to asymmetric.
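
One way to carry out this pre-check is to compare the list of disks about to be added with the list of disks each node can see. The sketch below is illustrative only: the function name `disks_missing_on_node` and the file layout (one device name per line) are assumptions. How the per-node lists are collected is left open; for example, they could be taken from the first column of `vxdisk list` on each node, assuming device names are consistent across nodes.

```shell
#!/bin/sh
# Hypothetical helper: print every disk named in the "expected" file that
# does not appear in the "node" file. Both files hold one device name per line.
#   $1 = file listing the disks that are to be added to the shared DG
#   $2 = file listing the disks visible on one CVM node
disks_missing_on_node() {
    # FILENAME == ARGV[1] (rather than NR == FNR) keeps the comparison
    # correct even when the node file is empty.
    awk 'FILENAME == ARGV[1] { seen[$1] = 1; next }
         !($1 in seen)       { print $1 }' "$2" "$1"
}

# Possible use (file names are placeholders):
#   missing=$(disks_missing_on_node new_luns.txt node2_disks.txt)
#   [ -z "$missing" ] || echo "Not visible on node2: $missing"
```

Any non-empty result for any node indicates that the DG change should wait until connectivity is fixed.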

 

Issue/Introduction

From 6.0.1, Storage Foundation Cluster File System (SFCFS) supports Cluster Volume Manager (CVM) Volume Asymmetry, whereby disks can be added to the CVM shared Disk Group (DG) even though those disks are missing on some CVM nodes. The missing disks in the DG on the secondary node will be converted into Lmissing disks. Subsequent application IO operations on such missing disks will fail if "ioship" is disabled, and as a result the Cluster File System (CFS) may get disabled.

Background

In a CVM cluster, a transient disk error during DG configuration updates can result in the DG being disabled. Likewise, depending on the detach policy, the volume or plex can be detached locally or globally due to an application IO error.

With the introduction of CVM Volume Asymmetry, disk group import or node join operations succeed even if some node has storage connectivity issues. This allows applications to use the other volumes or disk groups that are still accessible. It also enables the addition of disks to a shared DG even when some nodes cannot access the newly added disks.

# vxtune storage_connectivity
Tunable                         Current Value Default Value Reboot Clusterwide
------------------------------- ------------- ------------- ------ -----------
storage_connectivity            asymmetric    asymmetric    N      Y

The application IO shipping feature improves resiliency across Cluster File System nodes and reduces downtime. I/O shipping is the ability to ship I/O from one cluster node to another when a problem occurs in the I/O path. Once enabled, it works transparently for the user and reacts to failures automatically.

This feature can be enabled for any shared DG by setting the tunable "ioship" to ON. IO shipping uses network bandwidth. Application IO shipping can consume a significant amount of that bandwidth, which can cause observable performance degradation, so customers need to evaluate this before deciding to use application IO shipping.

# vxdg list testdg
Group: testdg
ioship: off     >>> Application IO shipping disabled by default.
...
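
The combination described in this article (storage_connectivity set to asymmetric while ioship is off) can be tested for with a short script. The following is a sketch under stated assumptions: it relies on the `vxtune storage_connectivity` output layout shown above, and the function names `connectivity_policy` and `exposed_to_cfs_disable` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the current storage_connectivity value from
# `vxtune storage_connectivity` output supplied on stdin. Assumes the value
# is the second field of the row that starts with the tunable name.
connectivity_policy() {
    awk '$1 == "storage_connectivity" { print $2; found = 1; exit }
         END { if (!found) exit 1 }'
}

# Hypothetical helper: succeed (exit 0) when the configuration matches the
# one described in this article: asymmetric connectivity with ioship off.
#   $1 = storage_connectivity value, $2 = ioship value
exposed_to_cfs_disable() {
    [ "$1" = "asymmetric" ] && [ "$2" = "off" ]
}

# Possible use on a CVM node (not executed here; ioship value taken from
# the `vxdg list testdg` output):
#   policy=$(vxtune storage_connectivity | connectivity_policy)
#   exposed_to_cfs_disable "$policy" "off" && echo "Check LUN visibility on all nodes first"
```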