Mismatch in vxfen configuration after a storage issue.

Description

Error Message

vxfen itself reports no errors. However, if a CoordPoint resource is configured in the cluster, the agent may log an error such as the following:

server101 AgentFramework[42399]: VCS CRITICAL V-16-10061-638 CoordPoint:coordpoint:monitor:Administrator intervention is required because registration key VF01F401 is missing from the coordinator disk /dev/vx/rdmp/sdj.If split brain happens in this condition, all nodes in the cluster may panic. Ensure that the local node can access the coordination disks and the keys are registered. Refer to the vxfenadm (1m) man page for more information.

The coordpoint resource is marked as faulted

# hares -state coordpoint
#Resource    Attribute    System       Value
coordpoint   State        server101    FAULTED   <<

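To see which keys and coordinator disks the CoordPoint agent has complained about, the message above can be filtered out of the system log. The following is a minimal sketch; the function name is hypothetical, and it assumes the message text has the same wording as the sample shown above.

```shell
#!/bin/sh
# Hypothetical helper: reads log lines on stdin and prints
# "<registration key> <coordinator disk>" for each V-16-10061-638 message.
# Assumes the message wording matches the sample in this article.
missing_coord_keys() {
    sed -n -E '/V-16-10061-638/ s/.*registration key ([^ ]+) is missing from the coordinator disk ([^ ]+)\. ?If .*/\1 \2/p'
}

# Example (the log path is an assumption; adjust it for your syslog setup):
#   missing_coord_keys < /var/log/messages | sort -u
```
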
Cause
On Linux, when access to storage is lost, udev removes the affected devices from the OS, and a subsequent device scan within VxVM removes the corresponding dmpnodes. When the storage is restored, the OS and VxVM rebuild the device tree and may assign different minor numbers to the devices. If the minor numbers of the fencing disks change, the vxfen configuration no longer matches the devices.

Solution
There are several ways to address this scenario.

Option 1: Use vxfenswap to correct the vxfen mismatch after a CVM node reconnects to storage. Refer to the VCS Admin guide for detailed instructions on "Replacing I/O Fencing coordinator disks when the cluster is online". The key tasks are summarized below for reference.


Make sure system-to-system communication is functioning properly (passwordless ssh).
Determine the value of the FaultTolerance attribute.

        # hares -display coordpoint -attribute FaultTolerance -localclus


Set the value of the FaultTolerance attribute to 0.
Check the existing value of the LevelTwoMonitorFreq attribute.

        # hares -display coordpoint -attribute LevelTwoMonitorFreq -localclus


Disable level two monitoring of the CoordPoint agent.

        # haconf -makerw

        # hares -modify coordpoint LevelTwoMonitorFreq 0

        # haconf -dump -makero

Make sure that the cluster is online.

        # vxfenadm -d


Validate that the fencing disks are accessible on all running nodes of the cluster.
On any running node, run the following command to start the vxfenswap utility, where fendgname is the name of the coordinator disk group.

        # vxfenswap -g fendgname


Confirm the updates.

        # vxfenconfig -l

Re-enable level two monitoring and restore the FaultTolerance attribute of the CoordPoint agent, using the values recorded before the attributes were changed.

       # hares -modify coordpoint LevelTwoMonitorFreq Frequencyvalue

       # hares -modify coordpoint FaultTolerance FaultTolerancevalue
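
The attribute handling around vxfenswap can be sketched as a small script. This is an illustration only, not a supported procedure: the function names and the DRY_RUN variable are hypothetical, the parsing assumes that hares -display prints a '#'-prefixed header followed by a line whose last column is the value, and wrapping the restore step in haconf -makerw / haconf -dump -makero is an assumption based on the disable step above. With DRY_RUN=1 the commands are printed instead of executed.

```shell
#!/bin/sh
# Sketch only. Function names and DRY_RUN are hypothetical.
RES=${RES:-coordpoint}      # CoordPoint resource name used in this article

# Execute a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Print the Value column from `hares -display ... -attribute X` output on stdin.
# Assumes a '#'-prefixed header line and the value in the last column.
attr_value() {
    awk '!/^#/ && NF { print $NF; exit }'
}

# Set FaultTolerance and LevelTwoMonitorFreq to 0 before running vxfenswap.
disable_monitoring() {
    run haconf -makerw
    run hares -modify "$RES" FaultTolerance 0
    run hares -modify "$RES" LevelTwoMonitorFreq 0
    run haconf -dump -makero
}

# $1 = saved LevelTwoMonitorFreq, $2 = saved FaultTolerance
restore_monitoring() {
    run haconf -makerw
    run hares -modify "$RES" LevelTwoMonitorFreq "$1"
    run hares -modify "$RES" FaultTolerance "$2"
    run haconf -dump -makero
}

# Typical order of use:
#   ft=$(hares -display "$RES" -attribute FaultTolerance -localclus | attr_value)
#   freq=$(hares -display "$RES" -attribute LevelTwoMonitorFreq -localclus | attr_value)
#   disable_monitoring
#   vxfenswap -g fendgname        # run manually, then check: vxfenconfig -l
#   restore_monitoring "$freq" "$ft"
```

Saving both values before changing anything means the restore step does not depend on remembering what was configured earlier.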

 

 

Option 2: Implement CP server based fencing. Refer to the VCS guides for details.



Option 3: Applicable only to Fibre Channel based storage access. Increase dev_loss_tmo to avoid minor number reassignment of dmpnodes during a temporary storage loss. The modified udev rule below, applied during system boot, raises the device loss timeout to 24 hours (86400 seconds) instead of the default dev_loss_tmo of 30 or 45 seconds, depending on the HBA setting.

Edit the file /etc/udev/rules.d/40-rport.rules and set dev_loss_tmo as shown. The match keys and the RUN assignment form a single rule and must be on one line.

$ cat /etc/udev/rules.d/40-rport.rules

KERNEL=="rport-*", SUBSYSTEM=="fc_remote_ports", ACTION=="add", RUN+="/bin/sh -c 'echo 86400 > /sys/class/fc_remote_ports/%k/dev_loss_tmo'"
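
To check whether the timeout is in effect, the dev_loss_tmo value of each remote port can be read back from the sysfs attribute that the rule writes to. The sketch below is a hypothetical helper; the optional directory argument exists only so that the function can be tried against a test directory.

```shell
#!/bin/sh
# Print "<rport> <dev_loss_tmo>" for every FC remote port.
# $1 (optional, hypothetical) overrides the sysfs directory.
list_dev_loss_tmo() {
    dir=${1:-/sys/class/fc_remote_ports}
    for f in "$dir"/rport-*/dev_loss_tmo; do
        [ -r "$f" ] || continue          # no remote ports, or unreadable
        rport=${f%/dev_loss_tmo}
        echo "${rport##*/} $(cat "$f")"
    done
}

# Example: list_dev_loss_tmo
```

Because the rule matches ACTION=="add", remote ports that already exist are expected to keep their previous value until they are added again, for example after a reboot.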

Issue/Introduction

On Linux, in Storage Foundation High Availability clusters (SFHA/SFCFS) that use disk-based I/O fencing (vxfen), restoring access to storage on an affected node without restarting vxfen might result in a vxfen configuration mismatch.

Note that vxfen itself logs no error. If a CoordPoint resource is configured in the cluster, that resource faults after the vxfen configuration mismatch. On a cluster without a CoordPoint resource to monitor the I/O fencing disks, the mismatch may go unnoticed; if a fencing race then occurs with the affected node as the racer, the affected node and the sub-cluster it represents would lose the race and panic.

Also note that in SFCFS clusters with I/O shipping enabled, the shared disk groups on the affected node stay online even during the storage outage, because I/Os are serviced through other nodes.

State of DMP and vxfen before access to the storage is lost, with I/O shipping enabled

# vxdisk -o alldgs list
DEVICE       TYPE            DISK         GROUP        STATUS
sda          auto:none       -            -            online invalid
sdb          auto:none       -            -            online invalid
sdc          auto:cdsdisk    -            (fendg)      online
sdd          auto:cdsdisk    -            (fendg)      online
sde          auto:cdsdisk    -            (fendg)      online
sdf          auto:cdsdisk    sdf          sharedg      online shared
sdg          auto:cdsdisk    sdg          sharedg      online shared
sdh          auto:cdsdisk    sdh          sharedg      online shared
sdi          auto:cdsdisk    -            -            online
sdj          auto:none       -            -            online invalid
sdk          auto:none       -            -            online invalid

# vxfenconfig -l

I/O Fencing Configuration Information:
======================================

Single Disk Flag : 0
Count : 3
Disk List
Disk Name                Major  Minor  Serial Number           Policy
/dev/vx/rdmp/sdd         201    128    HITACHI 750406380149    dmp
/dev/vx/rdmp/sde         201    96     HITACHI 750406380185    dmp
/dev/vx/rdmp/sdc         201    144    HITACHI 750406380148    dmp

State of DMP after access to the storage is lost, with I/O shipping enabled

# vxdisk -o alldgs list
DEVICE       TYPE            DISK         GROUP        STATUS
sda          auto:none       -            -            online invalid
sdb          auto:none       -            -            online invalid
sdf          auto:cdsdisk    sdf          sharedg      online shared lmissing
sdg          auto:cdsdisk    sdg          sharedg      online shared lmissing
sdh          auto:cdsdisk    sdh          sharedg      online shared lmissing

When the storage is restored, the OS may assign different minor numbers to the sd devices, and DMP may assign different minor numbers to the dmpnodes. vxfen, however, still references the minor numbers recorded at startup.

# vxdisk -o alldgs list
DEVICE       TYPE            DISK         GROUP        STATUS
sda          auto:none       -            -            online invalid
sdb          auto:none       -            -            online invalid
sdc          auto:cdsdisk    -            (fendg)      online
sdd          auto:cdsdisk    -            (fendg)      online
sde          auto:cdsdisk    -            (fendg)      online
sdf          auto:cdsdisk    sdf          sharedg      online shared
sdg          auto:cdsdisk    sdg          sharedg      online shared
sdh          auto:cdsdisk    sdh          sharedg      online shared
sdi          auto:cdsdisk    -            -            online
sdj          auto:none       -            -            online invalid
sdk          auto:none       -            -            online invalid

# vxfenconfig -l

I/O Fencing Configuration Information:
======================================

Single Disk Flag : 0
Count : 3
Disk List
Disk Name                Major  Minor  Serial Number           Policy
/dev/vx/rdmp/sdj         201    128    HITACHI 750406380149    dmp
/dev/vx/rdmp/sdf         201    144    HITACHI 750406380148    dmp
/dev/vx/rdmp/sdi         201    96     HITACHI 750406380185    dmp
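
Judging by the sample output above, a mismatch shows up as vxfenconfig -l listing device names that are not in the coordinator disk group (sdj, sdf and sdi instead of sdc, sdd and sde). On clusters without a CoordPoint resource, that comparison can be scripted. The following is a heuristic sketch, not an official check: the function names are hypothetical, and the parsing assumes the column layouts shown in this article.

```shell
#!/bin/sh
# Device names belonging to disk group $1, read from
# `vxdisk -o alldgs list` output on stdin. Deported groups appear in
# parentheses in the GROUP column, e.g. (fendg).
coord_dg_disks() {
    awk -v dg="$1" '$4 == dg || $4 == "(" dg ")" { print $1 }' | sort
}

# Base names of the coordinator disks, read from `vxfenconfig -l`
# output on stdin (lines whose first column starts with /dev/).
vxfen_disk_names() {
    awk '$1 ~ /^\/dev\// { n = split($1, p, "/"); print p[n] }' | sort
}

# $1 = coordinator disk group name
# $2 = file holding `vxdisk -o alldgs list` output
# $3 = file holding `vxfenconfig -l` output
check_vxfen_consistency() {
    want=$(coord_dg_disks "$1" < "$2")
    have=$(vxfen_disk_names < "$3")
    if [ "$want" = "$have" ]; then
        echo "consistent"
    else
        echo "mismatch: disk group has [$(echo $want)], vxfen lists [$(echo $have)]"
    fi
}

# Example:
#   vxdisk -o alldgs list > /tmp/vxdisk.out
#   vxfenconfig -l        > /tmp/vxfen.out
#   check_vxfen_consistency fendg /tmp/vxdisk.out /tmp/vxfen.out
```

Working on captured output rather than calling the commands directly keeps the check easy to run from cron or a monitoring wrapper and easy to try against saved output.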
