DMP does not re-enable paths after controller or SP reboots, and 'vxdctl enable' hangs, on VxVM 6.2.1/RHEL 6 with Storwize arrays

Article ID: 100044269

Description

Error Message

The messages file was flooded with blk_get_request failures.

For example:

kernel: VxVM vxdmp V-5-3-0 dmp_kernel_scsi_ioctl: blk_get_request failed dev 133/0x210 cmd 0x12 error code = 0

Similar messages were seen for SCSI commands 0x5e, 0x5f, 0xa3 and 0x12.
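To see which SCSI opcodes are affected and how often, the failures can be tallied from the log. The following is a minimal sketch, assuming syslog-style lines in the format shown above; the function name count_blk_get_request_failures is made up for illustration, and /var/log/messages is assumed to be the log location.

```shell
#!/bin/sh
# Tally dmp_kernel_scsi_ioctl blk_get_request failures per SCSI opcode.
# Reads log lines from the files given as arguments, or from stdin.
# (Function name is illustrative, not part of VxVM.)
count_blk_get_request_failures() {
    awk '/dmp_kernel_scsi_ioctl: blk_get_request failed/ {
             # The opcode is the field that follows the word "cmd".
             for (i = 1; i < NF; i++)
                 if ($i == "cmd") counts[$(i + 1)]++
         }
         END { for (c in counts) print c, counts[c] }' "$@" | sort
}

# Demonstration with the sample line from this article. On a live host
# the input would typically be /var/log/messages (assumed path).
count_blk_get_request_failures <<'EOF'
kernel: VxVM vxdmp V-5-3-0 dmp_kernel_scsi_ioctl: blk_get_request failed dev 133/0x210 cmd 0x12 error code = 0
EOF
```

Each output line is an opcode followed by the number of failures logged for it.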

Cause

This was due to an OS-udev-vxdmp interaction issue.

A DMP device is a virtual device created on top of the OS devices, and it relies on the OS udev rules to notify it of any changes. In this instance the OS-udev-vxdmp interaction was not behaving as expected: when the paths were re-activated, DMP was not notified and so did not re-enable them.

Typically, vxesd is notified when a device is removed, e.g.:

   vxesd[108002]: vxesd: Device sdbar(71/1328) is removed.

and DMP then reports that the path has been disabled.

When the device returns, vxesd is expected to be notified that it has been added, so that DMP can re-enable the path, e.g.:

   vxesd[108002]: vxesd: Device sdbar(71/1328) is added.

In this instance, however, vxesd was not notified of the added devices.
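One way to spot this condition is to look for devices whose most recent vxesd event in the log is a removal. The sketch below assumes the vxesd message format shown above; the function name devices_not_readded and the second device in the sample input (sdbas) are hypothetical and used only for illustration.

```shell
#!/bin/sh
# List devices whose last vxesd event was "is removed." with no later
# "is added." event. Reads log lines from files given as arguments, or stdin.
# (Function name is illustrative, not part of VxVM.)
devices_not_readded() {
    awk '/vxesd: Device .* is (removed|added)\./ {
             for (i = 1; i <= NF; i++)
                 if ($i == "Device") {
                     dev = $(i + 1)          # e.g. sdbar(71/1328)
                     state[dev] = $(i + 3)   # "removed." or "added."
                 }
         }
         END { for (d in state) if (state[d] == "removed.") print d }' "$@" | sort
}

# Demonstration: sdbar is removed and re-added; sdbas (hypothetical sample
# entry) is removed and never re-added, so only sdbas is reported.
devices_not_readded <<'EOF'
vxesd[108002]: vxesd: Device sdbar(71/1328) is removed.
vxesd[108002]: vxesd: Device sdbas(71/1344) is removed.
vxesd[108002]: vxesd: Device sdbar(71/1328) is added.
EOF
```

An empty result means every removal seen in the input was followed by a matching add event.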

Additionally, it was determined that after a controller/SP reboot the paths can take some time to respond once the array is back. During this delay, blk_get_request failures occurred and the paths were disabled, so the delay had to be accommodated on the OS side.

Resolution

With the following solution in place, vxesd recognized device removal and addition, and DMP then disabled and re-enabled the paths correctly for the Storwize storage:

1. Ensure vxesd is enabled, with the VxVM udev rules files in place.

   Check that the following VxVM udev files exist:

   /etc/udev/rules.d/40-VxVM.rules
   /etc/udev/rules.d/99-vxdmp-remove-blockdev.rules
   /lib/udev/vxvm-udev.sh
   /lib/udev/vxpath_links
 
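   If not, copy the missing files into place:

   cp /etc/vx/vxvm-udev.rules /etc/udev/rules.d/40-VxVM.rules
   cp /etc/vx/vxvm-udev.rules.systemd /etc/udev/rules.d/40-VxVM.rules
   cp /etc/vx/vxdmp-remove-blockdev.rules /etc/udev/rules.d/99-vxdmp-remove-blockdev.rules
   cp /etc/vx/vxvm-udev /lib/udev/vxvm-udev.sh
   cp /etc/vx/vxpath_links /lib/udev/vxpath_links

   Note that the first two commands write to the same target file, so only one of them should be run. The .systemd variant is presumably intended for systemd-based releases such as RHEL 7 rather than RHEL 6.

   Then reload the udev rules:

    /sbin/udevcontrol reload_rules

   If udevcontrol is not available on the system, 'udevadm control --reload-rules' is the udevadm equivalent.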
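The file check in step 1 can be scripted. This is a minimal sketch that reports which of the four files listed above are missing; the function name missing_vxvm_udev_files and its optional root-prefix argument are hypothetical conveniences, not part of VxVM.

```shell
#!/bin/sh
# Print the VxVM udev files from step 1 that do not exist.
# Optional argument: a directory prefix to check under instead of "/"
# (hypothetical convenience, e.g. for inspecting a mounted image).
missing_vxvm_udev_files() {
    root=${1:-}
    for f in /etc/udev/rules.d/40-VxVM.rules \
             /etc/udev/rules.d/99-vxdmp-remove-blockdev.rules \
             /lib/udev/vxvm-udev.sh \
             /lib/udev/vxpath_links; do
        [ -e "$root$f" ] || echo "$f"
    done
}

# Check the running system; no output means all four files are present.
missing_vxvm_udev_files
```

Any path printed corresponds to one of the cp commands above that still needs to be run.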

2. Tune the Red Hat scsi_mod.inq_timeout parameter.

   The default value of 20 seconds is normally sufficient, and Red Hat recommends changing it only if the storage needs more time to respond to the inquiry.

   The following Red Hat document provides more information on this tunable:

   https://access.redhat.com/solutions/3430351

   In this scenario the recommendation was to set the tunable to 70. This can be done in two ways:

   a. To make the setting permanent, add scsi_mod.inq_timeout=70 to the kernel line in the grub configuration and reboot.

   b. To set the tunable dynamically (this does not persist across a reboot), run:

       echo 70 > /sys/module/scsi_mod/parameters/inq_timeout

   This tunable also applies to RHEL 7.
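After applying either method, the running value can be confirmed from sysfs. The sketch below is one possible check; the function name check_inq_timeout and its optional second argument (an alternative path to read) are hypothetical and exist only to keep the example self-contained.

```shell
#!/bin/sh
# Compare the running scsi_mod.inq_timeout value against a wanted minimum.
# $1: wanted value in seconds
# $2: optional file to read instead of the sysfs parameter (hypothetical
#     convenience for testing; defaults to the path used in step 2b)
check_inq_timeout() {
    want=$1
    param=${2:-/sys/module/scsi_mod/parameters/inq_timeout}
    if [ ! -r "$param" ]; then
        echo "cannot read $param"
        return 2
    fi
    have=$(cat "$param")
    if [ "$have" -ge "$want" ]; then
        echo "inq_timeout=$have (ok)"
    else
        echo "inq_timeout=$have (below $want)"
        return 1
    fi
}

# Check the running kernel against the value recommended in this article.
check_inq_timeout 70 || true
```

To confirm that the permanent setting took effect after a reboot, the kernel command line in /proc/cmdline can also be inspected for scsi_mod.inq_timeout=70.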

3. Set the DMP recoveryoption iotimeout to 600 (seconds).

   This can be done with the vxdmpadm command, e.g.:

   vxdmpadm setattr enclosure <enclosure_name> recoveryoption=throttle iotimeout=600

   600 was the value that best fit the application timeout.
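As a sketch of step 3, the helper below prints the vxdmpadm commands for setting and then checking the recovery option on one enclosure, without running them. The function name dmp_throttle_cmds and the enclosure name enc0 are hypothetical; the real enclosure names can be listed with 'vxdmpadm listenclosure all'.

```shell
#!/bin/sh
# Print (do not run) the vxdmpadm commands that set and then display the
# throttle recovery option for one enclosure.
# $1: enclosure name, $2: iotimeout in seconds
# (Function name is illustrative, not part of VxVM.)
dmp_throttle_cmds() {
    enc=$1
    secs=$2
    if [ -z "$enc" ]; then
        echo "enclosure name required" >&2
        return 1
    fi
    case $secs in
        ''|*[!0-9]*)
            echo "iotimeout must be a whole number of seconds" >&2
            return 1 ;;
    esac
    echo "vxdmpadm setattr enclosure $enc recoveryoption=throttle iotimeout=$secs"
    echo "vxdmpadm getattr enclosure $enc recoveryoption"
}

# enc0 is a hypothetical enclosure name used only for illustration.
dmp_throttle_cmds enc0 600
```

Printing the commands first allows them to be reviewed before they are run as root on the affected host.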

These issues may also occur on VxVM 7.x, in which case the above solution also applies.

Additional Information

ETrack: 3952180