Regular VXIO messages stating error 80000011 followed by SCSI Bus resets

Article ID: 100005169

Cause

In SFW-HA 5.1 SP2, the VMDg agent was changed to reduce the number of calls made into the Volume Manager (VM) components. Following these changes, the first action of the monitor is to check that the DGGuid attribute is set correctly. If the Disk Group being queried is not readable on the local node (for example, because it is SCSI2 reserved on another node), then VM can issue one SCSI bus reset per disk in the Disk Group as it attempts to read the GUID.
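To give a feel for the scale of the problem, the following is a rough sketch of the worst-case reset rate, assuming one reset per disk on every monitor cycle. The function name and the interval value used in the example are illustrative assumptions, not values taken from the product; substitute the MonitorInterval actually configured for the VMDg resource type.

```python
def max_resets_per_hour(disks_in_group: int, monitor_interval_s: float) -> float:
    """Upper bound on SCSI bus resets per hour for one unreadable Disk Group.

    Assumes (worst case) that every monitor cycle triggers one reset
    per disk in the group. Real systems may see fewer resets.
    """
    if disks_in_group < 0 or monitor_interval_s <= 0:
        raise ValueError("disk count must be >= 0 and interval must be > 0")
    cycles_per_hour = 3600 / monitor_interval_s
    return disks_in_group * cycles_per_hour


# Example: 8 disks, with an assumed 60-second monitor interval.
print(max_resets_per_hour(8, 60))
```

Under these assumptions, even a modest Disk Group can generate several hundred resets per hour on each node that cannot read it.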

Resolution

There are two possible workarounds for this issue.

Workaround 1:

If the array hardware supports SCSI3 mode, set the VM components to use SCSI3 mode using the option in the Veritas Enterprise Administrator GUI, then restart the Veritas Storage Agent (VXVM). Repeat this process on every node in the cluster.

With the VM components in SCSI3 mode, the reservation is readable by the passive node so the DGGuid query succeeds.

Workaround 2:

Remove the value assigned to the DGGuid attribute on each affected VMDg resource. When the value is absent, the query is skipped.

Note: Knowing the DGGuid can be useful when recovering from storage failures. Therefore, permanently removing the GUID is not recommended.
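Because the GUID may be needed later, it is sensible to record it before clearing it. The sketch below only builds the command sequence for review; it does not execute anything. It assumes the standard cluster command-line tools (haconf and hares) are available on the node and that an empty string clears the attribute; both points should be verified against the product documentation. The resource name VMDg_Example is hypothetical.

```python
from typing import List


def dgguid_clear_plan(resources: List[str]) -> List[List[str]]:
    """Build (but do not run) commands to record, then clear, DGGuid.

    Assumption: 'hares -modify <res> DGGuid ""' clears the attribute.
    """
    plan: List[List[str]] = []
    # Record the current GUID of every resource first, for later recovery.
    for res in resources:
        plan.append(["hares", "-value", res, "DGGuid"])
    # Open the cluster configuration for writing.
    plan.append(["haconf", "-makerw"])
    # Clear the attribute on each resource.
    for res in resources:
        plan.append(["hares", "-modify", res, "DGGuid", ""])
    # Save the configuration and make it read-only again.
    plan.append(["haconf", "-dump", "-makero"])
    return plan


# "VMDg_Example" is a hypothetical resource name.
for cmd in dgguid_clear_plan(["VMDg_Example"]):
    print(" ".join(repr(a) if a == "" else a for a in cmd))
```

Keeping the recorded values somewhere safe makes it straightforward to restore the attribute once the fix has been applied.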

Solution:

A fix for this issue is available at: https://sort.Veritas.com/patch/detail/4763

 


Applies To

Storage Foundation HA for Windows 5.1 Service Pack 2 (SP2).

SCSI2 setting within the Volume Manager components.

DGGuid attribute set on one or more VMDg resources.

Issue/Introduction

When using Storage Foundation HA for Windows (SFW-HA), regular VXIO messages reporting error 80000011, followed by SCSI bus resets, may be observed on a passive node in the cluster, or on all nodes if the configuration is Active/Active. These messages may be accompanied by some or all of the following additional symptoms:
  • Poor IO performance
  • Veritas High Availability Engine Service (HAD) terminating and restarting
  • VMDg resources faulting, going into an Unknown state or failing to probe
  • MountV resources faulting, going into an Unknown state or failing to probe
  • MountV and VMDg agents being restarted by HAD.

Additional Information

ETrack: 2271384