SCSI-3 PR operation fails intermittently on some newly added LUNs from HDS array


Article ID: 100022730


Description

Error Message

 
After newly allocated LUNs from an HDS TagmaStore array are initialized, vxdg adddisk to a shared disk group fails for some of the LUNs with the following error:
 
# vxdg -sg sdg adddisk sdg14=tagmastore-usp0_2644
 
VxVM vxdg ERROR V-5-1-10128 SCSI-3 PR operation failed
 

 

Cause

According to HDS support (HDS Document ID HDS64593), the array logs SSB code D4DD, which can be seen in the USP-V dump. HDS USP-V and previous-generation arrays limit the number of RSV keys allowed per physical port: each array port supports up to 128 unique RSV keys. With I/O fencing enabled, VCS/CFS can register one key per disk group per path. When a host running SFCFS attempts to register a PGR key that would exceed the limit of 128 unique keys on the array port, the array returns a failed status for the registration and logs SSB D4DD in the array log.
 
(Number of disk groups) x (Number of nodes attached to the array port) = number of keys registered per port
 

How to list the PGR keys for all disks from one of the SFCFS nodes in the cluster
 
First, create a file that lists all the SAN devices visible through each controller. Assuming c5 and c6 are the two controllers, create the file as follows:
 
# ls -1 /dev/rdsk/c[56]*s2 > /tmp/c5_c6_disks
 
Use this file to list the keys as follows:
 
# /sbin/vxfenadm -g all -f /tmp/c5_c6_disks
 
Note that the vxfenadm command lists the keys registered by all nodes in the cluster. If each LUN has 4 paths per node and there are 2 nodes in the cluster, you will see 8 PGR keys for each LUN. The PGR keys are unique for each disk group and each node.
 
For a given disk group, node A registers the key APGR0001 on all LUNs in that disk group. Node B registers the key BPGR0001 on all LUNs in the same disk group.
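To compare the registered keys against the 128-key limit, the distinct key values in the vxfenadm listing can be counted. The following is a minimal sketch, not a product command. It assumes that vxfenadm prints each key on a line containing "Character Format]:" followed by the key value (for example "Key Value [Character Format]: APGR0001"); verify this against the output of your release. The helper name count_unique_keys is hypothetical. On Solaris, nawk may be needed in place of awk.

```shell
# Sketch only: assumes each key appears on a line such as
#   Key Value [Character Format]: APGR0001
# count_unique_keys is a hypothetical helper name, not part of any product.
count_unique_keys() {
    awk '
        /Character Format\]:/ {
            key = $0
            sub(/.*Character Format\]:[ \t]*/, "", key)  # strip the label
            sub(/[ \t\r]+$/, "", key)                     # strip trailing blanks
            if (key != "") seen[key] = 1                  # record each distinct key once
        }
        END {
            n = 0
            for (k in seen) n++
            print n
        }
    '
}

# Possible usage, reading the listing from standard input:
#   /sbin/vxfenadm -g all -f /tmp/c5_c6_disks | count_unique_keys
```

The result counts distinct keys across every device in the file. It approximates the per-port count only when all of those devices are presented through the same array port.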
 
Therefore, an 8-node CVM/RAC cluster with more than 15 shared disk groups, in which all nodes use the same array port for all LUNs, will encounter this issue, because the total number of unique keys registered would exceed 128 for that array port.
 
8 nodes x (15 shared disk groups + 1 vxfencoordinator disk group) = 128 unique PGR keys, which is exactly the per-port limit
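The arithmetic above can be sketched as two small shell functions. This is an illustration only: the function names keys_per_port and max_shared_dgs are hypothetical, and the calculation assumes that every node and the coordinator disk group use the same array port, as in the example above.

```shell
# Hypothetical helpers illustrating the key-count arithmetic; not product commands.

# keys_per_port NODES SHARED_DGS
# Unique keys on one array port: one key per node for each shared disk group,
# plus one key per node for the coordinator disk group.
keys_per_port() {
    nodes=$1
    shared_dgs=$2
    echo $(( nodes * (shared_dgs + 1) ))
}

# max_shared_dgs NODES LIMIT
# Largest number of shared disk groups that stays within LIMIT keys per port.
max_shared_dgs() {
    nodes=$1
    limit=$2
    echo $(( limit / nodes - 1 ))
}
```

For example, keys_per_port 8 15 gives 128, keys_per_port 8 16 gives 136 (over the limit), and max_shared_dgs 8 128 gives 15.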
 

 

Resolution

Contact storage support and verify whether the SSB D4DD message is logged for any array port that serves the LUN for which the host reported the "SCSI-3 PR operation failed" message.
 
If D4DD errors are logged, the workaround is to allocate new LUNs from a different array port that has not reached the limit.
 
Contact HDS technical support or Hitachi RSD support about the possibility of increasing the 128 PGR key limit on the array.
 
NOTE: Turning ON HDS Host Mode option (HMO 61) allows up to 2048 keys per array port. 
 

 

 

 

 
