DMP enhancement to handle failing I/O when it is not able to interpret sense data (without the fix, possible side effects include DGDISABLED disk groups and disk failures)

Article ID: 100023280

Resolution

DMP is very aggressive in failing I/O when it cannot interpret the sense data.

Without the enhancement described below, disk groups may go into the "DGDISABLED" state as a result of what appears to be a SAN outage.

Sample messages

/etc/vx/dmpevents.log

Tue Aug 25 19:26:07.449: SCSI error occured on Path c7t5006048AD52DC5C9d39s2: opcode=0x12 reported scsi transport error (status=0x4, key=0x0, asc=0x0, ascq=0x0)
Tue Aug 25 19:26:07.449: SCSI error occured on Path c7t5006048AD52DC5C9d38s2: opcode=0x12 reported scsi transport error (status=0x4, key=0x0, asc=0x0, ascq=0x0)
Tue Aug 25 19:26:07.449: SCSI error occured on Path c7t5006048AD52DC5C9d34s2: opcode=0x12 reported scsi transport error (status=0x4, key=0x0, asc=0x0, ascq=0x0)
Tue Aug 25 19:26:07.449: SCSI error occured on Path c7t5006048AD52DC5C9d14s2: opcode=0x12 reported scsi transport error (status=0x4, key=0x0, asc=0x0, ascq=0x0)
Tue Aug 25 19:26:07.449: SCSI error occured on Path c7t5006048AD52DC5C9d116s2: opcode=0x12 reported scsi transport error (status=0x4, key=0x0, asc=0x0, ascq=0x0)
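
To check whether a host is logging this signature, the log can be scanned for SCSI errors that carry status 0x4 together with all-zero sense data (key, asc and ascq). The following Python sketch is illustrative only: the regular expression is modelled on the sample lines above, and the function name find_uninterpreted_errors is hypothetical, not part of any Veritas tool.

```python
import re

# Pattern modelled on the sample lines above (the log itself spells "occured").
SCSI_ERR = re.compile(
    r"SCSI error occured on Path (?P<path>\S+): opcode=(?P<opcode>0x[0-9a-fA-F]+) "
    r".*?\(status=(?P<status>0x[0-9a-fA-F]+), key=(?P<key>0x[0-9a-fA-F]+), "
    r"asc=(?P<asc>0x[0-9a-fA-F]+), ascq=(?P<ascq>0x[0-9a-fA-F]+)\)"
)


def find_uninterpreted_errors(lines):
    """Return the paths that reported status 0x4 with all-zero sense data."""
    hits = []
    for line in lines:
        match = SCSI_ERR.search(line)
        if match is None:
            continue
        # Convert the hexadecimal fields to integers before comparing.
        status, key, asc, ascq = (
            int(match.group(name), 16) for name in ("status", "key", "asc", "ascq")
        )
        if status == 0x4 and key == 0 and asc == 0 and ascq == 0:
            hits.append(match.group("path"))
    return hits
```

The function accepts any iterable of lines, so an open file handle on /etc/vx/dmpevents.log can be passed to it directly.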


The issue is that the SCSI inquiry returned the CDB status STATUS_MET (0x04), which is not handled without the enhancement. As a result, the error analysis treats the error as a "generic device error".
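
The fall-through can be pictured with a small Python sketch. This is not the DMP source code: the status values are the standard SCSI status codes, but the classify function, the handler tables and the handler labels are hypothetical and exist only to show how a status with no explicit handler ends up in the generic bucket.

```python
# Standard SCSI status byte values.
STATUS_GOOD = 0x00
STATUS_CHECK = 0x02   # CHECK CONDITION: sense data is available
STATUS_MET = 0x04     # CONDITION MET
STATUS_BUSY = 0x08

GENERIC = "generic device error"


def classify(status, handlers):
    """Return the label for a status byte, or the generic fallback."""
    return handlers.get(status, GENERIC)


# Hypothetical handler table with no entry for STATUS_MET.
without_enhancement = {
    STATUS_GOOD: "no error",
    STATUS_CHECK: "analyse sense data",
    STATUS_BUSY: "retry",
}

# Hypothetical table where STATUS_MET is recognised explicitly.
with_enhancement = {**without_enhancement, STATUS_MET: "status recognised"}

print(classify(STATUS_MET, without_enhancement))
print(classify(STATUS_MET, with_enhancement))
```

In the first table the lookup for 0x04 finds nothing and falls back to the generic label, which corresponds to the behaviour described above.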


DMP_NODE_ERR_EVENT

Media error occured on  Dmpnode

This event is raised by DMP when it detects a media error, i.e. the connectivity to the LUN is fine but the I/O operation could not be completed.
The administrator needs to check the status of the LUN on the array.

Other messages

Tue Aug 25 21:43:20.793: I/O error occured on Path c3t5006048AD52DC5C6d114s2 belonging to Dmpnode c3t5006048AD52DC5C6d114s2
Tue Aug 25 21:43:20.799: Disabled Path c3t5006048AD52DC5C6d114s2 belonging to Dmpnode c3t5006048AD52DC5C6d114s2
Tue Aug 25 21:43:20.803: Disabled Dmpnode c3t5006048AD52DC5C6d114s2
Tue Aug 25 21:43:20.805: I/O analysis done on Path c3t5006048AD52DC5C6d114s2 belonging to Dmpnode c3t5006048AD52DC5C6d114s2
Tue Aug 25 21:43:20.816: I/O error occured on Path c3t5006048AD52DC5C6d111s2 belonging to Dmpnode c3t5006048AD52DC5C6d111s2
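
When many such messages are logged, it can help to summarise which paths and DMP nodes were disabled. The Python sketch below is a hypothetical helper (summarize_disabled is not a Veritas utility) whose patterns are modelled on the message formats shown above.

```python
import re

# Patterns modelled on the "Disabled Path" and "Disabled Dmpnode" lines above.
DISABLED_PATH = re.compile(r"Disabled Path (\S+) belonging to Dmpnode (\S+)")
DISABLED_NODE = re.compile(r"Disabled Dmpnode (\S+)")


def summarize_disabled(lines):
    """Return (paths_by_dmpnode, disabled_dmpnodes) found in the log lines."""
    paths_by_node = {}
    disabled_nodes = set()
    for line in lines:
        match = DISABLED_PATH.search(line)
        if match:
            path, node = match.group(1), match.group(2)
            paths_by_node.setdefault(node, []).append(path)
            continue
        match = DISABLED_NODE.search(line)
        if match:
            disabled_nodes.add(match.group(1))
    return paths_by_node, disabled_nodes
```

A DMP node that appears in the second result has lost all of its paths, which is the condition that can lead to the disk group being disabled.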


Note: The fix could not be backported to 5.0 MP1 because the code change was too complex and carried a high risk of regressions.


Patch Information:

PRODUCT VERSION                 PLATFORM       TYPE   PATCH NAME
--------------------------------------------------------------------------------
Veritas Volume Manager 5.0MP3   Solaris x86    RP     vm-sol_x64-5.0MP3RP2
Veritas Volume Manager 5.0MP3   Solaris        RP     vm-sol_sparc-5.0MP3RP2
Veritas Volume Manager 4.1MP2   Solaris        RP     vm-sol_sparc-4.1MP2RP4
Veritas Volume Manager 4.1MP2   Solaris        HF     vm-sol_sparc-4.1MP2RP3HF17
Veritas Volume Manager 4.1MP2   Solaris        HF     vm-sol_sparc-4.1MP2RP3HF16


LINUX:

On Linux, the DMP enhancement is delivered in a hot-fix for all 5.0 MP3 RP branches.
5.0 RU3 also includes the update, although 5.0 MP4 is recommended for Linux.


AIX:
The DMP enhancement is included in 5.0 MP3 RP1 onwards; however, 5.0 MP3 RP4 or later is recommended.
 


Issue/Introduction

DMP enhancement to handle failing I/O when it is not able to interpret sense data (without the patch, possible side effects include DGDISABLED disk groups, disk failures, etc.)