Storage migration or mirroring done at the array level must preserve the same LUN geometry, or it can break VxVM cdsdisk technology and lead to silent data corruption.

Article ID: 100006697

Cause

The corruption happens only when VxVM needs to check that the labels are written at the correct offset of the device, which it does by issuing a SCSI inquiry and determining the LUN geometry values to use. This check is performed by commands such as vxdisksetup, vxdg flush, and vxdisk flush. If you are using VCS, commands such as hagrp -offline, and the clean operation, also execute a vxdg flush.

This results in Symantec backup labels being written to the public region of a subdisk, causing silent data corruption.

The following is a description of the problem according to the Etrack incident listed in the Additional Information section.

SYMPTOM:
When VxVM is upgraded from 4.1MP4RP2 to 5.1SP1RP1, the data on CDS disks gets corrupted.

DESCRIPTION:
When CDS disks are initialized with VxVM version 4.1MP4RP2, the number of cylinders is calculated based on the raw disk geometry. If the calculated number of cylinders exceeds the Solaris VTOC limit (65535), an unsigned integer overflow causes a truncated cylinder count to be written in the CDS label. After VxVM is upgraded to 5.1SP1RP1, the CDS label gets wrongly written in the public region, leading to data corruption.

RESOLUTION:
The code has been changed to adjust the number of tracks and heads so that the calculated number of cylinders stays within the Solaris VTOC limit.
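
The truncation described in the Etrack text can be illustrated with a minimal Python sketch. It assumes the cylinder count is stored in a 16-bit unsigned field (suggested by the 65535 limit, but not stated in the Etrack), and the LUN size used here is hypothetical. The function names are illustrative only and are not part of VxVM.

```python
# Solaris VTOC cylinder limit quoted in the Etrack description.
VTOC_MAX_CYLINDERS = 65535


def cylinder_count(total_sectors, heads, sectors_per_track):
    """Number of whole cylinders for a given raw geometry."""
    return total_sectors // (heads * sectors_per_track)


def truncate_to_u16(value):
    """Assumption: the label field is 16 bits wide, so higher bits are lost."""
    return value & 0xFFFF


# Hypothetical LUN: 134400000 sectors with 15 heads and 128 sectors per track.
calculated = cylinder_count(134400000, 15, 128)   # 70000, above the VTOC limit
stored = truncate_to_u16(calculated)              # 4464 after truncation

print(calculated > VTOC_MAX_CYLINDERS, calculated, stored)
```

Any offset later derived from the truncated value would describe a much smaller disk than the real one, which is how a label can end up inside the public region.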

Resolution

1)      Veritas recommends that customers do not run the following commands or agents, because corruption may occur if all the conditions listed in the Applies To section of this article are met.

1.      vxdg flush
2.      vxdisk flush
3.      VCS DiskGroup Offline Agent
4.      VCS DiskGroup Clean Agent
 
2)      Upgrade to 5.1SP1RP2, as the VxVM code has been enhanced to pre-detect this problem and work around it.
 
3)      Contact Veritas Support
 

Applies To

Data corruption can happen when all of the following conditions on the LUN are met.

The VxVM DM (disk media) size is greater than the raw disk capacity given by the current geometry, or greater than the calculated backup label location.
 
LUN storage migration example:
 
The initial vxdisksetup of the LUN/disk used these SCSI inquiry values.
 
EMC-SYMMETRIX-5771 cyl 9293 alt 2 hd 15 sec 128
EMC SYM disk capacity size = 17846400
Calculated VxVM CDSdisk format, DM (Disk Media) size = 17842560
 
After storage migration, these new SCSI inquiry values are used.
 
HITACHI-OPEN-V-7002 cyl 2321 alt 2 hd 15 sec 512
HITACHI LUN disk capacity size = 17840640
 
The computed location of the backup label is less than the VxVM disk media size:
9293*128*15 = 17842560 (VxVM DM size, from the previous storage SCSI inquiry values)
2323*512*15 = 17840640 (raw disk capacity, from the current SCSI inquiry values; 2323 = 2321 cylinders + 2 alternate cylinders)
(2323*512*15)-511 = 17840129 (raw geometry backup label location, from the current SCSI inquiry values)
 
Because the VxVM DM size of 17842560 is greater than the backup label location of 17840129, silent corruption can happen.
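
The arithmetic above can be reproduced with a short Python sketch. It only restates the calculation shown in this example: the 511 offset is taken from the figures above and is not a documented VxVM formula, and the function names are hypothetical, not VxVM interfaces.

```python
def dm_size(cyl, heads, sectors):
    """VxVM DM size as computed in the example: data cylinders only."""
    return cyl * heads * sectors


def raw_capacity(cyl, alt, heads, sectors):
    """Raw disk capacity: data plus alternate cylinders."""
    return (cyl + alt) * heads * sectors


def backup_label_location(cyl, alt, heads, sectors):
    """Backup label location as in the example: raw capacity minus 511."""
    return raw_capacity(cyl, alt, heads, sectors) - 511


def at_risk(old_geometry, new_geometry):
    """True when the old DM size extends past the new backup label location."""
    old_cyl, _old_alt, old_heads, old_sectors = old_geometry
    return dm_size(old_cyl, old_heads, old_sectors) > backup_label_location(*new_geometry)


# Geometry tuples are (cyl, alt, hd, sec) from the SCSI inquiry values above.
emc = (9293, 2, 15, 128)      # EMC-SYMMETRIX-5771, before migration
hitachi = (2321, 2, 15, 512)  # HITACHI-OPEN-V-7002, after migration

print(dm_size(9293, 15, 128))            # 17842560
print(backup_label_location(*hitachi))   # 17840129
print(at_risk(emc, hitachi))             # True
```

A check of this kind could be run against the geometry reported before and after a planned migration, assuming both sets of SCSI inquiry values are available.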

Issue/Introduction

Storage-level migration or mirroring performs a block-by-block copy of the LUNs/disks. This type of copy does not take disk/LUN geometry into account. The Veritas cdsdisk format (CDS = Cross-platform Data Sharing) uses disk geometry so that LUNs can be moved from one OS to another, for example between AIX, HP-UX, Solaris, and Linux. To make this work, Veritas reserves blocks close to the end of the LUNs/disks and writes multiple backup labels, pertaining to each OS (operating system), in those blocks. This is done using the LUN's disk geometry, gathered through a SCSI inquiry.
There is a potential risk of silently corrupting a few data blocks toward the end of VxVM CDS disks if the SCSI inquiry values returned differ from the original SCSI inquiry values used when the LUN was first initialized in cdsdisk format.
 

Additional Information

ETrack: 2513101