Thin reclamation functionality can cause disk label loss, or private/public region corruption, when using Storage Foundation 6.0.3 on the Solaris SPARC platform

Article ID: 100010150

Updated On:

Cause

Storage Foundation supports several disk formats, such as CDS, sliced, and simple. Depending on the disk format, the private region and the public region can reside on the same disk partition or on different partitions. These partitions may start at the beginning of the disk (Solaris S2 partition) or at some offset from the beginning (Solaris S3 and S4 partitions).

When VxVM commands reclaim regions in the second case described under Issue/Introduction (sub-disk gaps and deleted objects), the offsets passed are relative to the partition on which the public region lies. The thin reclaim ioctl, however, treats these offsets as relative to the whole disk.

E.g.

For a sliced disk on Solaris, suppose the private region lies in partition 3 and the public region lies in partition 4, and that partitions 3 and 4 start at offsets x and y with respect to the disk. If the first volume is created on this disk with the init=zero option, the subdisk starts at the beginning of partition 4. For thin reclaimable disks, VxVM reclaims this region. The subdisk's offset with respect to the partition is 0, so offset 0 was passed. The offset that should have been passed is y, the start of partition 4 with respect to the disk.

As a result, the region starting at disk offset 0 is reclaimed and the disk label is lost.
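The offset translation described in this example can be sketched as follows. This is only an illustration of the arithmetic, not VxVM code: the `Partition` class, the `disk_relative_range` function, and the sector values are hypothetical and chosen for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Partition:
    """Hypothetical description of a disk slice (all values in sectors)."""
    slice_no: int
    start: int    # offset of the slice from the beginning of the disk
    length: int


def disk_relative_range(public: Partition, subdisk_offset: int,
                        length: int) -> Tuple[int, int]:
    """Translate a partition-relative range into a disk-relative range."""
    if subdisk_offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    if subdisk_offset + length > public.length:
        raise ValueError("range extends past the end of the public region")
    # The reclaim ioctl expects disk-relative offsets, so the start of the
    # public-region partition has to be added to the subdisk offset.
    return (public.start + subdisk_offset, length)


# Illustrative layout: partition 4 (public region) starts at y = 65536.
public = Partition(slice_no=4, start=65536, length=1_000_000)

# First subdisk starts at partition offset 0.
buggy_range = (0, 2048)                                  # partition offset passed as-is
correct_range = disk_relative_range(public, 0, 2048)     # offset y is applied
```

With these assumed values, the uncorrected range begins at disk sector 0, where the disk label lives, while the translated range begins at the start of partition 4.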

 

Similarly, because of the wrong offset calculations, genuine data blocks and private region contents may also be reclaimed, causing data loss.

Resolution

Frequency of occurrence and affected disks:

Any disk format in which the public region sits on a different partition from the private region, or in which the public region partition does not start at the beginning of the disk, is affected by this bug.

If the disk has empty regions, for example because sub disks are not contiguous or because space is left at the end of the disk, VxVM reclaim commands will trigger this issue.

If the disk is completely filled, it is not affected by this bug, because VxVM will not find any space to reclaim.
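The conditions above can be expressed as a small check. This is a minimal sketch under stated assumptions, not a Veritas tool: the function names `free_extents` and `is_exposed` are hypothetical, and subdisks are modelled as `(offset, length)` pairs relative to the public region partition.

```python
from typing import List, Tuple

Extent = Tuple[int, int]  # (offset, length), relative to the public partition


def free_extents(public_len: int, subdisks: List[Extent]) -> List[Extent]:
    """Return the gaps between subdisks and any space left at the end."""
    gaps: List[Extent] = []
    cursor = 0
    for offset, length in sorted(subdisks):
        if offset > cursor:
            gaps.append((cursor, offset - cursor))
        cursor = max(cursor, offset + length)
    if cursor < public_len:
        gaps.append((cursor, public_len - cursor))
    return gaps


def is_exposed(public_start: int, public_len: int,
               subdisks: List[Extent]) -> bool:
    """True if a reclaim could hit the bug on this layout.

    Two conditions from the text must both hold: the public region
    partition does not start at the beginning of the disk, and there is
    free space that a reclaim command would act on.
    """
    return public_start != 0 and bool(free_extents(public_len, subdisks))


# Illustrative layout: non-contiguous subdisks with space left at the end.
gaps = free_extents(1000, [(0, 100), (200, 300)])
```

A completely filled public region yields no free extents, so `is_exposed` returns `False` for it regardless of where the partition starts.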

 

Patch:

The problem is fixed in Veritas Storage Foundation High Availability (SFHA) Solution version 6.0.5 on all platforms. Please upgrade to SFHA 6.0.5 to fix the problem.

A VxVM Hot-fix, 6.0.3.011 (vm-sol10_sparc-6.0.3.011), has also been released for Solaris SPARC. VRTSaslapm 6.0.100.201 is required when installing VxVM Hot-fix 6.0.3.011. Please contact Veritas Support if you require this Hot-fix.

 

Probable Workaround:

Thin reclamation done through VxFS does not cause any issues. As a potential workaround, use the fsadm command to reclaim freed data blocks until the recommended VxVM Hot-fix can be installed.

In addition, Veritas recommends disabling the automatic thin reclamation feature.

 

- fsadm commands perform the required reclaim activity for user data on the volume.

- VxFS identifies unused extents and reclaims those regions.

- fsadm can reclaim data independently of the vxdisk reclaim commands.

 

We have not seen any issues with fsadm, as it follows a different code path (the standard I/O code path) that handles the offset translation itself.

Therefore it should be safe to use the fsadm commands in place of "vxdisk reclaim".

 

 

 


Issue/Introduction

 
Storage Foundation thin reclamation functionality can cause disk label loss or private/public region corruption in some cases. Thin reclaimable regions can be divided into 2 major groups:
  1. Unused regions within actual data volumes
VxVM works with VxFS to reclaim the unused data blocks. There is no issue with this functionality.
  2. Reclaiming sub-disk gaps and deleted objects
VxVM provides a way to reclaim unused space in the following scenarios:
  - Gaps between sub disks
  - Space used by deleted objects (volumes/plexes/sub disks)
  - Volume creation with init=zero
For the second case, VxVM directly calls the array-specific reclaim ioctl. While doing so, disk offsets are not handled properly, so genuine data blocks, disk labels, etc. are wrongly reclaimed. Veritas advises all customers to refrain from reclaiming space on sliced disks due to a potential interoperability issue with sliced disks when calculating the reclaim offset.

Additional Information

ETrack: 3249264