Using a very large cacheobject can result in silent corruption of snapshots or an I/O hang on the source volume

Article ID: 100027698

Cause

The issue is caused by an integer overflow in an arithmetic calculation inside the cacheobject code. The overflow occurs once the cacheobject reaches a certain size.

The cacheobject size required to hit the issue depends on the region size used within the cacheobject. The following table shows the affected cacheobject size for each region size:

Region size   Cacheobject size
32k           2TB
64k           5TB
128k          10TB
256k          20TB
512k          41TB
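
The table can be turned into a quick check. The following is a minimal sketch, not a Veritas tool: the function names affected_tb and check_cache are hypothetical, sizes are taken in whole TB, and a cacheobject is treated as affected once it reaches the listed size, which is an assumption based on the workaround of staying smaller than the listed size.

```shell
# Return the affected cacheobject size (in TB) for a region size,
# using the values from the table above.
affected_tb() {
    case "$1" in
        32k)  echo 2 ;;
        64k)  echo 5 ;;
        128k) echo 10 ;;
        256k) echo 20 ;;
        512k) echo 41 ;;
        *)    echo "unknown region size: $1" >&2; return 1 ;;
    esac
}

# Report whether a cacheobject of a given size reaches the affected size.
# Assumption: "affected" means size >= the value listed in the table.
check_cache() {
    # $1: region size (for example 256k), $2: cacheobject size in whole TB
    limit=$(affected_tb "$1") || return 2
    if [ "$2" -ge "$limit" ]; then
        echo "affected"
    else
        echo "ok"
    fi
}

check_cache 256k 25
check_cache 256k 10
```

With a 256k region size, the first call reports a 25TB cacheobject as affected and the second reports a 10TB cacheobject as ok.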

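To show the current region size in sectors, replace diskgroup and cacheobject with the names of your disk group and cacheobject:

vxprint -mg diskgroup cacheobject | grep regionsz

Sector size varies between OS platforms. For example, on Solaris (512-byte sectors) the following shows a region size of 512 sectors, or 256KB: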

# vxprint -mg testdg cacheobject | grep regionsz

        regionsz=512
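
The conversion from sectors to kilobytes can be sketched as follows. The function name regionsz_to_kb and the SECTOR_BYTES variable are hypothetical. The default of 512 bytes per sector matches the Solaris example above and is an assumption that must be changed on platforms with a different sector size.

```shell
# Convert a regionsz value (in sectors) to kilobytes.
# SECTOR_BYTES is an assumption: 512 bytes, as in the Solaris example.
# Set it to your platform's sector size before relying on the result.
SECTOR_BYTES=${SECTOR_BYTES:-512}

regionsz_to_kb() {
    # $1: regionsz in sectors, as reported by vxprint -m
    echo $(( $1 * SECTOR_BYTES / 1024 ))
}

regionsz_to_kb 512
```

For regionsz=512 this prints 256, which corresponds to the 256k row of the table.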

 

 

Resolution

As this issue may have caused silent corruption within snapshot volumes using the affected cacheobjects, it is recommended to delete any snapshot volumes that may be affected and recreate them after correcting the issue.

The issue is targeted to be fixed in the following versions:

5.1SP1RP3P2

6.0.3

Workarounds:

1. Delete the cacheobject and recreate it smaller than the size listed in the table above.

2. Because of the way cacheobjects work, the issue can also be avoided by creating a cacheobject smaller than the affected size and then growing it in increments that are each smaller than the affected size.

The following example shows how to delete the snapshots and the cacheobject and then recreate them. The example uses a region size of 256KB, for which the affected size is 20TB. To avoid the issue, the cacheobject is created with a size of 10TB and its size is then increased in 10TB increments.

 

Delete snapshots and cacheobject:

 

1. Stop using the snapshots (i.e., unmount any file systems that use them).

2. Remove the snapshot volumes:

vol=`vxprint -g testdg -vne "v_plex.pl_subdisk.sd_dm_name ~ /cacheobj/"`

for i in $vol
do
  vxedit -g testdg -rf rm $i
done

3. Remove the cacheobject:

vxcache -g testdg stop cacheobj

vxedit -g testdg -rf rm cacheobj

 

Recreate cacheobject:

 

1. Create cacheobject:

vxassist -g testdg make cache_vol 10000g 

vxmake -g testdg cache cacheobj cachevolname=cache_vol regionsize=256k

vxcache -g testdg start cacheobj

 

2. Grow cacheobject:

vxcache -g testdg growcacheby cacheobj 10000g

vxcache -g testdg growcacheby cacheobj 10000g

 

3. Recreate snapshot:

vxsnap -g testdg make source=source_vol/newvol=snapshot_vol/cache=cacheobj
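
For larger targets, the repeated growcacheby steps in the procedure above can be generated rather than typed by hand. This is a minimal sketch under stated assumptions: the function name grow_plan and its arguments are hypothetical, all sizes are whole gigabytes, and the script only prints the vxcache commands so they can be reviewed before being run. The step must be chosen smaller than the affected size for the region size in use.

```shell
# Print (do not run) the vxcache commands needed to grow a cacheobject
# from an initial size to a target size in fixed steps.
# All sizes are in GB. The function name and arguments are hypothetical.
grow_plan() {
    # $1: disk group, $2: cacheobject, $3: initial GB, $4: target GB, $5: step GB
    if [ "$5" -le 0 ]; then
        echo "step must be greater than zero" >&2
        return 1
    fi
    current=$3
    while [ "$current" -lt "$4" ]; do
        step=$5
        remaining=$(( $4 - current ))
        # Use a smaller final step so the target is not overshot.
        if [ "$remaining" -lt "$step" ]; then
            step=$remaining
        fi
        echo "vxcache -g $1 growcacheby $2 ${step}g"
        current=$(( current + step ))
    done
}

grow_plan testdg cacheobj 10000 30000 10000
```

For the values shown, the sketch prints the same two growcacheby commands used in step 2 of the example.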

 

 

Applies To

This issue affects VxVM versions 5.1SP1 and 6.0.1 on all supported platforms.

Issue/Introduction

An issue has been discovered in Veritas Volume Manager (VxVM) where the use of a very large cacheobject can result in either silent corruption of the snapshot volumes that use the cacheobject or I/O hangs on the source volume.