Veritas Volume Manager FastResync commands hang because of deadlock in the paging module

Article ID: 100017091

Description

Conditions under which the issue occurs

The deadlock occurs when Volume Manager handles FastResync instant snapshot operations, such as "vxsnap reattach", but the memory allocated to the VxVM paging module is insufficient.

In some older versions of Veritas Volume Manager, the default value of volpagemod_max_memsz is too small. Veritas recommends increasing volpagemod_max_memsz to 64 MB when using older VxVM versions.

Notes on Solaris Platform

Note: The unit for "volpagemod_max_memsz" in /kernel/drv/vxio.conf is KB. In the example entry volpagemod_max_memsz=65536 given in the Resolution section, 65536 means 65536 KB (not to be confused with the vxtune command syntax, where the default unit is bytes).

Note: For other Veritas Volume Replicator (VVR) parameters specified in /kernel/drv/vxio.conf, e.g. vol_rvio_maxpool_sz, vol_min_lowmem_sz, vol_max_nmpool_sz, vol_max_rdback_sz, the units are in bytes (not KB).

Detailed information on "volpagemod_max_memsz" can be obtained from the vxtune manual page or the "Tuning Parameters" section of the Veritas Volume Manager Administrator's Guide.

This issue is tracked through the incidents listed in the Additional Information section below.

  • In VxVM 4.1 MP1 and 5.0 for Solaris, the default value of "volpagemod_max_memsz" has been raised to 64 MB.
  • In VxVM 4.1 MP2 and 5.0 for Linux, the default value of "volpagemod_max_memsz" is still 1.5 MB, and the parameter must be increased manually to work around the problem.

Currently, there is a known issue with setting the parameter through /kernel/drv/vxio.conf. It is being tracked through Etrack 2024617. As a workaround, set the parameter in /etc/system as well.
 
/etc/system:
set vxio:volpagemod_max_memsz=
 
 
Notes on AIX Platform
 
On AIX systems, volpagemod_max_memsz can be changed with the smitty OS command from the shell prompt. For example:
 
smitty --> System Storage Management (Physical & Logical Storage) --> VERITAS Volume Manager --> Change / Show VxVM Tunables --> FMR/cache object metadata cache size (kilobytes)
Set it to 65536 KB.

 

 

Applies To

Veritas Volume Manager with FastResync snapshots

Cause

A potential deadlock in the Veritas Volume Manager (VxVM) paging module can cause the Volume Manager configuration daemon, vxconfigd, to hang. When vxconfigd hangs, other Volume Manager commands hang as well, because vxconfigd cannot respond to the requests they send.
 

Resolution


Increase the Volume Manager tunable parameter "volpagemod_max_memsz" to allocate enough memory for the paging module before performing FastResync instant snapshot operations.   The following is the recommended setting.

                Use this formula to calculate the minimum value  of  volpagemod_max_memsz:

                       volpagemod_max_memsz >= max(vol_size/region_size) * 3 / 8 * 2

                where  the ratio of volume size to region size is the largest
                value for all the volumes on the system.  You do not need  to
                consider the total size of all the volumes.
 
VxVM uses 3 bits for each DCO region, and there are 8 bits in each byte, which gives the factor of 3/8 and a result in bytes. Some FastResync commands may need double the amount of paging module memory, which is why the formula ends with a multiplication by 2. (This doubling is not mentioned in the vxtune(1M) manual page.)
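The formula above can be worked through for a single volume as a minimal sketch. The helper name min_volpagemod_bytes is hypothetical (it is not a VxVM command), and the 1 TiB volume size and 64 KiB region size are illustrative values only. Both arguments are in bytes, and the rounding up is an assumption made here so that the estimate never falls below the formula's value.

```shell
#!/bin/sh
# Sketch only: evaluates the article's formula for one volume.
# min_volpagemod_bytes is a hypothetical helper, not part of VxVM.
# Arguments: volume size in bytes, region size in bytes.
min_volpagemod_bytes() {
    vol_bytes=$1
    region_bytes=$2
    # Number of DCO regions, rounded up.
    regions=$(( (vol_bytes + region_bytes - 1) / region_bytes ))
    # 3 bits per region, 8 bits per byte (rounded up to whole bytes).
    map_bytes=$(( (regions * 3 + 7) / 8 ))
    # Doubled, because some FastResync commands need twice the memory.
    echo $(( map_bytes * 2 ))
}

# Illustrative values: a 1 TiB volume with a 64 KiB region size.
min_volpagemod_bytes 1099511627776 65536
```

With these inputs the function prints 12582912 (12 MB). On a real system, the volume with the largest ratio of volume size to region size is the one to evaluate.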

For example, to set volpagemod_max_memsz to 64 MB:

1. Change "volpagemod_max_memsz" online temporarily with the following command:
 
# vxtune volpagemod_max_memsz 64M 
 
2. To make the change persist across system reboots, add the following entry to the Volume Manager kernel configuration file /kernel/drv/vxio.conf:
 
volpagemod_max_memsz=65536;

Note: With a smaller FastResync region size or a larger volume, "volpagemod_max_memsz" may need to be increased beyond the default of 64 MB (65536 KB).

Check the FastResync region size of Data Change Object (DCO) or Cache Object with the following command:
 
# vxprint -g -F%regionsz

The value displayed is in 512-byte sectors. Refer to the vxsnap manual page for details.
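Because the region size is reported in 512-byte sectors, it has to be converted to bytes before it is compared with a volume size in bytes. The following is a small sketch of that conversion. The helper name regionsz_sectors_to_bytes is hypothetical, and the value 128 is an illustrative input, not a documented default.

```shell
#!/bin/sh
# Sketch only: converts a region size given in 512-byte sectors to bytes.
# regionsz_sectors_to_bytes is a hypothetical helper, not part of VxVM.
regionsz_sectors_to_bytes() {
    echo $(( $1 * 512 ))
}

# Illustrative value: a reported region size of 128 sectors.
regionsz_sectors_to_bytes 128
```

For an input of 128 sectors, this prints 65536, that is, a 64 KB region.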

Note: The default unit for a value given on the vxtune command line is bytes. Therefore, append the unit multiplier "m" or "k" to specify a value in megabytes or kilobytes, for example 64 MB or 65536 KB.

Refer to the vxtune manual page for details.
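Since vxtune takes bytes by default while /kernel/drv/vxio.conf on Solaris takes KB, it is easy to mix the two up. The sketch below starts from a size in bytes and prints both forms for comparison. The helper name print_volpagemod_settings is hypothetical, and the script only prints text; it does not run vxtune or modify any file.

```shell
#!/bin/sh
# Sketch only: prints the two settings for a size given in bytes.
# print_volpagemod_settings is a hypothetical helper, not part of VxVM.
print_volpagemod_settings() {
    bytes=$1
    # Round up to whole kilobytes.
    kb=$(( (bytes + 1023) / 1024 ))
    # vxtune: default unit is bytes, so the "k" multiplier is appended.
    echo "vxtune volpagemod_max_memsz ${kb}k"
    # vxio.conf (Solaris): the value is already interpreted as KB.
    echo "volpagemod_max_memsz=${kb};"
}

# 64 MB expressed in bytes.
print_volpagemod_settings 67108864
```

For 67108864 bytes, the first line printed carries the value 65536k and the second carries 65536, matching the entries used in steps 1 and 2.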

 

Issue/Introduction

Veritas Volume Manager commands hang because of deadlock in the paging module

Additional Information

ETrack: 349267
ETrack: 378132
ETrack: 349288
ETrack: 2024617