Possible unexpected extended service interruption due to loss of access to data when shutting down Veritas Cluster Server (VCS) with both Cluster Volume Manager (CVM) and non-shared disk groups in the configuration

Article ID: 100006297

Description

Error Message

Errors similar to the following will be seen in syslog:

nodeName vxvm:vxconfigd: [ID 702911 daemon.error] V-5-1-583 Disk group DG_Name: Reimport of disk group failed:
nodeName vxvm:vxconfigd: [ID 702911 daemon.error]         Disk group has no valid configuration copies
nodeName vxvm:vxconfigd: [ID 702911 daemon.warning] V-5-1-579 Disk group DG_Name: Errors in some configuration copies:
nodeName vxvm:vxconfigd: [ID 702911 daemon.warning]   Disk c2t24d32s2, copy 1: Block 1: Duplicate record in configuration
nodeName vxvm:vxconfigd: [ID 702911 daemon.warning]   Disk c2t24d41s2, copy 1: Block 1: Duplicate record in configuration
nodeName vxvm:vxconfigd: [ID 702911 daemon.warning]   Disk c2t24d40s2, copy 1: Block 1: Duplicate record in configuration
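
When checking a node for this signature, the messages above can be picked out of a syslog capture with a short script. The following is a minimal sketch, not a supported tool: the function name `scan_syslog` and the report layout are illustrative assumptions, and it assumes the continuation lines for V-5-1-579 follow their header line without unrelated vxconfigd disk group messages in between.

```python
import re
from collections import defaultdict

# UMI codes taken from the messages shown in this article.
REIMPORT_FAILED = "V-5-1-583"
CONFIG_COPY_ERRORS = "V-5-1-579"

# Matches e.g. "V-5-1-583 Disk group DG_Name: Reimport of disk group failed:"
UMI_LINE = re.compile(r"(V-5-1-\d+) Disk group (\S+?):")
# Matches e.g. "Disk c2t24d32s2, copy 1: Block 1: Duplicate record in configuration"
DUPLICATE_LINE = re.compile(
    r"Disk (\S+), copy \d+: Block \d+: Duplicate record in configuration"
)


def scan_syslog(lines):
    """Summarise, per disk group, whether reimport failed and which disks
    reported duplicate configuration records (hypothetical helper)."""
    report = defaultdict(lambda: {"reimport_failed": False, "duplicate_disks": []})
    current_dg = None  # disk group named by the most recent V-5-1-579 line
    for line in lines:
        if "vxvm:vxconfigd" not in line:
            continue
        umi = UMI_LINE.search(line)
        if umi:
            code, dg = umi.groups()
            if code == REIMPORT_FAILED:
                report[dg]["reimport_failed"] = True
                current_dg = None
            elif code == CONFIG_COPY_ERRORS:
                report[dg]["duplicate_disks"]  # make sure the group is listed
                current_dg = dg
            continue
        dup = DUPLICATE_LINE.search(line)
        if dup and current_dg is not None:
            report[current_dg]["duplicate_disks"].append(dup.group(1))
    return dict(report)


if __name__ == "__main__":
    import sys

    # Usage: python scan_vxvm_syslog.py < /var/adm/messages
    for dg, info in scan_syslog(sys.stdin).items():
        print(dg, info)
```

A disk group that shows both a failed reimport and one or more disks with duplicate records matches the signature described in this article.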

Stack backtrace of vxconfigd core file:

#0  0x00092d20 in dbf_fmt_tbl ()
#1  0x0012d680 in voldbf_fmt_tbl ()
#2  0x0012cb6c in voldbsup_format_record ()
#3  0x000907a8 in format_write ()
#4  0x00090284 in ddb_update ()
#5  0x000a48d0 in dg_update ()
#6  0x000a056c in req_dg_flush_common ()
#7  0x000a0134 in req_dg_flush_name ()
#8  0x00109ed0 in request_loop () 

Cause

A defect in the VxVM configuration and administration daemon, vxconfigd, can leave a partially created state-change record in the configuration database: when the state-change transaction is aborted by CVM reconfiguration processing, the record is neither backed out nor cleaned up. This issue does not occur if the entity is part of a shared (cluster) disk group.
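
The effect of an abort that does not clean up after itself can be illustrated with a toy model. This sketch is purely conceptual and does not reflect vxconfigd internals: the class `ToyConfigDB`, its methods, the volume name `vol01`, and the state names are all invented for illustration.

```python
class ToyConfigDB:
    """Toy stand-in for a disk group configuration database (not VxVM code)."""

    def __init__(self, clean_up_on_abort):
        self.records = []  # list of (entity, state) tuples
        self.clean_up_on_abort = clean_up_on_abort

    def begin_state_change(self, entity, state):
        # A state change starts by writing a record for the entity.
        self.records.append((entity, state))

    def abort_state_change(self, entity, state):
        # Correct behaviour backs the partial record out again;
        # the modelled defect leaves it behind.
        if self.clean_up_on_abort:
            self.records.remove((entity, state))

    def duplicate_entities(self):
        # Report every entity that has more than one record.
        seen, duplicates = set(), set()
        for entity, _state in self.records:
            if entity in seen:
                duplicates.add(entity)
            seen.add(entity)
        return sorted(duplicates)


def simulate(clean_up_on_abort):
    db = ToyConfigDB(clean_up_on_abort)
    db.begin_state_change("vol01", "stopping")
    db.abort_state_change("vol01", "stopping")  # aborted by a reconfiguration
    db.begin_state_change("vol01", "stopped")   # next change, same entity
    return db.duplicate_entities()


if __name__ == "__main__":
    print("with cleanup:   ", simulate(clean_up_on_abort=True))
    print("without cleanup:", simulate(clean_up_on_abort=False))
```

In the model, only the run without cleanup ends with two records for the same entity, which corresponds to the "Duplicate record in configuration" messages shown earlier.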

Resolution

This issue is no longer present in VxVM 5.1SP1 and later. All customers are advised to upgrade to VxVM 5.1SP1 or a later release to avoid this and other known issues that have already been corrected in current releases.

If you believe you are at risk of encountering this issue and an upgrade of VxVM is not possible, alter your VCS configuration so that CVM and any volume resources on private disk groups cannot be taken offline simultaneously. This can be accomplished by adding a resource or service group dependency in which your CVM resource or service group is the child.
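
As one way to prepare such a change, the sketch below prints the VCS commands that would link each service group holding private disk group volumes to the CVM service group, for review before anything is run. It is a hedged example under several assumptions: the helper `build_dependency_commands` is hypothetical, the group name `app_sg` is a placeholder, `cvm` is assumed to be the name of the CVM service group in your cluster, and the `online local firm` dependency type is an assumption that should be checked against your own requirements.

```python
import shlex


def build_dependency_commands(parent_groups, cvm_group="cvm"):
    """Return the VCS commands that make cvm_group the child of each
    parent group. Nothing is executed; the caller reviews the output."""
    commands = ["haconf -makerw"]  # open the cluster configuration for writing
    for parent in parent_groups:
        # hagrp -link <parent> <child> <category> <location> <type>
        args = ["hagrp", "-link", parent, cvm_group, "online", "local", "firm"]
        commands.append(" ".join(shlex.quote(arg) for arg in args))
    commands.append("haconf -dump -makero")  # save and close the configuration
    return commands


if __name__ == "__main__":
    # "app_sg" is a placeholder for a group with volumes on a private disk group.
    for command in build_dependency_commands(["app_sg"]):
        print(command)
```

With the CVM group as the child of the dependency, VCS takes the parent groups offline before CVM during shutdown, so the volume resources are no longer changing state while CVM is reconfiguring.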

Issue/Introduction

Veritas Volume Manager (VxVM) versions prior to 5.1SP1 contain an issue that can corrupt disk group metadata and thus prevent disk groups from being imported. Encountering this issue requires a specifically timed sequence of events that has a low probability of occurring. However, this sequence can, on rare occasions, occur during the automated processes that VCS uses to shut itself down.

If an entity within a private (non-cluster) disk group is being created, deleted, or otherwise having its state changed (for example, a volume being stopped or started), and VxVM aborts that action because of a CVM reconfiguration event, and another state change for the same entity then follows, the VxVM configuration database for that disk group can become corrupted with duplicate records. If this occurs, VxVM can no longer properly process the disk group's configuration and therefore cannot import it, causing loss of access to all the data in that disk group until the configuration database is rebuilt. The data contained in that disk group is not affected; it is just not available. Additionally, the primary administration daemon for VxVM (vxconfigd) will crash if it attempts to 'flush' the corrupted configuration to the disks containing the configuration copies for the disk group.

This sequence of events can occur during the normal shutdown of a VCS cluster that contains both CVM and one or more volume resources in the cluster configuration. However, because of the specific timing required, the issue is unlikely to occur even when deliberately attempting to trigger it.

Additional Information

UMI: V-5-1-583
UMI: V-5-1-579
ETrack: 530741