How to start CVM on the slave node when a failed disk prevents "vxclustadm -m vcs -t gab startnode" from working - vxvm: vxconfigd: V-5-1-12673 import_start: disk <disk-id> not found, flags 0x808

Article ID: 100037897

Description

In some circumstances CVM may not start on the slave node even though the disk reported as missing is visible to that node. In such a situation, the VxVM kernel content for the problematic disk may not match the diskgroup configuration database, for example when the disk was previously recovered by clearing its diskgroup name with vxprivutil set dgname="".

Slave node "bashful"

# vxclustadm -m vcs -t gab startnode
VxVM vxclustadm INFO V-5-2-9687 vxclustadm: Fencing driver is in disabled mode

# vxclustadm -v nodestate
state: out of cluster
reason: Disk for disk group not found: retry to add a node failed

VCS engine_A.log messages

Jul 28 14:50:28 bashful vxvm: vxconfigd: V-5-1-12673 import_start: disk 1312382118.46.dopey not found, flags 0x808
Jul 28 14:50:28 bashful vxvm: vxconfigd: V-5-1-11092 cleanup_client: (Disk for disk group not found) 183
Jul 28 14:50:28 bashful vxvm: vxconfigd: V-5-1-11467 kernel_fail_join() : Reconfiguration interrupted: Reason is retry to add a node failed (13, 0)
Jul 28 14:50:28 bashful vxvm: vxconfigd: V-5-1-7901 CVM_VOLD_STOP command received
Jul 28 14:50:28 bashful kernel: VxVM vxio V-5-0-164 Failed to join cluster QBDDTLS_CL, aborting
Jul 28 14:50:28 bashful kernel: VxVM vxio V-5-3-0 joinsio_done: Node aborting, join for node 0 being failed
Jul 28 14:50:28 bashful kernel: VxVM vxio V-5-3-0 abort_joinp: aborting joinp for node 0 with err 11
Jul 28 14:50:28 bashful kernel: VxVM vxio V-5-3-0 joinsio_done: Node aborting, join for node 1 being failed
Jul 28 14:50:28 bashful kernel: VxVM vxio V-5-3-0 abort_joinp: aborting joinp for node 1 with err 11
Jul 28 14:50:28 bashful kernel: GAB INFO V-15-1-20032 Port v closed
Jul 28 14:50:28 bashful kernel: GAB INFO V-15-1-20032 Port w closed
Jul 28 14:50:28 bashful vxvm: vxconfigd: V-5-1-12673 import_start: disk 1312382118.46.dopey not found, flags 0x808
Jul 28 14:50:28 bashful vxvm: vxconfigd: V-5-1-11092 cleanup_client: (Disk for disk group not found) 183
Jul 28 14:50:28 bashful vxvm: vxconfigd: V-5-1-11467 kernel_fail_join() : Reconfiguration interrupted: Reason is retry to add a node failed (13, 0)
Jul 28 14:50:28 bashful vxvm: vxconfigd: V-5-1-7901 CVM_VOLD_STOP command received

On the Master node

# vxdisk -s list > /var/tmp/vxdisk-slist.tlsqbdd3
# grep 1312382118.46.dopey /var/tmp/vxdisk-slist.tlsqbdd3
diskid: 1312382118.46.dopey

On the Slave node

# vxdisk -s list > /var/tmp/vxdisk-slist.tlsqbdd4
# grep 1312382118.46.dopey /var/tmp/vxdisk-slist.tlsqbdd4
diskid: 1312382118.46.dopey

Note: The offending diskid "1312382118.46.dopey" is visible from both the Master and Slave node.
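The visibility check can also be scripted against the two saved captures. The following is a minimal sketch, assuming each capture contains one "diskid:" line per disk as in the grep output shown in this article; the function name diskid_in_capture is illustrative and is not part of VxVM.

```shell
#!/bin/sh
# diskid_in_capture FILE DISKID
# Returns 0 when FILE (a saved "vxdisk -s list" capture) contains a
# "diskid:" line whose value is exactly DISKID, non-zero otherwise.
diskid_in_capture() {
    awk -v id="$2" '
        # Accept both "diskid: X" and "diskid:X" spellings.
        /^[ \t]*diskid:/ {
            line = $0
            sub(/^[ \t]*diskid:[ \t]*/, "", line)
            sub(/[ \t]+$/, "", line)
            if (line == id) found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Example on the captures taken above (not run here):
#   diskid_in_capture /var/tmp/vxdisk-slist.tlsqbdd3 1312382118.46.dopey && echo "visible on master"
#   diskid_in_capture /var/tmp/vxdisk-slist.tlsqbdd4 1312382118.46.dopey && echo "visible on slave"
```

Comparing the whole value avoids a false positive when one diskid is a prefix of another, which a plain grep would not catch.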

vxreattach

In the event that the "vxreattach" command is unable to associate the disk access (da) name back into the impacted diskgroup, another approach is required.

# vxreattach -c
VxVM vxdisk ERROR V-5-1-558 Disk : Disk not in the configuration

The purpose of the "vxreattach" command is to reattach disk drives that have once again become accessible.

The vxreattach utility reattaches (recovers) disks back into the impacted diskgroup they were associated with, retaining the same disk media name.
The utility attempts to locate a disk in the same diskgroup with the same Veritas disk ID for the disk to be reattached.
The reattach operation may fail even after locating the disk with the corresponding disk ID if the original cause (or some other cause) of the disk failure still exists.

Scenario

The "shared" diskgroup "testdg" initially fails to import due to a failed disk on the master node.

# vxdg -s import testdg
VxVM vxdg ERROR V-5-1-10978 Disk group testdg: import failed:
Disk for disk group not found

# vxdg -Cfs import testdg
VxVM vxdg WARNING V-5-1-560 Disk emc0_0281: Not found, last known location: emc0_0281

# vxdisk -eg testdg list
DEVICE       TYPE           DISK         GROUP        STATUS               OS_NATIVE_NAME              ATTR
emc0_0280    auto:sliced    emc1_0280    testdg       online shared        c1t5006048C5368E5A0d324s2   std
-            -              emc0_0281    testdg       failed was:emc0_0281

In this instance, it is not possible to use "vxreattach" to recover the failed disk back into the diskgroup "testdg".

# vxreattach -c emc0_0281
VxVM vxdisk ERROR V-5-1-558 Disk emc0_0281: Disk not in the configuration

Recovery procedure

To obtain the Veritas diskgroup (dg) id from the impacted diskgroup, type:

# vxdg -q list | grep testdg
testdg       enabled,shared,cds              1311240633.41.dopey
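Because grep matches substrings, a diskgroup with a longer name that contains "testdg" would also be listed. The following is a small sketch that selects the exact name and prints only the dgid, assuming the three-column "vxdg -q list" layout shown above; the function name dgid_of is illustrative.

```shell
#!/bin/sh
# dgid_of DGNAME
# Reads "vxdg -q list" output on stdin (NAME STATE ID columns) and prints
# the ID of the diskgroup whose NAME is exactly DGNAME.
dgid_of() {
    awk -v dg="$1" '$1 == dg { print $NF }'
}

# Example on a live system (not run here):
#   vxdg -q list | dgid_of testdg
```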

To obtain the disk attributes from the diskgroup configuration database, type:

# vxprint -g testdg -dF'%last_da_name %name %diskid'
emc0_0281 emc0_0281 1312382118.46.dopey
emc0_0280 emc1_0280 1311240596.39.dopey

To cross-match the diskid "1312382118.46.dopey" recorded in the diskgroup configuration database for the impacted Veritas disk access name "emc0_0281" against the VxVM kernel disk content, type:

Syntax:

# vxdisk -x DISKID -x DGID -p list | grep <dgid>

In this instance, emc0_0281 reports the same diskid in both outputs, which confirms that it is the same disk.

# vxdisk -x DISKID -x DGID -p list | grep 1311240633.41.dopey
emc0_0280    1311240596.39.dopey 1311240633.41.dopey
emc0_0281    1312382118.46.dopey 1311240633.41.dopey    <<<<< this is the failed disk
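The comparison between the configuration database and the kernel view can be sketched as a small script. This assumes both command outputs have been saved to files and that the disk access name is the first column in each, as in the outputs shown in this article; the function name compare_diskids and the file arguments are illustrative.

```shell
#!/bin/sh
# compare_diskids CONFIG_FILE KERNEL_FILE
# CONFIG_FILE: saved output of  vxprint -g <dg> -dF'%last_da_name %name %diskid'
# KERNEL_FILE: saved output of  vxdisk -x DISKID -x DGID -p list
# Prints one line per disk access name found in both files, stating whether
# the diskid in the configuration database equals the one held by the kernel.
compare_diskids() {
    awk '
        # First file: remember the configuration diskid per da name.
        FNR == NR { cfg[$1] = $3; next }
        # Second file: compare against the kernel diskid in column 2.
        ($1 in cfg) {
            printf "%s %s\n", $1, (cfg[$1] == $2 ? "match" : "MISMATCH")
        }
    ' "$1" "$2"
}

# Example (file names are placeholders, not run here):
#   vxprint -g testdg -dF'%last_da_name %name %diskid' > /var/tmp/cfg.out
#   vxdisk -x DISKID -x DGID -p list > /var/tmp/kern.out
#   compare_diskids /var/tmp/cfg.out /var/tmp/kern.out
```

The sketch expects a non-empty CONFIG_FILE; with an empty first file awk cannot tell the two inputs apart and prints nothing useful.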

Clearing the diskgroup name with vxprivutil may not be sufficient for a shared diskgroup, so the following process is recommended. Start by displaying the current header attributes of the problematic disk:

# vxdisk list emc0_0281
Device:    emc0_0281
devicetag: emc0_0281
type:      auto
hostid:    dopey
disk:      name= id=1312382118.46.dopey
group:     name=testdg id=1311240633.41.dopey
info:      format=cdsdisk,privoffset=256,pubslice=2,privslice=2
flags:     online ready private autoconfig autoimport
pubpaths: block=/dev/vx/dmp/emc0_0281s2 char=/dev/vx/rdmp/emc0_0281s2
guid:      {cf7741f4-bddd-11e0-bccf-0003baa707e3}
udid:      EMC%5FSYMMETRIX%5F000290300822%5F2200281000
site:      -
version:   3.1
iosize:    min=512 (bytes) max=2048 (blocks)
public:    slice=2 offset=65792 len=4037248 disk_offset=0   <<<<<<<< offset
private:   slice=2 offset=256 len=65536 disk_offset=0 <<<<<<<<< attributes
update:    time=1312385647 seqno=0.13
ssb:       actual_seqno=0.0
headers:   0 240
configs:   count=1 len=48144
logs:      count=1 len=7296
Defined regions:
config   priv 000048-000239[000192]: copy=01 offset=000000 enabled
config   priv 000256-048207[047952]: copy=01 offset=000192 enabled
log      priv 048208-055503[007296]: copy=01 offset=000000 enabled
lockrgn priv 055504-055647[000144]: part=00 offset=000000
Multipathing information:
numpaths:   2
c1t5006048C5368E580d334s2       state=enabled
c1t5006048C5368E5A0d325s2       state=enabled

Next, reset the VxVM disk header content for the problematic disk. Take care to specify the offset and length attributes exactly as reported in the "public:" and "private:" lines for the disk above, for example:

# vxdisk -f init emc0_0281 privoffset=256 privlen=65536 puboffset=65792 publen=4037248 format=cdsdisk
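To reduce the risk of mistyping an offset, the command line can be assembled from the "vxdisk list" output rather than by hand. The following is a hedged sketch that only prints the command and never runs it; it assumes the "info:", "public:" and "private:" line layout shown above, and the function name build_init_cmd is illustrative.

```shell
#!/bin/sh
# build_init_cmd DANAME
# Reads "vxdisk list DANAME" output on stdin and prints (does not run) the
# corresponding "vxdisk -f init" command line.
build_init_cmd() {
    awk -v da="$1" '
        # Return the value of the first key=value token named "key"
        # on the current line, or "" when there is none.
        function val(key,    i, kv) {
            for (i = 1; i <= NF; i++) {
                split($i, kv, "=")
                if (kv[1] == key) return kv[2]
            }
            return ""
        }
        # info: format=cdsdisk,privoffset=256,pubslice=2,privslice=2
        $1 == "info:" {
            n = split($2, parts, ",")
            for (j = 1; j <= n; j++) {
                split(parts[j], kv2, "=")
                if (kv2[1] == "format") fmt = kv2[2]
            }
        }
        $1 == "public:"  { puboff  = val("offset"); publen  = val("len") }
        $1 == "private:" { privoff = val("offset"); privlen = val("len") }
        END {
            if (fmt == "" || puboff == "" || publen == "" || privoff == "" || privlen == "") {
                print "missing attribute in vxdisk list output" > "/dev/stderr"
                exit 1
            }
            printf "vxdisk -f init %s privoffset=%s privlen=%s puboffset=%s publen=%s format=%s\n", da, privoff, privlen, puboff, publen, fmt
        }
    '
}

# Example on a live system (prints the command for review, not run here):
#   vxdisk list emc0_0281 | build_init_cmd emc0_0281
```

Printing rather than executing keeps a review step in place before the disk header is rewritten.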

Note: Once the disk header content has been cleared, the disk can be associated back with the impacted diskgroup.

Using the "-k" option with vxdg adddisk, it is possible to associate the failed disk back into the impacted diskgroup using the existing disk media (dm) name and corresponding da name:

# vxdg -g testdg -k adddisk emc0_0281=emc0_0281

Note: If the vxdisk -f init command is not performed, the following message may be reported:

# vxdg -g testdg -k adddisk emc0_0281=emc0_0281
VxVM vxdg ERROR V-5-1-10128   Record not in disk group

Once the adddisk operation succeeds, the disk is reported as online in the diskgroup:

# vxdisk -eg testdg list
DEVICE       TYPE           DISK         GROUP        STATUS               OS_NATIVE_NAME              ATTR
emc0_0280    auto:sliced    emc1_0280    testdg       online shared        c1t5006048C5368E5A0d324s2   std
emc0_0281    auto:cdsdisk   emc0_0281    testdg       online shared        c1t5006048C5368E580d334s2   std
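Before moving on, it can help to confirm that no disk in the diskgroup is still reported as failed. The following is a minimal sketch, assuming the "vxdisk -eg <dg> list" column layout used in this article, where a failed record shows "-" in the DEVICE column and a STATUS beginning with "failed"; the function name failed_disks is illustrative.

```shell
#!/bin/sh
# failed_disks
# Reads "vxdisk -eg <dg> list" output on stdin and prints the disk media
# name of every record still reported as failed. The exit status is 0 when
# no failed record is found and 1 otherwise.
failed_disks() {
    awk '
        # Failed record example:
        #   -   -   emc0_0281   testdg   failed was:emc0_0281
        $1 == "-" && $5 == "failed" { print $3; n++ }
        END { exit(n > 0 ? 1 : 0) }
    '
}

# Example on a live system (not run here):
#   vxdisk -eg testdg list | failed_disks && echo "no failed disks in testdg"
```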

Now that the disk has been recovered back into the diskgroup, the volumes can be recovered.

With the disk recovered correctly on the Master node, CVM can be started once again on the Slave node.

# vxclustadm -m vcs -t gab startnode

Note: The process may require the shared diskgroup to be deported from all nodes that form the CVM cluster so that the disk recovery procedure using the "vxdisk -f init" approach documented above can be followed correctly.
