CVM not starting on slave nodes due to missing disks.

Article ID: 100021442

Description

Error Message

root# vxclustadm -v nodestate
state: out of cluster
reason: cannot find disk on slave node

 

Resolution

On an SFCFS cluster, if a node attempts to join as a slave but fails with the error below:
 
root# vxclustadm -v nodestate
state: out of cluster
reason: cannot find disk on slave node
 
compare the number of shared disks seen by the CVM master with the number seen by the CVM slave:
 
root# vxdctl -c mode
mode: enabled: cluster active - MASTER
master: nodeA
 
root@MASTER# vxdisk -o alldgs list | egrep shared | wc -l
129
 
root@SLAVE# vxdisk -o alldgs list | egrep shared | wc -l
119
 
To identify which disk groups contain missing disks, save from each node a list of the disk group that each shared disk belongs to, then compare the lists.
 
On each node (replace $node with the name of the node):
 
root# vxdisk -o alldgs list | egrep shared | awk '{print $4}' > /tmp/$node.shared_disks
 
Note: On the SLAVE node(s), the disk group name is enclosed in parentheses because the disk group is in a deported state there. Strip the '(' and ')' characters so that the names match those on the MASTER. In the 'vi' editor, the following commands effect the changes:
 
:%s/(//g
:%s/)//g
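 
As a non-interactive alternative to editing the file in 'vi', the parentheses can be removed in the pipeline itself. The following is a minimal sketch, not part of VxVM: the function name dg_names is hypothetical, and it assumes the disk group name is the fourth column of the vxdisk output, as in the command above. It tests the last column for the shared flag rather than matching the word anywhere in the line, so a disk group whose name happens to contain "shared" is not listed by mistake.

```shell
# dg_names: read `vxdisk -o alldgs list` output on stdin and print the
# disk group name (4th column) of every disk whose last column is "shared",
# with any surrounding parentheses removed.
# The function name is illustrative only.
dg_names() {
    awk '$NF == "shared" { print $4 }' | tr -d '()'
}

# Usage on a node (requires VxVM, shown for illustration):
#   vxdisk -o alldgs list | dg_names > /tmp/$node.shared_disks
```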
 
Sort the output files:
 
root# sort /tmp/$node.shared_disks > /tmp/$node.shared_disks.sorted
 
Copy the $node.shared_disks.sorted files to a common location and use the 'diff' command to identify differences:
 
root# diff nodeA.shared_disks.sorted nodeB.shared_disks.sorted
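 
Because each file holds one line per disk, the diff output repeats a disk group name once for every missing disk. A per-group count can be easier to read. The sketch below is one possible way to produce it; the function name compare_dg_counts is hypothetical, and it assumes each input file contains one disk group name per line, as produced by the steps above.

```shell
# compare_dg_counts FILE_A FILE_B
# Print "<group> <disks in FILE_A> <disks in FILE_B>" for every disk group
# whose number of lines differs between the two files.
# The function name is illustrative only.
compare_dg_counts() {
    awk '
        FILENAME == ARGV[1] { a[$1]++; next }   # count lines per group in FILE_A
                            { b[$1]++ }         # count lines per group in FILE_B
        END {
            for (g in a) if (a[g] != b[g]) print g, a[g], b[g] + 0
            for (g in b) if (!(g in a))    print g, 0, b[g]
        }
    ' "$1" "$2" | sort
}

# Usage (file names taken from the diff step):
#   compare_dg_counts nodeA.shared_disks.sorted nodeB.shared_disks.sorted
```

The files do not need to be sorted for this comparison, since the counts are accumulated per group name.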
 
For each disk group that differs (sample_shared_dg in this example), list its disks on both nodes.
 
On the MASTER:
 
root@nodeA# vxdisk -o alldgs list | egrep sample_shared_dg
c5t0d0s2 auto:cdsdisk d0 sample_shared_dg online shared
c5t0d1s2 auto:cdsdisk d1 sample_shared_dg online shared
c5t0d2s2 auto:cdsdisk d2 sample_shared_dg online shared
c5t0d3s2 auto:cdsdisk d3 sample_shared_dg online shared
c5t0d4s2 auto:cdsdisk d4 sample_shared_dg online shared
c5t0d5s2 auto:cdsdisk d5 sample_shared_dg online shared
c5t0d6s2 auto:cdsdisk d6 sample_shared_dg online shared
 
On the SLAVE:
 
root@nodeB# vxdisk -o alldgs list | egrep sample_shared_dg
c5t0d0s2 auto:cdsdisk - (sample_shared_dg) online shared
c5t0d1s2 auto:cdsdisk - (sample_shared_dg) online shared
c5t0d2s2 auto:cdsdisk - (sample_shared_dg) online shared
c5t0d3s2 auto:cdsdisk - (sample_shared_dg) online shared
c5t0d4s2 auto:cdsdisk - (sample_shared_dg) online
c5t0d5s2 auto:cdsdisk - (sample_shared_dg) online
c5t0d6s2 auto:cdsdisk - (sample_shared_dg) online
 
The output shows that the SLAVE does not see the same disk information as the MASTER: disks c5t0d4s2, c5t0d5s2 and c5t0d6s2 lack the shared flag on the SLAVE. To rectify this, flush the disk group headers for the disk group from the MASTER node and re-scan the device tree on the SLAVE node(s):
 
MASTER:
 
root@nodeA# vxdg flush sample_shared_dg
 
SLAVE:
 
root@nodeB# vxdctl enable
root@nodeB# vxdisk -o alldgs list | egrep sample_shared_dg
c5t0d0s2 auto:cdsdisk - (sample_shared_dg) online shared
c5t0d1s2 auto:cdsdisk - (sample_shared_dg) online shared
c5t0d2s2 auto:cdsdisk - (sample_shared_dg) online shared
c5t0d3s2 auto:cdsdisk - (sample_shared_dg) online shared
c5t0d4s2 auto:cdsdisk - (sample_shared_dg) online shared
c5t0d5s2 auto:cdsdisk - (sample_shared_dg) online shared
c5t0d6s2 auto:cdsdisk - (sample_shared_dg) online shared
 
Now, nodeB should be able to join the cluster as a SLAVE.
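 
With a large disk group, checking the listing by eye is error-prone. The following is a minimal sketch of a mechanical check, assuming the column layout shown in the listings above (device first, disk group fourth, shared flag last); the function name unshared_disks is hypothetical and is not a VxVM command.

```shell
# unshared_disks DG
# Read `vxdisk -o alldgs list` output on stdin and print the device name of
# every disk in disk group DG whose last column is not "shared".
# The function name is illustrative only.
unshared_disks() {
    awk -v dg="$1" '
        {
            g = $4
            gsub(/[()]/, "", g)   # deported groups are shown in parentheses
        }
        g == dg && $NF != "shared" { print $1 }
    '
}

# Usage on the SLAVE (requires VxVM, shown for illustration):
#   vxdisk -o alldgs list | unshared_disks sample_shared_dg
# No output means every disk in the group carries the shared flag.
```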
 