"VCS ERROR V-16-20006-1005 (hostname) CVMCluster:cvm_clus:monitor:node - state: out of cluster" appears while adding a node back into a cluster

Article ID: 100012265

Description

Error Message

Error message after running hastart:

2014/03/27 16:02:58 VCS ERROR V-16-20006-1005 (hostname) CVMCluster:cvm_clus:monitor:node - state: out of cluster
reason: SCSI-3 PR operation failed: retry to add a node failed

 

Cause

This error may occur if /etc/vx/volboot is set to use SCSI-3 PR (Persistent Reservations) with an array that does not support SCSI-3 PR. The default setting for /etc/vx/volboot is to use SCSI-3 PR.

In a Veritas Cluster Server (VCS) environment, the keys on the data disks are governed by the "groupreserve" option and the "scsi3_pr" setting in the volboot file. The scsi3_pr setting must be the same on every node; inconsistent settings can cause import failures or node join failures.

 

Resolution

If the disk array does not support SCSI-3 PR, ensure that scsi3_pr is set to "off" for all nodes in the cluster. Otherwise, vxconfigd will fail to automatically import shared disks in a Cluster Volume Manager (CVM) environment.

Use vxdctl to determine the current value of scsi3_pr for a node (Figure 1). The value can also be read directly from /etc/vx/volboot, but do not edit this file by hand; use vxdctl to change it.


Figure 1 - Checking and modifying the value of scsi3_pr


Using vxdctl to determine the value of scsi3_pr:

# vxdctl scsi3pr
scsi3pr: on


Using vxdctl to modify the value of scsi3_pr:

# vxdctl scsi3pr off


Restarting vxconfigd:

# vxconfigd -k -x syslog  

Warning: Before restarting vxconfigd, be aware of the effect that this has on other applications that depend on this daemon. Clustering software, such as Veritas Cluster Server (VCS), may initiate a service group failover in response to a vxconfigd restart. If VCS is present, the affected service groups can be frozen before the restart to prevent failovers from occurring.

In some cases, vxconfigd may be unable to completely restart, or may even cause the system to become unstable, due to misconfigurations or existing corruption within kernel space. This may leave the server in a precarious state that requires a reboot to resolve. 

For these reasons, restarting vxconfigd should be performed during maintenance windows, or during off-peak production hours.
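
The freeze mentioned in the warning above could be planned roughly as follows. This is only a sketch: the service group names are hypothetical placeholders, the function prints the commands instead of running them, and the use of hagrp -freeze and hagrp -unfreeze for a temporary freeze should be checked against the VCS documentation for the installed release.

```shell
#!/bin/sh
# Sketch: print a vxconfigd restart bracketed by a temporary freeze of each
# service group. Nothing is executed; review the output before running it.
# Assumption: the group names passed as arguments are hypothetical.
restart_plan() {
    # Freeze every listed service group first.
    for group in "$@"; do
        echo "hagrp -freeze $group"
    done
    # Restart command as shown in Figure 1.
    echo "vxconfigd -k -x syslog"
    # Unfreeze the same groups afterwards.
    for group in "$@"; do
        echo "hagrp -unfreeze $group"
    done
}
```

For example, restart_plan appsg1 appsg2 prints five commands that can be reviewed and then run by hand during the maintenance window.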



Note: The value of scsi3_pr should match on all nodes in the cluster.
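
As a rough illustration of the note above, the setting reported by each node can be collected and compared. This is a minimal sketch under stated assumptions: NODES is a hypothetical space-separated list of node hostnames, root ssh access to each node is assumed, and the output format "scsi3pr: on" is taken from Figure 1.

```shell
#!/bin/sh
# Sketch: check that every node reports the same scsi3pr value.
# Assumptions: NODES (hypothetical) lists the cluster node hostnames and
# ssh to each node works without a password prompt.

# Extract "on" or "off" from output such as "scsi3pr: on".
parse_scsi3pr() {
    sed -n 's/^scsi3pr:[[:space:]]*//p'
}

# Query each node and print one "node value" pair per line.
collect() {
    for node in $NODES; do
        value=$(ssh -n "$node" vxdctl scsi3pr | parse_scsi3pr)
        echo "$node $value"
    done
}

# Read "node value" pairs on stdin; succeed only if input is non-empty
# and every value equals the first one.
all_match() {
    awk 'NR == 1 { first = $2 }
         $2 != first { bad = 1 }
         END { if (NR == 0 || bad) exit 1 }'
}
```

A possible invocation is NODES="node1 node2"; collect | all_match || echo "scsi3pr differs between nodes", where node1 and node2 stand in for the real hostnames.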
                

 

 

Applies To

Cluster is using server-based fencing with a Coordination Point Server (CPS).
 

Issue/Introduction

An error occurs when attempting to add a node back into a cluster after the node has been reinstalled.

Additional Information

UMI: V-16-20006-1005