Cluster node fails to join cluster, goes to REMOTE_BUILD state but doesn't reach RUNNING

Article ID: 100025280

Updated On:

Description

Error Message

a) The Veritas High Availability engine service starts and GAB membership is attained:

GAB INFO  V-15-1-20036 Port h gen 63d1da membership 01.

b) The second node logs that the GAB-HAD heartbeat is not occurring. For example, messages like the following appear until the timeout is reached:

GAB WARN V-15-1-20057 Port h process inactive 7 sec

c) GAB terminates HAD on the second node:

 GAB INFO V-15-1-20032 Port h closed

d) Service Control Manager reports that the "Veritas High Availability engine terminated unexpectedly".
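When reviewing a long log, it can help to check mechanically whether the a) to c) sequence is present. The following Python is a minimal sketch, not a supported tool: it assumes the log has already been read into a list of text lines, and it matches only the three message IDs quoted in this article. The function names `classify` and `matches_failed_join` are hypothetical, and real log lines may carry timestamps or other prefixes, which the search pattern tolerates.

```python
import re

# Message IDs taken from the log excerpts in this article.
GAB_EVENTS = {
    "V-15-1-20036": "membership",  # Port h membership attained
    "V-15-1-20057": "inactive",    # Port h process inactive N sec
    "V-15-1-20032": "closed",      # Port h closed (HAD terminated)
}

# Matches a GAB INFO/WARN message for port h anywhere in the line.
_LINE = re.compile(r"GAB\s+(?:INFO|WARN)\s+(V-15-1-\d+)\s+Port h\b")


def classify(lines):
    """Return the ordered list of known GAB port h events found in lines."""
    events = []
    for line in lines:
        match = _LINE.search(line)
        if match and match.group(1) in GAB_EVENTS:
            events.append(GAB_EVENTS[match.group(1)])
    return events


def matches_failed_join(lines):
    """True if membership is followed by an inactive warning and then a close."""
    events = classify(lines)
    if "membership" not in events:
        return False
    tail = events[events.index("membership") + 1:]
    return (
        "inactive" in tail
        and "closed" in tail
        and tail.index("inactive") < tail.index("closed")
    )


# The three sample messages quoted in this article.
sample = [
    "GAB INFO  V-15-1-20036 Port h gen 63d1da membership 01.",
    "GAB WARN V-15-1-20057 Port h process inactive 7 sec",
    " GAB INFO V-15-1-20032 Port h closed",
]
print(matches_failed_join(sample))
```

A match only shows that the log follows the pattern described here. It does not identify the cause, which still has to be confirmed by checking the LLT links.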

 

Cause

In this scenario there may be a faulty LLT link: the NIC, the network switch, or the patch cable may be introducing transient network errors. These errors can prevent HAD from communicating between the nodes, for example when the second node must build its configuration from another node, or can cause GAB membership changes that force GAB to terminate HAD.

Resolution

Run "lltstat" from a command prompt and review the output for LLT errors. Troubleshoot the LLT network components (NIC, network switch, and patch cable) for any faults.

 

 

Issue/Introduction

A cluster node starts and attempts to join a cluster in which HAD (the Storage Foundation for Windows High Availability Daemon) is already in the RUNNING state on another node. The starting node is unable to join the cluster. On that node, GAB logs that it has joined the cluster membership, and Service Control Manager logs that the Veritas High Availability engine successfully entered the running state. However, HAD on the second node is unable to heartbeat with GAB, so GAB immediately starts logging "process inactive" warnings until the timeout is reached, at which point it logs "Port h closed", which means the cluster engine was terminated. In terms of cluster state, the engine on the second node moves into the REMOTE_BUILD state but never reaches the RUNNING state. Note that the "Veritas High Availability Engine" service running is not the same as the engine state RUNNING.