VCS high availability daemon (HAD) died and was restarted by HASHADOW

Article ID: 100000605

Description

Error Message

From syslog(*) file:

GAB INFO V-15-1-20041 Port h: client process failure: killing process
GAB WARNING V-15-1-20035 Port h attempting to kill process due to client process failure

From hashadow_A.log file:

VCS WARNING V-16-1-11277 Retry Interval is 10 at fault 1
VCS NOTICE V-16-1-11107 Restarting VCS with following arguments
 /opt/VRTSvcs/bin/had   -restart

(*) Note: on AIX systems, the GAB warnings are logged to /var/adm/streams/error.MM-DD (MM: month 1-12, DD: day 1-31).

Cause

A HAD restart is typically seen on systems running under heavy load. When HAD stops sending heartbeat messages to GAB for longer than VCS_GAB_TIMEOUT (15000 milliseconds by default), GAB attempts to kill HAD or panic the system, depending on configuration. If HAD is killed, the HASHADOW process restarts it so that heartbeat messages to GAB resume on a regular basis.
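The timeout rule described above can be illustrated with a minimal sketch. This is not GAB's implementation; the function names and the panic_on_timeout flag are hypothetical stand-ins for "per configuration", and only the 15000 ms default comes from this article.

```python
# Illustrative model of the heartbeat timeout described above.
# Not GAB code: names and the panic_on_timeout flag are hypothetical.

VCS_GAB_TIMEOUT_MS = 15000  # default value quoted in this article


def heartbeat_expired(last_heartbeat_ms, now_ms, timeout_ms=VCS_GAB_TIMEOUT_MS):
    """Return True when the gap since the last heartbeat exceeds the timeout."""
    return (now_ms - last_heartbeat_ms) > timeout_ms


def gab_action(last_heartbeat_ms, now_ms, panic_on_timeout=False):
    """Return the action this model would take: 'none', 'kill_had' or 'panic'."""
    if not heartbeat_expired(last_heartbeat_ms, now_ms):
        return "none"
    # Which branch applies on a real system depends on its configuration.
    return "panic" if panic_on_timeout else "kill_had"
```

In this model a gap of exactly 15000 ms is still tolerated; only a longer gap triggers an action, matching the "longer than" wording above.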

Resolution

Check system health to determine whether any resource issues are contributing to the HAD restart sequence.

No action is required from the administrator unless the messages below appear in /var/VRTSvcs/log/hashadow_A.log and neither the had nor the hashadow daemon can be found in the process list.

VCS WARNING V-16-1-11277 Retry Interval is 10 at fault 1
VCS NOTICE V-16-1-11107 Restarting VCS with following arguments
 /opt/VRTSvcs/bin/had   -restart
VCS WARNING V-16-1-11277 Retry Interval is 20 at fault 2
VCS NOTICE V-16-1-11107 Restarting VCS with following arguments
 /opt/VRTSvcs/bin/had   -restart
VCS WARNING V-16-1-11277 Retry Interval is 30 at fault 3
VCS NOTICE V-16-1-11107 Restarting VCS with following arguments
 /opt/VRTSvcs/bin/had   -restart
VCS WARNING V-16-1-11277 Retry Interval is 40 at fault 4
VCS NOTICE V-16-1-11107 Restarting VCS with following arguments
 /opt/VRTSvcs/bin/had   -restart
VCS WARNING V-16-1-11277 Retry Interval is 50 at fault 5
VCS NOTICE V-16-1-11107 Restarting VCS with following arguments
 /opt/VRTSvcs/bin/had   -restart

In that case, check engine_A.log for details on why HAD was unable to start.
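The check described in this section could be scripted roughly as follows. This is a sketch under stated assumptions: the helper names are hypothetical, the fault threshold of 5 is taken from the sample log above rather than from any documented limit, and the caller is expected to supply the list of running process names (for example, gathered from ps output).

```python
import re

# Matches the V-16-1-11277 lines shown in the hashadow_A.log excerpt above.
RETRY_RE = re.compile(r"V-16-1-11277 Retry Interval is (\d+) at fault (\d+)")


def parse_restart_attempts(log_text):
    """Return (retry_interval, fault_number) tuples in the order logged."""
    return [(int(m.group(1)), int(m.group(2))) for m in RETRY_RE.finditer(log_text)]


def needs_attention(log_text, running_processes, fault_threshold=5):
    """True when repeated restarts were logged and neither daemon is running.

    fault_threshold=5 is an assumption based on the sample log, not a
    documented product limit.
    """
    attempts = parse_restart_attempts(log_text)
    max_fault = max((fault for _, fault in attempts), default=0)
    daemons_missing = {"had", "hashadow"}.isdisjoint(running_processes)
    return max_fault >= fault_threshold and daemons_missing


# Sample input built from the log format shown in this article.
SAMPLE_LOG = "\n".join(
    f"VCS WARNING V-16-1-11277 Retry Interval is {n * 10} at fault {n}\n"
    "VCS NOTICE V-16-1-11107 Restarting VCS with following arguments\n"
    " /opt/VRTSvcs/bin/had   -restart"
    for n in range(1, 6)
)

if __name__ == "__main__":
    print(parse_restart_attempts(SAMPLE_LOG))
    print(needs_attention(SAMPLE_LOG, ["init", "sshd"]))
```

If needs_attention returns True, the next step is the manual one described above: review engine_A.log to find out why HAD could not start.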
