When a system panics, VCS fails to fail over the IP resource to another system


Article ID: 100004360



Description

Error Message

2010/12/22 13:58:52 VCS INFO V-16-1-10493 Evaluating nairn as potential target node for group Vertica_169-26-4-128_SG
2010/12/22 13:58:52 VCS NOTICE V-16-1-10301 Initiating Online of Resource Vertica-169-26-4-128-IP (Owner: unknown, Group: Vertica_169-26-4-128_SG) on System nairn
2010/12/22 13:58:53 VCS WARNING V-16-10031-4604 (nairn) IP:Vertica-169-26-4-128-IP:online:Address 169.26.4.128 already exists: Res Vertica-169-26-4-128-IP will not go online.
2010/12/22 14:00:54 VCS ERROR V-16-2-13066 (nairn) Agent is calling clean for resource(Vertica-169-26-4-128-IP) because the resource is not up even after online completed. ...
2010/12/22 14:00:55 VCS ERROR V-16-1-10205 Group Vertica_169-26-4-128_SG is faulted on system nairn
2010/12/22 14:00:55 VCS NOTICE V-16-1-10446 Group Vertica_169-26-4-128_SG is offline on system nairn
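
To check whether a failed failover matches this article, it can help to pull only the 4604 warnings out of a log. The following is a minimal sketch, not a Veritas-supplied tool: the function name find_stale_ip_warnings is hypothetical, and the log path in the usage comment is an assumption about a default installation.

```shell
#!/bin/sh
# Hypothetical helper: print "<address> <resource>" for each IP agent
# warning V-16-10031-4604 found in the given VCS log file.
# Usage (log path is an assumption, adjust for your installation):
#   find_stale_ip_warnings /var/VRTSvcs/log/engine_A.log
find_stale_ip_warnings() {
    # $1: log file to scan
    # Keep only lines carrying the 4604 message and extract the two fields.
    sed -n 's/.*V-16-10031-4604 .*Address \([^ ]*\) already exists: Res \([^ ]*\) will not go online.*/\1 \2/p' "$1"
}
```

Each output line names the address that was still plumbed and the resource that could not go online.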
 

Cause

VCS attempts to bring the service group online on the target system before the operating system on the I/O-fenced node has taken down the IP address.

Resolution

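Increase the value of the OnlineRetryLimit attribute for the IP resource type. With more retries, the online operation is attempted again after the fenced node has had time to release the address.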
# haconf -makerw
# hatype -modify IP OnlineRetryLimit 5
# haconf -dump -makero
To verify the setting:
# hatype -display IP | grep -i OnlineRetryLimit
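
The effect of a higher retry limit can be illustrated with a toy model. This sketch is not VCS code and does not call any VCS command. It assumes that a resource gets one initial online attempt plus up to OnlineRetryLimit retries, and the names try_online, address_still_plumbed, and checks_left are hypothetical.

```shell
#!/bin/sh
# Toy model only (assumption: total attempts = 1 + retry limit).
# try_online LIMIT CHECK_CMD
#   CHECK_CMD returns 0 while the address is still plumbed elsewhere.
try_online() {
    limit=$1; shift
    attempt=0
    # One initial attempt plus up to $limit retries.
    while [ "$attempt" -le "$limit" ]; do
        attempt=$((attempt + 1))
        if ! "$@"; then
            echo "online succeeded on attempt $attempt"
            return 0
        fi
        echo "attempt $attempt: address already exists"
    done
    echo "resource faulted after $attempt attempts"
    return 1
}

# Hypothetical stand-in for the panicked node: the address stays
# plumbed for the first two checks and is released afterwards.
checks_left=2
address_still_plumbed() {
    if [ "$checks_left" -gt 0 ]; then
        checks_left=$((checks_left - 1))
        return 0
    fi
    return 1
}

try_online 5 address_still_plumbed
```

In this model a limit of 0 faults the resource on the first attempt, while a limit of 5 leaves enough attempts for the address to be released. On a real cluster the time between attempts depends on the agent's own timing, which this sketch does not represent.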
 


Issue/Introduction

When a system panics, its IP address can remain plumbed on that system for a short time. During that window, VCS may fail to bring the IP resource online on another system, so the failover does not succeed. This can be observed when a system panics during I/O fencing. The IP agent then logs warning 4604, shown here as the agent's logging call:
VCSAG_LOG_MSG("W", "Address $Address already exists: Res $ResName will not go online.", 4604, "$Address", "$ResName");