Veritas Cluster Server (VCS) behavior when ManageFaults = NONE

Article ID: 100029449

Resolution

A resource enters the ADMIN_WAIT state when the ManageFaults attribute for the service group is set to NONE and one of the following conditions is met:

0: The offline entry point did not complete within the expected time.
1: The offline entry point was ineffective.
2: The online entry point did not complete within the expected time.
3: The online entry point was ineffective.
4: The resource was taken offline unexpectedly.
5: The monitor entry point consistently failed to complete within the expected time.

When VCS places a resource in the ADMIN_WAIT state, it calls the resadminwait trigger as follows:

/opt/VRTSvcs/bin/triggers/resadminwait <system> <resource> <adminwait_reason>

Where <system> is the system on which the resource entered the ADMIN_WAIT state, <resource> is the name of the resource in the ADMIN_WAIT state, and <adminwait_reason> is the numerical value from the list above that identifies the condition which caused the ADMIN_WAIT state.
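As an illustration of how a custom trigger might use these arguments, the following is a minimal sketch, assuming the trigger receives the system, the resource name, and the numerical reason in that order. The function names and the sample system and resource names are hypothetical; the reason text is copied from the list above.

```shell
#!/bin/sh
# Hypothetical helpers for a custom resadminwait trigger (illustrative only).

# Translate the numerical reason into the condition text from this article.
describe_adminwait_reason() {
    case "$1" in
        0) echo "offline entry point did not complete within the expected time" ;;
        1) echo "offline entry point was ineffective" ;;
        2) echo "online entry point did not complete within the expected time" ;;
        3) echo "online entry point was ineffective" ;;
        4) echo "resource was taken offline unexpectedly" ;;
        5) echo "monitor entry point consistently failed to complete within the expected time" ;;
        *) echo "unknown reason code: $1" ;;
    esac
}

# Build one log line from the three trigger arguments.
# $1 = system, $2 = resource, $3 = numerical reason (assumed order)
format_adminwait_message() {
    echo "resource $2 on $1 entered ADMIN_WAIT (reason $3: $(describe_adminwait_reason "$3"))"
}

# Example with made-up names
format_adminwait_message sysA app_res 4
```

A real trigger would typically write this line to a log or notify an administrator before any attempt to clear the state.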

To clear the ADMIN_WAIT state, take the following steps (either by administrative intervention or inside the resadminwait trigger):

1. Take the necessary actions outside VCS to resolve the condition which induced the ADMIN_WAIT state.

2. Verify that the condition is resolved by issuing the command:
# hagrp -clearadminwait <group> -sys <system>

Where <group> is the group to which the resource in the ADMIN_WAIT state belongs, and <system> is the system where the resource is in the ADMIN_WAIT state.

VCS returns the resource to its previous state, at which point further commands may be issued, and schedules the monitor for this resource as usual. If the condition which caused the ADMIN_WAIT state has not been resolved, the resource returns to the ADMIN_WAIT state.
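For scripted use, it can help to assemble the command and review it before running it. The following is a dry-run sketch only: it prints the command line rather than executing it, and it assumes the group name is placed immediately before -sys. The function name and the sample group and system names are hypothetical.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the clear command instead of running it.
# $1 = service group, $2 = system
# Assumption: the group name precedes the -sys option.
build_clearadminwait_command() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "usage: build_clearadminwait_command <group> <system>" >&2
        return 1
    fi
    echo "hagrp -clearadminwait $1 -sys $2"
}

# Example with made-up names
build_clearadminwait_command app_grp sysA
```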

If resources return to the ADMIN_WAIT state, you can repeat step 1 and step 2, or issue the following command to stop VCS from setting the resource to the ADMIN_WAIT state:
# hagrp -clearadminwait -fault <group> -sys <system>

The "-fault" option has the following effects:

- If the resadminwait trigger was called for reasons 0 or 1, the resource state is set as ONLINE|UNABLE_TO_OFFLINE.

- If the resadminwait trigger was called for reasons 2, 3, or 4, the resource state is set as FAULTED. VCS will treat this as a "normal" FAULT, and will propagate the FAULT up the dependency tree and/or fail over the service group according to the configuration.

Note: When resources are set as FAULTED for these reasons, the clean entry point is not called. Verify that resources in the ADMIN_WAIT state are in a clean, OFFLINE state before invoking this command.
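The mapping from trigger reason to the state that results from the "-fault" option can be summarized in a small lookup. This is a sketch for illustration, with a hypothetical function name; reasons not covered by the two cases above are reported as UNSPECIFIED, because this article does not describe them.

```shell
#!/bin/sh
# Hypothetical lookup: resource state after "hagrp -clearadminwait -fault",
# keyed by the reason code passed to the resadminwait trigger.
state_after_fault_clear() {
    case "$1" in
        0|1)   echo "ONLINE|UNABLE_TO_OFFLINE" ;;  # offline entry point problems
        2|3|4) echo "FAULTED" ;;                   # treated as a normal fault
        *)     echo "UNSPECIFIED" ;;               # not described in this article
    esac
}

# Example: print the outcome for each reason code in the list
for reason in 0 1 2 3 4 5; do
    echo "reason $reason -> $(state_after_fault_clear "$reason")"
done
```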

For information related to VCS automatic fault management, see Appendix D of the VCS User's Guide for descriptions of the following service group level attributes:

AutoFailOver
FaultPropagation
ManageFaults

Note:

When ManageFaults is set to NONE for the service group and resources enter the ADMIN_WAIT state, that state does not persist across restarts of the VCS engine (HAD) or the agent. After HAD or the agent restarts, the monitor reports the resource as OFFLINE rather than ADMIN_WAIT.
