VCS WARNING V-16-10001-19526 (unix1) LDom:prd2_ldom:monitor:The guest operating system is heavily loaded.
Then the LDom is reported as being offline.
VCS ERROR V-16-2-13067 (unix1) Agent is calling clean for resource(LDOM_prd02) because the resource became OFFLINE unexpectedly, on its own.
This behaviour is by design: VCS fails over or restarts LDoms that appear hung at 100% CPU usage.
If it is undesirable for VCS to offline the LDoms, set the MonitorCPU attribute to 0 for the resource:
# haconf -makerw
# hares -modify prod2_ldom MonitorCPU 0
# haconf -dump -makero
Alternatively, change the default behaviour for all LDom resources:
# haconf -makerw
# haattr -default LDom MonitorCPU 0
# haconf -dump -makero
To reduce the sensitivity to CPU usage reaching 100% instead, increase the ToleranceLimit attribute for the LDom type (the configuration must be writable, as in the commands above):
# hatype -modify LDom ToleranceLimit 2
This delays the agent's declaration of the resource as faulted by 2 consecutive monitor cycles in which the LDom CPU is 100% busy.
Please note that increasing ToleranceLimit also delays resource failover in actual fault conditions, so choose a
value for the ToleranceLimit attribute that suits local cluster requirements.
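As a rough way to weigh that trade-off, the sketch below estimates how many extra seconds pass before a fault is declared for a given ToleranceLimit. It assumes each tolerated monitor cycle adds one MonitorInterval of delay. The function name extra_delay_seconds is hypothetical, and the 60-second interval is an assumed value, not taken from this article; check the actual setting on the cluster, for example with `hatype -value LDom MonitorInterval`.

```shell
#!/bin/sh
# Sketch only: estimate the extra fault-detection delay caused by ToleranceLimit.
# Assumption: every tolerated monitor cycle adds one MonitorInterval of delay.

# extra_delay_seconds is a hypothetical helper, not a VCS command.
extra_delay_seconds() {
    tolerance_limit="$1"     # value passed to: hatype -modify LDom ToleranceLimit <n>
    monitor_interval="$2"    # seconds between monitor cycles (assumed; verify locally)
    echo $(( tolerance_limit * monitor_interval ))
}

# ToleranceLimit 2 with an assumed 60-second MonitorInterval
extra_delay_seconds 2 60
```

Under those assumptions, ToleranceLimit 2 adds about 120 seconds before VCS reacts, both to a busy-but-healthy LDom and to a genuinely failed one.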
Applies To
VCS on LDOM environments.