How to disable resource utilization messages reported by Veritas Cluster Server (VCS) for CPU and swap

Article ID: 100005405

Description

Error Message

root@node1 # grep CPU /var/VRTSvcs/log/engine_A.log
2010/09/15 09:40:30 VCS INFO V-16-1-50086 CPU usage on node1 is 63%
2010/09/15 09:42:30 VCS NOTICE V-16-1-50086 CPU usage on node1 is 78%
2010/09/15 15:18:56 VCS WARNING V-16-1-50086 CPU usage on node1 is 87%
2010/09/16 16:06:26 VCS CRITICAL V-16-1-50086 CPU usage on node1 is 95%
2010/10/04 15:53:54 VCS INFO V-16-1-50086 CPU usage on node1 is 68%
2010/10/04 15:54:24 VCS NOTICE V-16-1-50086 CPU usage on node1 is 76%
2010/10/10 12:28:24 VCS WARNING V-16-1-50086 CPU usage on node1 is 87%
2011/01/06 10:45:27 VCS CRITICAL V-16-1-50086 CPU usage on node1 is 100%
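To get a sense of how often these messages appear, the log entries can be grouped by severity. The following is a minimal sketch, assuming the log line layout shown above (date, time, "VCS", severity, message ID); the function name count_cpu_msgs is hypothetical and not part of VCS.

```shell
#!/bin/sh
# Count HostMonitor CPU usage messages (V-16-1-50086) per severity level.
# Assumes the field layout shown in the log excerpt:
#   $1 date, $2 time, $3 "VCS", $4 severity, $5 message ID.
# Usage: count_cpu_msgs /var/VRTSvcs/log/engine_A.log
count_cpu_msgs() {
    awk '$5 == "V-16-1-50086" && /CPU usage/ { n[$4]++ }
         END { for (s in n) print s, n[s] }' "$1" | sort
}
```

Calling count_cpu_msgs with the path to engine_A.log prints one line per severity with its count, which helps to decide whether the messages are frequent enough to justify disabling HostMonitor.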

Resolution

1. The HostMonLogLvl attribute cannot be modified while VCS is running. Attempting to change it online returns the following warning

root@node1 # haconf -makerw
root@node1 # haclus -modify HostMonLogLvl DisableHMAgent
VCS WARNING V-16-1-13336 This attribute may not be modified while VCS is running

root@node1 # hastatus -sum

-- SYSTEM STATE
-- System               State                Frozen

A  node1              RUNNING              0
A  node2              RUNNING              0

root@node1 # haconf -dump -makero
 

2. Stop the cluster on all nodes

root@node1 # hastop -all
root@node1 # hastatus -sum

VCS ERROR V-16-1-10600 Cannot connect to VCS engine
VCS WARNING V-16-1-11046 Local system not available

3. Edit the VCS configuration file on the first node and add the HostMonLogLvl line marked below.

root@node1 # cd /etc/VRTSvcs/conf/config
root@node1 # vi main.cf

cluster myrac (
       UserNames = { admin = hqrJqlQnrMrrPzrLqo,
                z_myzone_node2 = aHIbGJhHEgHLfGGaEE,
                z_myzone_node1 = gNLjOJkQKjOHnLKjNV,
                z_facaddwlk3_zone_node2 = bopHpjKlpKoqOmpInj,
                z_facaddwlk3_zone_node1 = HMMoKKkINkKTlIKmNQ }
       Administrators = { admin }
       HacliUserLevel = COMMANDROOT 
       HostMonLogLvl = DisableHMAgent        <<<< Add this line
       )

system node1 (
       )

"main.cf" 227 lines, 4203 characters

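The manual edit in step 3 could also be scripted. The following is a minimal sketch, assuming the cluster definition starts with a line beginning with "cluster" and ends with a closing parenthesis on a line of its own, as in the excerpt above. The function name add_hostmonloglvl is hypothetical. It writes the modified file to standard output and leaves the original untouched.

```shell
#!/bin/sh
# Print main.cf with "HostMonLogLvl = DisableHMAgent" added to the cluster
# definition, unless the attribute is already set there.
# Usage: add_hostmonloglvl main.cf > main.cf.new
add_hostmonloglvl() {
    awk '
        # Start of the cluster definition.
        /^cluster[ \t]/ { in_cluster = 1 }
        # Remember if the attribute is already present in the block.
        in_cluster && /HostMonLogLvl[ \t]*=/ { seen = 1 }
        # Closing parenthesis of the cluster definition: add the line before it.
        in_cluster && /^[ \t]*\)[ \t]*$/ {
            if (!seen) print "\tHostMonLogLvl = DisableHMAgent"
            in_cluster = 0
        }
        { print }
    ' "$1"
}
```

Review the generated file before copying it over main.cf, keep a backup of the original, and run the verification in the next step. As with the manual edit, this should only be done while the cluster is stopped.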
4. Verify configuration

root@node1 # hacf -verify .

5. Make sure the HostMonitor agent process is no longer running on any node in the cluster

root@node1 # ps -ef |grep -i host
   root  6726 29901   0 11:33:36 pts/10      0:00 grep -i host
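The ps output above lists only the grep command itself, which is easy to misread. A small filter can hide that line. This is a minimal sketch; it assumes the process command line contains the string "HostMonitor" (the article does not give the exact binary name), and the function name hostmon_procs is hypothetical.

```shell
#!/bin/sh
# Print processes whose command line mentions HostMonitor, excluding any
# grep command. Reads `ps -ef` output from standard input.
# Exit status is 0 if at least one matching process is found, 1 otherwise.
# Usage: ps -ef | hostmon_procs
hostmon_procs() {
    grep -i 'hostmonitor' | grep -v 'grep'
}
```

Run on each node as `ps -ef | hostmon_procs`: no output and a non-zero exit status suggest that no HostMonitor process is running.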

6. Start the cluster on node1 first so that it comes up with the new main.cf

root@node1 # hastart

7. Start the cluster on the remaining node

root@node2 # hastart

root@node2 # hastatus -sum

-- SYSTEM STATE
-- System               State                Frozen

A  node1              RUNNING              0
A  node2              RUNNING              0

8. Verify that the HostMonitor agent is not started after this change and that no new messages are logged in engine_A.log

root@node1 # ps -ef |grep -i host
   root  9944 29901   0 11:35:43 pts/10      0:00 grep -i host

root@node2 # ps -ef |grep -i host
   root 29555 26654   0 11:35:03 pts/1       0:00 grep -i host


2011/03/07 11:33:46 VCS NOTICE V-16-1-11022 VCS engine (had) started

root@node1 # grep CPU /var/VRTSvcs/log/engine_A.log
2010/09/15 09:40:30 VCS INFO V-16-1-50086 CPU usage on node1 is 63%
2010/09/15 09:42:30 VCS NOTICE V-16-1-50086 CPU usage on node1 is 78%
2010/09/15 15:18:56 VCS WARNING V-16-1-50086 CPU usage on node1 is 87%
2010/09/16 16:06:26 VCS CRITICAL V-16-1-50086 CPU usage on node1 is 95%
2010/10/04 15:53:54 VCS INFO V-16-1-50086 CPU usage on node1 is 68%
2010/10/04 15:54:24 VCS NOTICE V-16-1-50086 CPU usage on node1 is 76%
2010/10/10 12:28:24 VCS WARNING V-16-1-50086 CPU usage on node1 is 87%
2011/01/06 10:45:27 VCS CRITICAL V-16-1-50086 CPU usage on node1 is 100%
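Comparing the grep output against the engine start time by eye is error-prone on a long log. The following is a minimal sketch that prints only the CPU usage messages logged after a given timestamp, assuming the "YYYY/MM/DD HH:MM:SS" format shown above. The function name cpu_msgs_since is hypothetical.

```shell
#!/bin/sh
# Print HostMonitor CPU usage messages (V-16-1-50086) logged after a given
# timestamp. Timestamps of the form "YYYY/MM/DD HH:MM:SS" sort correctly as
# plain strings, so a string comparison is sufficient.
# Usage: cpu_msgs_since "2011/03/07 11:33:46" /var/VRTSvcs/log/engine_A.log
cpu_msgs_since() {
    awk -v since="$1" '$5 == "V-16-1-50086" && ($1 " " $2) > since' "$2"
}
```

Called with the engine start time shown above, an empty result indicates that no CPU usage messages have been logged since the restart.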

 

Applies To

VCS 5.0MP3 and later, up to 6.1

Issue/Introduction

VCS starts the HostMonitor daemon when the VCS engine comes up and uses it to monitor CPU and swap utilization. When usage crosses the thresholds defined for these resources, VCS writes a message to the engine log. Some system administrators may prefer to disable this feature and rely on third-party system monitoring software instead.