VCS INFO V-16-20002-211 (gsp-pcmdb1) Oracle:OPCMP1:monitor:Monitor procedure /opt/VRTSagents/ha/bin/Oracle/SqlTest.pl returned the output: ERROR:
ORA-28002: the password will expire within 7 days
VCS ERROR V-16-2-13067 (gsp-pcmdb1) Agent is calling clean for resource(OPCMP1) because the resource became OFFLINE unexpectedly, on its own.
If the DetailMonitor script SqlTest.pl receives any Oracle error (ORA-XXXXX) from the sqlplus commands, it returns OFFLINE to the Oracle agent together with the ORA-XXXXX code it received. The Oracle agent (/opt/VRTSagents/ha/bin/Oracle/OracleAgent) then processes this code according to oraerror.dat.
For details, refer to the section "How the agent handles Oracle error codes during detail monitoring" in the VCS Agent for Oracle Installation and Configuration Guide.
One possible cause for the agent returning OFFLINE is that the oraerror.dat entries could not be read by the agent when it started.
According to the table "Predefined agent actions for Oracle errors" in that guide, if the oraerror.dat file is not available, the agent assumes the default behavior of returning OFFLINE (FAILOVER). Note that the agent also returns OFFLINE if the oraerror.dat file is empty or contains no valid entries.
Note that the Oracle agent loads the oraerror.dat file only when the agent starts. If the file is modified, the agent must be restarted before it recognizes the change.
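The lookup and fallback behavior described above can be sketched in shell. This is an illustration only, not the agent's actual code. The function name lookup_ora_action is hypothetical, and the sketch assumes that each entry has the form code:ACTION, which should be verified against the oraerror.dat shipped with your release. Treating an unlisted code as FAILOVER is also an assumption made here for simplicity. A POSIX shell such as ksh or bash is assumed.

```shell
# Hypothetical helper (not part of VCS): print the action configured for
# an ORA code, falling back to FAILOVER when no usable entry is found.
# ASSUMPTION: entries look like "code:ACTION"; check your oraerror.dat.
lookup_ora_action() {
    code=$1
    file=${2:-/opt/VRTSagents/ha/bin/Oracle/oraerror.dat}

    # Missing or unreadable file: use the default action.
    if [ ! -r "$file" ]; then
        echo FAILOVER
        return 0
    fi

    # Comment and blank lines never match "^code:", so the first
    # matching line is taken as the entry for this code.
    action=$(grep "^${code}:" "$file" | head -n 1 | cut -d: -f2)

    # Empty file or no entry for this code: default action
    # (ASSUMPTION for unlisted codes).
    echo "${action:-FAILOVER}"
}
```

For example, lookup_ora_action 28002 prints the action found for ORA-28002 in the default file, or FAILOVER if the file cannot be read.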
On a Solaris system we can check the gcore of the OracleAgent process to confirm if the oraerror.dat was loaded by the agent when it started. For example, in the 5.1GA oraerror.dat file, there are 187 entries.
# egrep -v '^$|^#|}' oraerror.dat | wc -l
187
Checking the gcore of the OracleAgent process, the data structure "token_data" stores the in-memory oraerror.dat. The 7th field is the number of entries in the oraerror.dat file plus 1.
# ps -ef |grep OracleAge
root 2946 1 1 13:02:43 ? 0:00 /opt/VRTSagents/ha/bin/Oracle/OracleAgent -type Oracle -agdir /opt/VRTSagents/h
# gcore 2946
gcore: core.2946 dumped
# mdb /opt/VRTSagents/ha/bin/Oracle/OracleAgent core.2946
mdb> *token_data/20D
0x9ecd0: 32 789216 0 815576 815672
668792 188 16961 1162824517 1330332928 <<< 187 + 1 = 188
121 1634299438 1112492800 100 842019121
758133805 808525873 859451442 976499488 542852678
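To know in advance which value to look for in the 7th field, the entry count can be turned into the expected number directly. The following is a minimal sketch. The function name expected_token_count is hypothetical, and the awk filter is intended to match the same lines as the egrep command shown earlier (it skips blank lines, comment lines, and lines containing '}').

```shell
# Hypothetical helper (not part of VCS): print the value expected in the
# 7th field of token_data, i.e. the number of oraerror.dat entries plus 1.
expected_token_count() {
    file=${1:-/opt/VRTSagents/ha/bin/Oracle/oraerror.dat}

    # Without a readable file there is nothing to count.
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 1
    fi

    # Count lines that are not blank, not comments, and contain no '}',
    # then add 1 to match what the agent stores in memory.
    awk '!/^$/ && !/^#/ && !/}/ { n++ } END { print n + 1 }' "$file"
}
```

For the 187-entry file in the example above, this would print 188, which is the value flagged in the mdb output.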
If, for any reason, the oraerror.dat file was not loaded by the OracleAgent when it started, first make sure that the file exists, is readable, and contains valid entries, and then restart the Oracle agent.
# pwd
/opt/VRTSagents/ha/bin/Oracle
# ls -l oraerror.dat
-rwxr--r-- 1 root root 3615 Apr 1 13:02 oraerror.dat
# haagent -stop Oracle -force -sys <system_name>
# haagent -start Oracle -sys <system_name>
Check the gcore of the OracleAgent process again to make sure that the oraerror.dat entries are loaded successfully.
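The checks to make before restarting the agent (the file exists, is readable, and contains entries) can also be scripted. This is a sketch under stated assumptions. The function name check_oraerror_dat is hypothetical, and "contains entries" is approximated as "has at least one line that is not blank, not a comment, and does not contain '}'". It does not validate the content of each entry.

```shell
# Hypothetical pre-restart check (not part of VCS): succeed only if
# oraerror.dat exists, is readable, and has at least one candidate entry.
check_oraerror_dat() {
    file=${1:-/opt/VRTSagents/ha/bin/Oracle/oraerror.dat}

    if [ ! -f "$file" ]; then
        echo "missing: $file" >&2
        return 1
    fi
    if [ ! -r "$file" ]; then
        echo "not readable: $file" >&2
        return 1
    fi

    # awk exits 0 as soon as one candidate entry line is seen, 1 otherwise.
    # ASSUMPTION: any non-blank, non-comment line without '}' is an entry.
    if ! awk '!/^$/ && !/^#/ && !/}/ { found = 1; exit } END { exit !found }' "$file"; then
        echo "no entries found: $file" >&2
        return 1
    fi
    return 0
}
```

A typical use is to run check_oraerror_dat first and restart the agent only if it succeeds.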
Oracle agent returned OFFLINE even though the Oracle error (ORA-XXXXX) should be ignored according to oraerror.dat