VCS Agent for Sybase - Sybase dataserver killed by SIGPIPE because the online script exits before Sybase dataserver writes the startup messages
Article ID: 100001508
Description
Error Message
2010/04/06 23:43:25 VCS NOTICE V-16-1-10301 Initiating Online of Resource SybDB (Owner: unknown, Group: asm filesystem) on System im001
2010/04/06 23:45:37 VCS ERROR V-16-2-13066 (im001) Agent is calling clean for resource(SybDB) because the resource is not up even after online completed.
Resolution
The online script in the Veritas Cluster Server (VCS) Agent for Sybase does not wait for the Sybase dataserver to print its startup messages; it exits first. The pipe that the dataserver writes to then has no reader, so the operating system sends the "broken pipe" signal (SIGPIPE) to the dataserver when it attempts the write. The default action for SIGPIPE is to terminate the process.
The following Solaris truss output shows the problem.
The online script is executed by the Sybase Agent. (Note that in this example the Sybase resource is configured inside a Solaris zone, so zlogin is used. If a Solaris zone is not used, the online script is called directly.)
28969/3: 7.1088 execve("/usr/sbin/zlogin", 0xFEC7811C, 0x0002D694) argc = 61
28969/1: argv: /usr/sbin/zlogin db2testzone /bin/sh -c '
28969/1: CLUSTER_LOGDBG=""; export CLUSTER_LOGDBG;
28969/1: VCS_LOG_AGENT_NAME="Sybase"; export VCS_LOG_AGENT_NAME;
28969/1: CLUSTER_HOME="/opt/VRTSvcs"; export CLUSTER_HOME;
28969/1: VCS_AGFW="1"; export VCS_AGFW; VCS_CONF="/etc/VRTSvcs";
28969/1: export VCS_CONF; VCS_HOME="/opt/VRTSvcs"; export VCS_HOME;
28969/1: VCS_LOG="/var/VRTSvcs"; export VCS_LOG;
28969/1: cd /opt/VRTSagents/ha/bin/Sybase;
28969/1: "/opt/VRTSagents/ha/bin/Sybase/online" "SybEQIQ-DB" "Server" <<< online script is called
28969/1: "1" "SYBUS_EQIQ1" "Owner" "1" "sybase" "Home" "1"
28969/1: "/cfs/fs10/qa/eqiq_local/ASE15" "Version" "1" "15.5" "SA" "1"
28969/1: "sa" "SApswd" "1" "XXXXXX" "User" "1" "" "UPword"
28969/1: "1" "" "Db" "1" "" "Table" "1" "" "Monscript" "1"
28969/1: "/opt/VRTSagents/ha/bin/Sybase/SqlTest.pl" "DetailMonitor"
28969/1: "1" "0" "Run_ServerFile" "1"
28969/1: "/cfs/fs10/qa/eqiq_local/ASE/ASE-15_0/install/RUN_SYBUS_EQIQ1"
28969/1: '
The online script executes the "startserver" command.
28985/1: 7.9447 execve("/cfs/fs10/qa/eqiq_local/ASE15/ASE-15_0/install/startserver", 0x00057784, 0x00057798) argc = 3
28985/1: argv:
28985/1: /cfs/fs10/qa/eqiq_local/ASE15/ASE-15_0/install/startserver -f
28985/1: /cfs/fs10/qa/eqiq_local/ASE/ASE-15_0/install/RUN_SYBUS_EQIQ1
The "startserver" command then starts the "dataserver".
28993/1: 8.1777 execve("/cfs/fs10/qa/eqiq_local/ASE15/ASE-15_0/bin/dataserver", 0x0003ACC4, 0x0003ACE0) argc = 6
28993/1: argv: /cfs/fs10/qa/eqiq_local/ASE15/ASE-15_0/bin/dataserver
28993/1: -sSYBUS_EQIQ1 -d/cfs/fs10/qa/eqiq_local/ASE15/data/master.dat
28993/1: -e/cfs/fs10/qa/eqiq_local/ASE15/log/SYBUS_EQIQ1.log
28993/1: -c/cfs/fs10/qa/eqiq_local/ASE15/ASE-15_0/SYBUS_EQIQ1.cfg
28993/1: -M/cfs/fs10/qa/eqiq_local/ASE15/ASE-15_0
Note that the online script process (process ID 28969) exits before the "dataserver" process (process ID 28993) writes the startup messages.
28969/1: 8.2216 _exit(10) <<< online script exits
28993/1: 8.4279 write(1, " 0 0 : 0 0 :0 0 0 0 0 :".., 124) Err#32 EPIPE <<< dataserver tries to write the startup messages
28993/1: 8.4281 Received signal #13, SIGPIPE [default] <<< dataserver gets SIGPIPE and the default SIGPIPE handler is to exit the program
The dataserver process receives the SIGPIPE signal and exits. As a result, the dataserver cannot start properly.
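The sequence in the truss output can be reproduced in miniature on any POSIX system. The following is only an illustrative sketch, not code from the agent: the child process stands in for the dataserver, the parent stands in for the online script, and the names `CHILD_CODE` and `run_writer_after_reader_exit` are hypothetical. The child restores the default SIGPIPE action explicitly because the Python interpreter ignores SIGPIPE at startup.

```python
import os
import signal
import subprocess
import sys

# Stand-in for the dataserver: wait briefly, then write a "startup message"
# to stdout. Python ignores SIGPIPE at startup, so the child restores the
# operating system default (terminate the process) before writing.
CHILD_CODE = r"""
import os, signal, time
signal.signal(signal.SIGPIPE, signal.SIG_DFL)
time.sleep(0.5)
os.write(1, b"startup message\n")
"""


def run_writer_after_reader_exit():
    """Start the writer, drop the only reader of its pipe, return its exit status."""
    read_fd, write_fd = os.pipe()
    child = subprocess.Popen([sys.executable, "-c", CHILD_CODE], stdout=write_fd)
    os.close(write_fd)  # the parent keeps only the read end
    os.close(read_fd)   # the reader goes away before the child writes
    # subprocess reports death by signal N as a return code of -N.
    return child.wait()


if __name__ == "__main__":
    status = run_writer_after_reader_exit()
    print("killed by SIGPIPE:", status == -signal.SIGPIPE)
```

The child's write fails with EPIPE and the process is terminated by SIGPIPE, which mirrors the `Err#32 EPIPE` and `Received signal #13, SIGPIPE [default]` lines in the truss output.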
The problem is already fixed in the VCS 5.1 release of the Sybase Agent. For VCS 5.0, the problem will be fixed in the next 5.0MP3 Rolling Patch release. Please check the release notes of that Rolling Patch for the fix, which is tracked by the Etrack incident mentioned in the Supplemental Material section of this article.
The new version of the online script adds a resource attribute, "WaitForRecovery". When "WaitForRecovery" is enabled, the online script keeps checking the recovery state of the database instance until a user-specified timeout expires. This gives the dataserver enough time to print the startup messages successfully. After the startup messages are printed, the dataserver changes its SIGPIPE handling to ignore the signal.
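The "check until a timeout" behaviour described above follows a common polling pattern. The sketch below illustrates that pattern only; it is not the agent's implementation, and the function name `wait_for_recovery`, its parameters, and the `is_recovered` callback are all hypothetical. How the real online script queries the recovery state is not described in this article.

```python
import time


def wait_for_recovery(is_recovered, timeout_seconds, poll_interval=1.0):
    """Poll is_recovered() until it returns True or the timeout expires.

    is_recovered is a hypothetical callback that reports whether the
    database instance has finished recovery. Returns True if recovery was
    observed in time, False if the timeout expired first.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        if is_recovered():
            return True
        if time.monotonic() >= deadline:
            return False
        # Staying alive here is what keeps the reader of the pipe present
        # while the server is still starting up.
        time.sleep(poll_interval)
```

A caller would pass its own check, for example `wait_for_recovery(check_instance, timeout_seconds=300)`, where `check_instance` is whatever probe is appropriate for the environment.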
Until the fix is available, a temporary workaround is to delay the exit of the online script. Add the following Perl statement, sleep(30), before the "exit 10;" statement in the online script.
/opt/VRTSagents/ha/bin/Sybase/online:
....
sleep(30); # added a sleep(30) statement to sleep 30 seconds and allow the dataserver to print the startup messages
#
# Delay first monitor by 10 seconds
#
exit 10;
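Why the delay helps can be illustrated with a small POSIX experiment. This is a sketch under stated assumptions, not agent code: the child stands in for the dataserver, the parent for the online script, and `CHILD_CODE`, `run_writer`, and `reader_delay_seconds` are hypothetical names. The delays are scaled down from the 30 seconds used in the workaround.

```python
import os
import signal
import subprocess
import sys
import time

# Stand-in for the dataserver: restore the default SIGPIPE action (Python
# ignores SIGPIPE at startup), wait briefly, then write a "startup message".
CHILD_CODE = r"""
import os, signal, time
signal.signal(signal.SIGPIPE, signal.SIG_DFL)
time.sleep(0.5)
os.write(1, b"startup message\n")
"""


def run_writer(reader_delay_seconds):
    """Keep the pipe's reader open for the given time, then close it.

    Returns the child's exit status: 0 if the write succeeded, or the
    negative signal number if the child was killed by a signal.
    """
    read_fd, write_fd = os.pipe()
    child = subprocess.Popen([sys.executable, "-c", CHILD_CODE], stdout=write_fd)
    os.close(write_fd)
    time.sleep(reader_delay_seconds)  # plays the role of sleep(30) in the online script
    os.close(read_fd)
    return child.wait()


if __name__ == "__main__":
    print("no delay, killed by SIGPIPE:", run_writer(0) == -signal.SIGPIPE)
    print("with delay, exited normally:", run_writer(3) == 0)
```

A fixed delay only works if the writer finishes its output within that time, which is why the workaround is described as temporary.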