Output of ha commands such as hastatus, halog, hares and hagrp:
# /opt/VRTS/bin/hastatus -sum
VCS ERROR V-16-1-10600 Cannot connect to VCS engine
Sample stack trace of one of the hares commands (from core dump):
==== user (LWP_SYS) thread: 0x30001932d00 PID: 6401 ====
cmd: hares -state IP_wMPortal3 -sys tlxkswmbrkr1
t_wchan: 0x3001e89109a sobj: condition var (from genunix:poll_common+0x4e8)
[...]
idle: 54934992 ticks (6 days 8 hours 35 minutes 49.92 seconds)
[...]
genunix:cv_wait_sig_swap_core+0x130(, , 0x0)
genunix:cv_waituntil_sig(0x3001e89109a, 0x3001e891060, 0x0) - frame recycled
genunix:poll_common+0x4e8(0xffbf9360?, 0x1, 0x0, 0x0, , 0x60017cf5310)
genunix:pollsys+0xf8(, 0x1)
unix:syscall_trap32+0xcc()
-- switch to user thread's user stack --
This issue is tracked via Symantec internal incident e2416758.
This issue occurs only when the operating system runs out of file descriptors. Use the ulimit command to determine the per-process limit on open file descriptors:
# ulimit -n
1024
#
The VCS engine process, "had", can accept only a limited number of connections from clients. This limit (FD_SETSIZE) is determined by the operating system. However, the accept system call can return a file descriptor greater than the limit. In that case, "had" cannot process the file descriptor using the select system call. As a result, "had" goes into an unrecoverable loop.
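The relationship between the descriptor limit and FD_SETSIZE can be checked programmatically. The following is a minimal sketch, not VCS code: the helper names are hypothetical, and it assumes a POSIX system where the value printed by ulimit -n is the soft RLIMIT_NOFILE limit. A process whose soft limit is greater than FD_SETSIZE can be handed descriptor numbers that select cannot monitor.

```c
#include <stdbool.h>
#include <sys/resource.h>  /* getrlimit, RLIMIT_NOFILE, RLIM_INFINITY */
#include <sys/select.h>    /* FD_SETSIZE */

/*
 * Hypothetical helper: with a soft limit of N, the highest descriptor a
 * process can receive is N - 1, while select() only handles descriptors
 * 0 .. FD_SETSIZE - 1. A limit above FD_SETSIZE therefore permits
 * descriptors that select() cannot monitor.
 */
static bool limit_exceeds_fd_setsize(rlim_t soft_limit)
{
    return soft_limit == RLIM_INFINITY || soft_limit > (rlim_t)FD_SETSIZE;
}

/*
 * Hypothetical helper: applies the check to the calling process.
 * Returns false if the limit cannot be read.
 */
static bool nofile_limit_exceeds_fd_setsize(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        return false;
    return limit_exceeds_fd_setsize(lim.rlim_cur);
}
```

The value of FD_SETSIZE differs between platforms and between 32-bit and 64-bit builds, so the check is only meaningful when compiled the same way as the process being examined.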
Veritas has fixed the code so that "had" closes any file descriptor that is greater than FD_SETSIZE. This prevents the "had" process from going into an unrecoverable loop. The fix is available in the following patch releases:
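The general technique behind this kind of fix can be sketched in C as follows. This is an illustration only, not the actual "had" source: the function names are hypothetical, and the sketch assumes a select-based server on a POSIX system. Note that select can only handle descriptors in the range 0 to FD_SETSIZE - 1, so the sketch also rejects a descriptor equal to FD_SETSIZE.

```c
#include <stdbool.h>
#include <stddef.h>       /* NULL */
#include <sys/select.h>   /* FD_SETSIZE */
#include <sys/socket.h>   /* accept */
#include <unistd.h>       /* close */

/* True if fd can safely be stored in an fd_set and passed to select(). */
static bool fd_fits_select(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE;
}

/*
 * Hypothetical helper: accept a client connection, but close it at once
 * if the new descriptor is too large to be monitored with select().
 * Returns the descriptor, or -1 if accept() failed or the connection
 * was rejected.
 */
static int accept_for_select(int listen_fd)
{
    int fd = accept(listen_fd, NULL, NULL);

    if (fd < 0)
        return -1;              /* accept() itself failed */
    if (!fd_fits_select(fd)) {
        close(fd);              /* cannot be monitored; drop the client */
        return -1;
    }
    return fd;
}
```

A server using this pattern refuses the extra client instead of keeping a connection it can never service, which avoids repeatedly retrying an unusable descriptor.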
Solaris (SPARC and x86 platforms), AIX and Linux
VRTSvcs 5.1SP1RP2 and 6.0
HP-UX
VRTSvcs 5.1SP1RP1 and 6.0
Please visit the SORT website to obtain the latest patch.
Applies To
This issue is applicable to all versions of VCS and OS (Operating System) platforms.
The CPU consumption of the VCS (Veritas Cluster Server) engine process, "had", is very high. The "had" process does not respond to any HA command. HA commands such as hastatus, halog, hares and hagrp return "Cannot connect to VCS engine", and their threads hang in the pollsys() system call.
ETrack: 2416758 ETrack: 2416842