Cluster Server system running 7.1 on AIX crashes with GAB client process error

Article ID: 100044315

Description

Error Message

GAB: Port h halting system due to client process failure at [14:1106]
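When checking console or system logs for this condition, the message can be matched mechanically. The following is a minimal sketch only: the pattern is modeled on the single message quoted above, and the helper name find_gab_halts is hypothetical and not part of any VCS tooling. Other GAB message variants may be worded differently.

```python
import re

# Pattern modeled on the one message quoted in this article (assumption:
# the port is a single letter or digit immediately after "Port").
GAB_HALT_RE = re.compile(
    r"GAB: Port (?P<port>\w) halting system due to client process failure"
)


def find_gab_halts(lines):
    """Return the GAB port identifier from every matching halt message."""
    ports = []
    for line in lines:
        match = GAB_HALT_RE.search(line)
        if match:
            ports.append(match.group("port"))
    return ports


# Sample input: the message from this article plus an unrelated line.
sample = [
    "GAB: Port h halting system due to client process failure at [14:1106]",
    "some unrelated log line",
]
halted_ports = find_gab_halts(sample)
print(halted_ports)
```

A match on port h points at the VCS engine ('had') as the failed client, which is the case described in this article.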
 

Cause

VCS deliberately panicked the system after its main daemon, 'had', became unresponsive.

Closer inspection of the associated crash dump showed the following stacks for the two 'had' threads:

 

(6)> th *|grep had
pvthread+014600  326 had      SLEEP 1460027 002 1024         0  vmmary_mpfs+040A78
pvthread+091C00 2332 had      SLEEP 11C0147 03C 1024         0  pvproc+0149B8

(6)> f 326
pvthread+014600 STACK:
[00074BDC].backt+000000 ()
[006E5A58]vm_forkcopyx+000798 (??, ??, ??, ??, ??, ??)
[006E5E94]vm_forkcopy+000034 (??, ??, ??, ??, ??, ??)
[0029A820]procdup+0000E0 (??)
[0026D5C4]fork_common+000404 (??, ??, ??)
[0026D118]kforkx+000018 ()
[004DA7D0]kfork+000084 ()
[kdb_get_virtual_memory] no real storage @ F0000000300195A0
[kdb_read_mem] no real storage @ FFFFFFFFFFF92D0

(6)> f 2332
pvthread+091C00 STACK:
[000C024C]ep_sleep+0000EC (00000000000F2BE0, A000000000009032 [??])
[000BC5CC]acquire_forkstack_com+00022C (??, ??, ??, ??)
[000BC090]kexit+000010 ()
[000E3A64]issig+000584 (??, ??)
[000F2FC8]sig_deliver+0003E8 (??, ??)
[FFFFFFFFFFFFFFFB]FFFFFFFFFFFFFFFB ()
[kdb_read_mem] no real storage @ FFFFFFFFFFF92D0
 

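Because both threads were executing in operating system kernel routines (one in the fork path, the other in the exit path waiting on a fork-related resource), the matter was referred to IBM, the OS vendor, for comment.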
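The observation that both threads were inside fork-related kernel routines can be reproduced from the kdb output with a short script. This is a rough sketch under stated assumptions: the frame layout ([address]symbol+offset, with an optional leading dot) is inferred only from the stacks quoted in this article, and the names frame_symbols and fork_related are hypothetical helpers, not kdb features.

```python
import re

# Frame layout assumed from the kdb output in this article:
#   [ADDRESS]symbol+OFFSET (args)
# An optional "." may precede the symbol name.
FRAME_RE = re.compile(r"^\[[0-9A-Fa-f]+\]\.?(?P<symbol>\w+)\+[0-9A-Fa-f]+")


def frame_symbols(stack_text):
    """Extract the symbol name from each stack frame line, in order."""
    symbols = []
    for line in stack_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            symbols.append(match.group("symbol"))
    return symbols


def fork_related(symbols):
    """Keep only symbols whose name contains 'fork' (a crude heuristic)."""
    return [name for name in symbols if "fork" in name]


# Stack of thread slot 2332, copied from the crash dump output above.
stack_2332 = """\
pvthread+091C00 STACK:
[000C024C]ep_sleep+0000EC (00000000000F2BE0, A000000000009032 [??])
[000BC5CC]acquire_forkstack_com+00022C (??, ??, ??, ??)
[000BC090]kexit+000010 ()
[000E3A64]issig+000584 (??, ??)
[000F2FC8]sig_deliver+0003E8 (??, ??)
[FFFFFFFFFFFFFFFB]FFFFFFFFFFFFFFFB ()
[kdb_read_mem] no real storage @ FFFFFFFFFFF92D0
"""

symbols_2332 = frame_symbols(stack_2332)
print(symbols_2332)
print(fork_related(symbols_2332))
```

Lines that do not follow the assumed frame layout, such as the kdb_read_mem diagnostics and the raw address frame, are skipped rather than guessed at.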

Resolution

IBM responded that the system had suffered from a lack of memory, but did not elaborate further. The customer was advised to configure additional memory.

Issue/Introduction

Cluster Server (VCS) system running 7.1 on AIX crashes with GAB client process error