GAB initiated panic on AIX with low system utilization
Article ID: 100023102
Resolution
In the errlog, the following entry is found:
---------------------------------------------------------------------------
LABEL: KERNEL_PANIC
IDENTIFIER: 225E3B63
Date/Time: Mon Nov 3 12:13:10 2008
Sequence Number: 1074016
Machine Id: 00344C0C4C00
Node Id: xxxxxxx
Class: S
Type: TEMP
Resource Name: PANIC
Description
SOFTWARE PROGRAM ABNORMALLY TERMINATED
Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES
Detail Data
ASSERT STRING
PANIC STRING
GAB: Port h halting system due to client process failure
---------------------------------------------------------------------------
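As a rough sketch, entries like the one above can be counted in a detailed error report. The function name count_gab_panics is hypothetical, and the pattern assumes the panic string is worded as shown in the entry, with any single port letter.

```shell
# Sketch: count GAB-initiated panic entries in error-report text read from stdin.
# The pattern matches the PANIC STRING line shown above; "." stands for the port letter.
count_gab_panics() {
    grep -c 'GAB: Port . halting system due to client process failure'
}
```

On AIX, the report could be piped in with, for example, errpt -a -j 225E3B63 | count_gab_panics, using the KERNEL_PANIC identifier from the entry above.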
The stack trace of the thread that caused the panic is:
(3)> f
pvthread+00DC00 STACK:
[00021B50].panic_trap+000000 ()
[0896632C]gab_halt+000080 (??)
[0896632C]gab_halt+000080 (??)
[08961B6C]gab_kill_process+0000A8 (??)
[08958D58]gab_timerscan+00032C (??)
[089547E0]gab_timeout_daemon+000080 (??)
[000FF624]procentry+000010 (??, ??, ??, ??)
This panic usually occurs when the system is under heavy load and HAD, which runs in user space, does not get CPU cycles in time to respond to GAB, which runs in kernel space.
On AIX, however, HAD has also been seen to be paged out while the system is more or less idle.
To avoid this, change the lru_file_repage setting from 1 to 0.
When lru_file_repage is set to 1, both computational and non-computational (file) pages can be paged out. This includes the pages of HAD, and a paged-out HAD can easily miss its response to a GAB heartbeat.
To check the current value of lru_file_repage:
# vmo -o lru_file_repage
To change the value of lru_file_repage:
# vmo -p -o lru_file_repage=0
Use lru_file_repage=1 instead to re-enable it.
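The check can also be scripted. The following is a minimal sketch, not a supported tool: the function names are hypothetical, and it assumes that vmo -o prints the tunable in the form "lru_file_repage = 1".

```shell
# Print the numeric value from a line such as "lru_file_repage = 1".
# Assumption: vmo -o reports the tunable in "name = value" form.
parse_vmo_value() {
    printf '%s\n' "$1" | sed -n 's/^[[:space:]]*lru_file_repage[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*$/\1/p'
}

# Print "ok" if the tunable is already 0, "change-needed" if it is 1,
# and "unknown" otherwise (for example, if vmo is unavailable or its
# output has an unexpected form).
check_lru_file_repage() {
    out=$(vmo -o lru_file_repage 2>/dev/null) || out=""
    case "$(parse_vmo_value "$out")" in
        0) echo "ok" ;;
        1) echo "change-needed" ;;
        *) echo "unknown" ;;
    esac
}
```

If check_lru_file_repage prints change-needed, run the vmo change command shown above as root. The sketch only reports the current state and does not change the tunable itself.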
Issue/Introduction
GAB initiated panic on AIX with low system utilization