Kernel panic - not syncing: GAB: Port h halting system due to client process failure on Infoscale 7.3.1 and RedHat 7

book

Article ID: 100062829

calendar_today

Updated On:

Description

Error Message

The machine below shows an extract from its panic and resulting crash dump:

KERNEL: ./vmlinux [TAINTED]
DUMPFILE: ./vmcore.flat [PARTIAL DUMP]
CPUS: 40
DATE: Thu Dec 14 20:06:35 EST 2023
UPTIME: 693 days, 08:55:22
LOAD AVERAGE: 280.72, 201.75, 117.37
TASKS: 4153
NODENAME: xxxxxxxxx
RELEASE: 3.10.0-693.21.1.el7.x86_64
VERSION: #1 SMP Fri Feb 23 18:54:16 UTC 2018
MACHINE: x86_64 (2200 Mhz)
MEMORY: 255.7 GB
PANIC: "Kernel panic - not syncing: GAB: Port h halting system due to client process failure at [14:1120]"
PID: 7242
COMMAND: "StreamServer"
TASK: ffff882b31a12f70 [THREAD_INFO: ffff880065990000]
CPU: 31
STATE: TASK_RUNNING (PANIC)

 

Cause

The panic string:

PANIC: "Kernel panic - not syncing: GAB: Port h halting system due to client process failure at [14:1120]"
 

points to the Cluster Server had daemon becoming unresponsive.

Looking at what the had daemon was doing, there are 2 threads:

crash> ps|grep had
7066 34577 24 ffff881745936eb0 UN 0.0 223692 20404 had
34577 1 2 ffff881fbd804f10 UN 0.0 223692 131704 had
34581 1 1 ffff883fb4efdee0 IN 0.0 131816 1148 hashadow
crash>

PID: 7066 TASK: ffff881745936eb0 CPU: 24 COMMAND: "had"
#0 [ffff8820402438b8] __schedule at ffffffff816b3ac4
#1 [ffff882040243948] schedule at ffffffff816b40e9
#2 [ffff882040243958] schedule_timeout at ffffffff816b1a49
#3 [ffff882040243a08] io_schedule_timeout at ffffffff816b35ed
#4 [ffff882040243a38] io_schedule at ffffffff816b3688
#5 [ffff882040243a48] bit_wait_io at ffffffff816b2071
#6 [ffff882040243a60] __wait_on_bit_lock at ffffffff816b1d91
#7 [ffff882040243aa0] __lock_page_killable at ffffffff81186594
#8 [ffff882040243af8] generic_file_aio_read at ffffffff811887c5
#9 [ffff882040243bc8] xfs_file_buffered_aio_read at ffffffffc02a6c55 [xfs]

 

PID: 34577 TASK: ffff881fbd804f10 CPU: 2 COMMAND: "had"
#0 [ffff881f54f677d8] __schedule at ffffffff816b3ac4
#1 [ffff881f54f67868] schedule at ffffffff816b40e9
#2 [ffff881f54f67878] schedule_timeout at ffffffff816b1a49
#3 [ffff881f54f67920] wait_for_completion at ffffffff816b449d
#4 [ffff881f54f67980] xfs_buf_submit_wait at ffffffffc02a2bc9 [xfs]
#5 [ffff881f54f679a8] _xfs_buf_read at ffffffffc02a2cc3 [xfs]
#6 [ffff881f54f679b8] xfs_buf_read_map at ffffffffc02a2dd9 [xfs]
#7 [ffff881f54f67a00] xfs_trans_read_buf_map at ffffffffc02d2b99 [xfs]
#8 [ffff881f54f67a40] xfs_imap_to_bp at ffffffffc02967ee [xfs]
#9 [ffff881f54f67aa8] xfs_iread at ffffffffc0296f6e [xfs]
#10 [ffff881f54f67b10] xfs_iget at ffffffffc02ac2d6 [xfs]
#11 [ffff881f54f67bb0] xfs_lookup at ffffffffc02b50a7 [xfs]
#12 [ffff881f54f67bf8] xfs_vn_lookup at ffffffffc02b1cfb [xfs]

 

Both threads are held in RedHat's XFS filesystem-related code.

 

Resolution

Please note that this article references sites not owned or maintained by Veritas and, as such, Veritas is not responsible for the content portrayed on such sites, including any revisions to or deletions of content or third-party software on which this article relies. User is responsible for conducting all necessary due diligence prior to following the instructions described in this article.

This issue needs to be referred to the operating system vendor for comment on why threads are hanging within the RedHat XFS layer.

 

Issue/Introduction

Kernel panic - not syncing: GAB: Port h halting system due to client process failure