Continuous system crashes occur during VxVM installation on the latest SLES15 SP3 minor kernel versions


Article ID: 100053817


Description

Error Message

A kernel panic is observed with the following stack trace:

crash> bt

PID: 3353   TASK: ffff8bd79970c000  CPU: 6   COMMAND: "vxconfigd"

#0 [ffffa479c21cf500] machine_kexec at ffffffffb1a6fde1

#1 [ffffa479c21cf558] __crash_kexec at ffffffffb1b598cd

#2 [ffffa479c21cf620] crash_kexec at ffffffffb1b5a78d

#3 [ffffa479c21cf638] oops_end at ffffffffb1a36d3f

#4 [ffffa479c21cf658] no_context at ffffffffb1a82bbf

#5 [ffffa479c21cf6c0] do_page_fault at ffffffffb1a83e40

#6 [ffffa479c21cf6f0] page_fault at ffffffffb240130e

    [exception RIP: bfq_bio_bfqg+37]

    RIP: ffffffffb1e78135  RSP: ffffa479c21cf7a0  RFLAGS: 00010002

    RAX: 000000000000001f  RBX: 0000000000000000  RCX: ffffa479c21cf860

    RDX: ffff8bd779775000  RSI: ffff8bd795b2fa00  RDI: ffff8bd795b2fa00

    RBP: ffff8bd78f136000   R8: 0000000000000000   R9: ffff8bd793a5b800

    R10: ffffa479c21cf828  R11: 0000000000001000  R12: ffff8bd7796b6e60

    R13: ffff8bd78f136000  R14: ffff8bd795b2fa00  R15: ffff8bd7946ad0bc

    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018

#7 [ffffa479c21cf7b0] bfq_bic_update_cgroup at ffffffffb1e78458

#8 [ffffa479c21cf7e8] bfq_bio_merge at ffffffffb1e6f47f

#9 [ffffa479c21cf840] blk_mq_submit_bio at ffffffffb1e48c09

#10 [ffffa479c21cf8c8] submit_bio_noacct at ffffffffb1e3c7e3

#11 [ffffa479c21cf958] submit_bio at ffffffffb1e3c87b

#12 [ffffa479c21cf9a8] submit_bh_wbc at ffffffffb1d2536a

#13 [ffffa479c21cf9e0] block_read_full_page at ffffffffb1d27ac1

#14 [ffffa479c21cfa90] do_read_cache_page at ffffffffb1c2f7e5

#15 [ffffa479c21cfb48] read_part_sector at ffffffffb1e546b5

#16 [ffffa479c21cfb60] read_lba at ffffffffb1e595d2

#17 [ffffa479c21cfba8] efi_partition at ffffffffb1e59f4d

#18 [ffffa479c21cfcb8] blk_add_partitions at ffffffffb1e54377

#19 [ffffa479c21cfcf8] bdev_disk_changed at ffffffffb1d2a8fa

#20 [ffffa479c21cfd30] __blkdev_get at ffffffffb1d2c16c

#21 [ffffa479c21cfda0] blkdev_get at ffffffffb1d2c2b4

#22 [ffffa479c21cfdb8] __device_add_disk at ffffffffb1e5107e

#23 [ffffa479c21cfe20] dmp_register_disk at ffffffffc0e68ae7 [vxdmp]

#24 [ffffa479c21cfe50] dmp_reconfigure_db at ffffffffc0e8d8bd [vxdmp]

#25 [ffffa479c21cfe80] dmpioctl at ffffffffc0e75cd5 [vxdmp]

#26 [ffffa479c21cfe90] dmp_ioctl at ffffffffc0e9d469 [vxdmp]

#27 [ffffa479c21cfea8] blkdev_ioctl at ffffffffb1e4ed19

#28 [ffffa479c21cfef0] block_ioctl at ffffffffb1d2a719

#29 [ffffa479c21cfef8] ksys_ioctl at ffffffffb1cfb262

#30 [ffffa479c21cff30] __x64_sys_ioctl at ffffffffb1cfb296

#31 [ffffa479c21cff38] do_syscall_64 at ffffffffb1a0538b

#32 [ffffa479c21cff50] entry_SYSCALL_64_after_hwframe at ffffffffb240008c

    RIP: 00007fbd17fc1c47  RSP: 00007ffebfe27468  RFLAGS: 00000206

    RAX: ffffffffffffffda  RBX: 0000000000000001  RCX: 00007fbd17fc1c47

    RDX: 0000000000c606d0  RSI: 00000000444d5040  RDI: 0000000000000006

    RBP: 0000000000000006   R8: 0000000000000000   R9: 006f6b2e7061706d

    R10: 0000000000000000  R11: 0000000000000206  R12: 0000000000000001

    R13: 0000000000857c38  R14: 00007ffebfe28aac  R15: 0000000000818927

    ORIG_RAX: 0000000000000010  CS: 0033  SS: 002b

 

Cause

Recent changes to the BFQ disk I/O scheduler in SLES15 SP3 minor kernel updates introduced a NULL pointer dereference bug in the kernel code. The bug causes a panic during VxVM installation; in the stack trace above, the fault is raised in bfq_bio_bfqg.

A system that uses the BFQ disk I/O scheduler hits this issue on SLES15 SP3 minor kernel version 5.3.18-150300.59.68.1 and above.

 

Resolution

A fix for the issue is in progress. The workaround is to change the disk I/O scheduler to MQ-DEADLINE (mq-deadline).
The steps to change the I/O scheduler on SLES15 SP3 are described at https://documentation.suse.com/sles/15-SP3/html/SLES-all/cha-tuning-io.html#cha-tuning-io-switch
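
As a rough illustration of the workaround, the sketch below lists block devices whose active scheduler is bfq and, when asked to, switches them to mq-deadline at runtime. It is not taken from the SUSE or InfoScale documentation. It assumes the standard Linux sysfs layout, where /sys/block/<device>/queue/scheduler lists the available schedulers and marks the active one in square brackets. The function names and the dry_run parameter are hypothetical, chosen for this example only.

```python
import re
from pathlib import Path


def parse_schedulers(text):
    """Return (active, available) from the contents of a queue/scheduler file.

    The kernel lists the available schedulers and wraps the active one in
    square brackets, for example "mq-deadline kyber [bfq] none".
    """
    available = text.replace("[", "").replace("]", "").split()
    match = re.search(r"\[([^\]]+)\]", text)
    active = match.group(1) if match else None
    return active, available


def switch_bfq_to_mq_deadline(sys_block=Path("/sys/block"), dry_run=True):
    """Return the devices that use bfq; switch them unless dry_run is True."""
    changed = []
    if not sys_block.is_dir():
        return changed
    for dev in sorted(sys_block.iterdir()):
        sched_file = dev / "queue" / "scheduler"
        if not sched_file.is_file():
            continue
        active, available = parse_schedulers(sched_file.read_text())
        if active == "bfq" and "mq-deadline" in available:
            if not dry_run:
                # Writing to sysfs requires root privileges.
                sched_file.write_text("mq-deadline")
            changed.append(dev.name)
    return changed


if __name__ == "__main__":
    # Dry run only: report the devices without changing anything.
    for name in switch_bfq_to_mq_deadline(dry_run=True):
        print(f"{name}: bfq is active, mq-deadline is available")
```

Calling switch_bfq_to_mq_deadline(dry_run=False) as root applies the change. A scheduler set through sysfs is generally not kept across reboots, so follow the SUSE documentation linked above to make the setting permanent.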

Issue/Introduction

Continuous system crashes are observed during VxVM installation. The issue is observed with InfoScale 7.4.2/8.0 if both of the following conditions are met:
a. The system runs SLES15 SP3 minor kernel version 5.3.18-150300.59.68.1 or above.
b. The disk I/O scheduler is Budget Fair Queueing (BFQ).
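
Condition (a) can be checked with a small script before installation. The following is a minimal sketch, not a vendor tool. It assumes that the running kernel release (as reported by `uname -r`) has the form 5.3.18-150300.59.68-default, which may omit the last field of the package version. It also assumes that only kernels sharing the 5.3.18-150300 prefix of the version named in this article are relevant. The helper names are hypothetical.

```python
import platform
import re

# First affected kernel version, as stated in this article.
FIRST_AFFECTED = "5.3.18-150300.59.68.1"


def numeric_fields(release):
    """Return the leading numeric fields, stopping at a suffix such as 'default'."""
    fields = []
    for part in re.split(r"[.-]", release):
        if not part.isdigit():
            break
        fields.append(int(part))
    return tuple(fields)


def may_be_affected(release, threshold=FIRST_AFFECTED):
    """Return True if the release is at or above the first affected version.

    Assumption: only releases sharing the first four fields with the
    threshold (5.3.18-150300) are treated as candidates. Fields missing from
    the shorter string are ignored, so the result is a hint, not a verdict.
    """
    rel, thr = numeric_fields(release), numeric_fields(threshold)
    if rel[:4] != thr[:4]:
        return False
    n = min(len(rel), len(thr))
    return rel[:n] >= thr[:n]


if __name__ == "__main__":
    running = platform.release()
    print(f"{running}: may be affected = {may_be_affected(running)}")
```

Because the running release string can be shorter than the package version, a result of True for a kernel such as 5.3.18-150300.59.68-default should be confirmed against the installed kernel package version.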