Server experienced a panic during reboot when attempting to unmount the filesystem.


Article ID: 100066784


Description

Error Message

Panic stack from the crash dump:

[Thu May  2 15:04:45 CEST 2024] XFS (VxVM29008): Unmounting Filesystem

[Thu May  2 15:04:45 CEST 2024] XFS (VxVM29009): Unmounting Filesystem

[Thu May  2 15:04:45 CEST 2024] BUG: scheduling while atomic: umount/2999308/0x00000002

[Thu May  2 15:04:45 CEST 2024] Modules linked in: loop nfsv3 nfs_acl nfs lockd grace fscache netfs vxfen(POE) vxodm(POE) dmpsvc(POE) dmpalua(POE) dmpnvme(POE) gab(POE) vxspec(POE) vxio(POE) vxcafs(POE) vxportal(POE) fdd(POE) vxfs(POE) vxdmp(POE) llt(POE) rdma_cm iw_cm ib_cm amf(POE) ib_core veki(POE) 8021q garp mrp stp llc bonding tls rfkill storageapi(OE) binfmt_misc intel_rapl_msr pmt_crashlog pmt_telemetry pmt_class intel_sdsi intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common intel_ifs nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel ipmi_ssif kvm irqbypass rapl intel_cstate intel_uncore pcspkr mgag200 i2c_algo_bit drm_shmem_helper idxd drm_kms_helper isst_if_mbox_pci isst_if_mmio isst_if_common idxd_bus intel_vsec mei_me syscopyarea sysfillrect acpi_ipmi mei ipmi_si sysimgblt hpilo ipmi_devintf ipmi_msghandler acpi_tad joydev acpi_power_meter vfat fat ext4 mbcache jbd2 auth_rpcgss drm fuse sunrpc xfs libcrc32c sd_mod sg lpfc nvmet_fc nvmet nvme_fc

[Thu May  2 15:04:45 CEST 2024]  crct10dif_pclmul crc32_pclmul nvme_fabrics nvme bnxt_en crc32c_intel tg3 nvme_core ghash_clmulni_intel scsi_transport_fc nvme_common hpwdt t10_pi wmi dm_mod [last unloaded: vxfs]

[Thu May  2 15:04:45 CEST 2024] CPU: 20 PID: 2999308 Comm: umount Kdump: loaded Tainted: P           OE     -------  ---  5.14.0-362.24.1.el9_3.x86_64 #1

[Thu May  2 15:04:45 CEST 2024] Hardware name: HPE ProLiant DL380 Gen11/ProLiant DL380 Gen11, BIOS 1.48 10/19/2023

[Thu May  2 15:04:45 CEST 2024] Call Trace:

[Thu May  2 15:04:45 CEST 2024] 

[Thu May  2 15:04:45 CEST 2024]  dump_stack_lvl+0x34/0x48

[Thu May  2 15:04:45 CEST 2024]  __schedule_bug.cold+0x47/0x53

[Thu May  2 15:04:45 CEST 2024]  schedule_debug.constprop.0+0xc5/0x100

[Thu May  2 15:04:45 CEST 2024]  __schedule+0x48/0x550

[Thu May  2 15:04:45 CEST 2024]  schedule+0x2d/0x70

[Thu May  2 15:04:45 CEST 2024]  schedule_preempt_disabled+0x11/0x20

[Thu May  2 15:04:45 CEST 2024]  __mutex_lock.constprop.0+0x433/0x6a0

[Thu May  2 15:04:45 CEST 2024]  ? iput+0x6a/0x220

[Thu May  2 15:04:45 CEST 2024]  blkdev_get_by_dev.part.0+0x102/0x300

[Thu May  2 15:04:45 CEST 2024]  blkdev_get_by_path+0xb4/0xe0

[Thu May  2 15:04:45 CEST 2024]  amf_plat_fs_attr_cmp_1+0x93/0xb0 [amf]

[Thu May  2 15:04:45 CEST 2024]  ? __pfx_amf_fs_next_mntp+0x10/0x10 [amf]

[Thu May  2 15:04:45 CEST 2024]  amf_fs_multi_notify+0x141/0x3d0 [amf]

[Thu May  2 15:04:45 CEST 2024]  ? __pfx_amf_fson_notify+0x10/0x10 [amf]

[Thu May  2 15:04:45 CEST 2024]  ? __pfx_amf_fson_trigger_all+0x10/0x10 [amf]

[Thu May  2 15:04:45 CEST 2024]  amf_kill_sb+0x13d/0x160 [amf]

[Thu May  2 15:04:45 CEST 2024]  deactivate_locked_super+0x2e/0xa0

[Thu May  2 15:04:45 CEST 2024]  cleanup_mnt+0x100/0x160

[Thu May  2 15:04:45 CEST 2024]  task_work_run+0x59/0x90

[Thu May  2 15:04:45 CEST 2024]  exit_to_user_mode_loop+0x122/0x130

[Thu May  2 15:04:45 CEST 2024]  exit_to_user_mode_prepare+0xb6/0x100

[Thu May  2 15:04:45 CEST 2024]  syscall_exit_to_user_mode+0x12/0x40

[Thu May  2 15:04:45 CEST 2024]  do_syscall_64+0x69/0x90

[Thu May  2 15:04:45 CEST 2024]  ? do_syscall_64+0x69/0x90

[Thu May  2 15:04:45 CEST 2024]  ? handle_mm_fault+0xc5/0x2a0

[Thu May  2 15:04:45 CEST 2024]  ? do_user_addr_fault+0x1d6/0x6a0

[Thu May  2 15:04:45 CEST 2024]  ? exc_page_fault+0x62/0x150

[Thu May  2 15:04:45 CEST 2024]  entry_SYSCALL_64_after_hwframe+0x72/0xdc

[Thu May  2 15:04:45 CEST 2024] RIP: 0033:0x7ffff7d4e82b
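To compare another crash against this signature, a small check along the following lines may help. It is a minimal sketch, not a Veritas tool: the function name `matches_amf_umount_panic` is hypothetical, and it only looks for three strings taken from the stack above (the "scheduling while atomic" BUG in umount, the `amf_kill_sb` frame from the amf module, and the `blkdev_get_by_path` frame).

```shell
# Hypothetical helper: return 0 if the given kernel log file contains the
# signature shown in this article, non-zero otherwise.
matches_amf_umount_panic() {
    log_file=$1
    # All three patterns must be present somewhere in the log.
    grep -q 'BUG: scheduling while atomic: umount/' "$log_file" &&
        grep -q 'amf_kill_sb+.*\[amf\]' "$log_file" &&
        grep -q 'blkdev_get_by_path+' "$log_file"
}
```

For example, `matches_amf_umount_panic vmcore-dmesg.txt && echo "signature matches"`, where the file name is only an illustration. A match suggests the same code path but does not by itself confirm the same root cause.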

 

Cause

A native XFS filesystem (not VxFS) has been created on a VxVM volume, and Veritas AMF (Asynchronous Monitoring Framework) is monitoring the resources.

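Veritas Engineering has identified the potential cause of the panic stack for this incident. The issue occurs because AMF holds a spin lock and then calls a kernel function that either takes a long time or needs to sleep. In the stack above, amf_plat_fs_attr_cmp_1 calls blkdev_get_by_path, which goes on to take a mutex and can therefore sleep; the kernel reports this attempt to sleep in atomic context as "BUG: scheduling while atomic".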

As a workaround, disable AMF-based monitoring for the affected agent.

Veritas Engineering plans to include a fix in an upcoming version. Until then, the workaround is the only available solution.
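To see whether a host has the configuration described in this section, one option is to list the mounted XFS filesystems whose source device is a VxVM volume. The sketch below is an assumption-laden illustration: the function name `list_xfs_on_vxvm` is hypothetical, and it assumes the volumes are mounted by their `/dev/vx/dsk/` device path, so filesystems mounted through another device name would be missed.

```shell
# Hypothetical helper: print "device mountpoint" for every mounted XFS
# filesystem whose source device path starts with /dev/vx/dsk/
# (assumed here to identify a VxVM volume).
# Optional argument: a mounts file to read (defaults to /proc/mounts).
list_xfs_on_vxvm() {
    mounts_file=${1:-/proc/mounts}
    # Fields in /proc/mounts: device, mount point, filesystem type, ...
    awk '$3 == "xfs" && index($1, "/dev/vx/dsk/") == 1 { print $1, $2 }' "$mounts_file"
}
```

Running `list_xfs_on_vxvm` with no arguments reads the live mount table. Empty output means no XFS filesystem was found on a `/dev/vx/dsk/` device under the stated assumption.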

 

Resolution

Disable intelligent resource monitoring for a particular resource type (Mount agent type).

1) Make the VCS configuration writable.

# haconf -makerw

2) To disable intelligent resource monitoring for all the resources of a certain type, run the following command, replacing resource_type with the agent type (Mount for this issue):

# hatype -modify resource_type IMF -update Mode 0

3) Save the VCS configuration.

# haconf -dump -makero
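The three steps above can be sketched as a single shell function for the Mount type. This is an illustration only: the names `disable_imf_for_type` and `run_or_print` and the `DRY_RUN` variable are hypothetical, while the `haconf` and `hatype` invocations are exactly the ones listed in the steps. Setting `DRY_RUN=1` prints the commands instead of running them, which makes it possible to review them first.

```shell
# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run_or_print() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper around steps 1-3 for one resource type
# (defaults to Mount, the agent type affected in this article).
# The chain stops at the first failing command; if hatype fails, the
# configuration is left writable and must be saved manually.
disable_imf_for_type() {
    resource_type=${1:-Mount}
    run_or_print haconf -makerw &&
        run_or_print hatype -modify "$resource_type" IMF -update Mode 0 &&
        run_or_print haconf -dump -makero
}
```

For example, `DRY_RUN=1 disable_imf_for_type Mount` prints the three commands in order, and `disable_imf_for_type Mount` run as root on a cluster node executes them.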

Issue/Introduction

The server experienced a panic during reboot when attempting to unmount the filesystem. This problem was observed on RHEL 9.3 with InfoScale 8.0.2.1500.