System panic in the Asynchronous Monitoring Framework (AMF) driver after upgrading the SLES 12 SP1 kernel to version 3.12.59-60.45-default

Article ID: 100033034

Description

Error Message

Crash dump system message buffer:

crash> log -t 

[...]

BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
IP: [<ffffffffa0c3a519>] amf_ev_add_monitor_info+0x29/0xb0 [amf]
PGD bdb9376067 PUD bdf0a55067 PMD 0
Oops: 0000 [#1] SMP

[...]


Panic:
crash> bt
PID: 5743   TASK: ffff885c7437c200  CPU: 2   COMMAND: "MountAgent"
 #0 [ffff885c741655a0] machine_kexec at ffffffff8103c48c
 #1 [ffff885c741655e0] __crash_kexec at ffffffff810d752a
 #2 [ffff885c741656a0] crash_kexec at ffffffff810d75fd
 #3 [ffff885c741656b0] oops_end at ffffffff81527e68
 #4 [ffff885c741656d0] no_context at ffffffff8151b228
 #5 [ffff885c74165718] __do_page_fault at ffffffff8152a97c
 #6 [ffff885c74165810] page_fault at ffffffff81527208
    [exception RIP: amf_ev_add_monitor_info+41]
    RIP: ffffffffa0c3a519  RSP: ffff885c741658c0  RFLAGS: 00010286
    RAX: ffff885ecd5833c0  RBX: ffff885ecd5833c0  RCX: 0000000000000000
    RDX: 0000000000000038  RSI: 0000000000000000  RDI: ffff885ecd5833f8
    RBP: ffff885eca6d6980   R8: 000000000000000a   R9: ffff885ecd5833c0
    R10: 0000000000000000  R11: ffff885c7416571e  R12: 0000000000000030
    R13: ffff885e38ce1800  R14: ffff885c741659d0  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffff885c741658d8] amf_ev_setup_event at ffffffffa0c3a5c7 [amf]
 #8 [ffff885c74165920] amf_event_reg at ffffffffa0c3bc12 [amf]
 #9 [ffff885c74165970] amfioctl at ffffffffa0c55dc3 [amf]
#10 [ffff885c74165ec8] amf_ioctl at ffffffffa0c4b360 [amf]
#11 [ffff885c74165ee0] do_vfs_ioctl at ffffffff811bb744
#12 [ffff885c74165f40] sys_ioctl at ffffffff811bb9a8
#13 [ffff885c74165f80] system_call_fastpath at ffffffff8152ee89
    RIP: 00007f3adf4efbc7  RSP: 00007f3ade8e6cd0  RFLAGS: 00010246
    RAX: 0000000000000010  RBX: ffffffff8152ee89  RCX: 00007f3ade9eeaf0
    RDX: 00007f3ac800cbc0  RSI: 0000000046c0af04  RDI: 000000000000000c
    RBP: 000000000000000c   R8: 0000000000000004   R9: 00007f3ac8000078
    R10: 0000000000000009  R11: 0000000000000246  R12: 0000000000000001
    R13: 00000000004096c2  R14: 0000000000000001  R15: 00007f3ade8e7e30
    ORIG_RAX: 0000000000000010  CS: 0033  SS: 002b

Workaround

Disable IMF (Intelligent Monitoring Framework) for the Mount agent on each node in the cluster, then confirm the change with haimfconfig -display:
# haimfconfig -display
#Agent         STATUS
Apache         ENABLED
Application    ENABLED
CSSD           ENABLED
Db2udb         ENABLED
DiskGroup      ENABLED
Mount          ENABLED
Netlsnr        ENABLED
Oracle         ENABLED
Process        ENABLED
Sybase         ENABLED
SybaseBk       ENABLED

# haimfconfig -disable -agent Mount
VCS INFO V-16-52-53 Log created at '/var/VRTSvcs/log/haimfconfig_A.log'.
VCS WARNING V-16-52-59 This will change configuration files. Do you want to continue (yes/no)?: [yes] yes
IMF is successfully disabled for 'Mount' agent

# haimfconfig -display
#Agent         STATUS
Apache         ENABLED
Application    ENABLED
CSSD           ENABLED
Db2udb         ENABLED
DiskGroup      ENABLED
Mount          DISABLED
Netlsnr        ENABLED
Oracle         ENABLED
Process        ENABLED
Sybase         ENABLED
SybaseBk       ENABLED
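
To confirm the result on every node without reading the table by eye, the status column can be filtered with a small helper. This is a minimal sketch: imf_agent_status is a hypothetical function name, not a VCS command, and it assumes the two-column layout of the haimfconfig -display output shown above.

```shell
# Print the IMF status column for one agent, reading
# `haimfconfig -display` output on stdin.
# imf_agent_status is a hypothetical helper name, not part of VCS.
imf_agent_status() {
    # awk splits on runs of whitespace, so leading or trailing
    # spaces around the agent row do not matter.
    awk -v agent="$1" '$1 == agent { print $2 }'
}

# Usage on a cluster node (requires the VCS tools to be installed):
#   haimfconfig -display | imf_agent_status Mount
```

After the workaround has been applied, the usage line should print DISABLED for the Mount agent.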

Cause

A system may panic when VCS version 6.2.1 or 7.1, running on SLES 12 SP1 with kernel version 3.12.59-60.45, uses the Asynchronous Monitoring Framework (AMF) for the Mount agent. Kernel version 3.12.59-60.45 changed the kernel headers, and the installed AMF module was not built for that kernel. Running the module on this kernel leads to the NULL pointer dereference in amf_ev_add_monitor_info shown above, which panics the system. The AMF module must be recompiled to support kernel version 3.12.59-60.45.
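
Before or after a kernel upgrade, it can help to check whether a node is running the kernel build named above. The following is a sketch only: is_affected_kernel is a hypothetical helper name, and the pattern assumes the release string reported by uname -r begins with 3.12.59-60.45, optionally followed by a flavor suffix such as -default.

```shell
# Succeed (exit status 0) when the given kernel release string belongs to
# the 3.12.59-60.45 build named in this article.
# is_affected_kernel is a hypothetical helper name.
is_affected_kernel() {
    case "$1" in
        3.12.59-60.45|3.12.59-60.45[-.]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the running kernel on this node.
if is_affected_kernel "$(uname -r)"; then
    echo "Kernel $(uname -r) matches the affected build"
fi
```

The check only identifies the kernel build. Whether the node is actually exposed also depends on the VCS version and on IMF being enabled for the Mount agent.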

Resolution

Veritas has rebuilt the AMF module to support SLES 12 SP1 kernel version 3.12.59-60.45. No public patch has been released yet. Contact Veritas Support to obtain a private hotfix, which is available for the following versions:
* VRTSamf 6.2.1.200 --> VRTSamf 6.2.1.201
* VRTSamf 7.1.0 --> VRTSamf 7.1.0.001
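
Once the hotfix has been installed, the package version on each node can be compared against the two hotfix versions listed above. This is a rough sketch under stated assumptions: is_amf_hotfix_version is a hypothetical helper name, and it is assumed that the RPM VERSION tag of the VRTSamf package carries the same dotted string as the list above. Later releases that may also contain the fix are not covered.

```shell
# Succeed when the given VRTSamf version string is one of the two hotfix
# versions listed in this article. is_amf_hotfix_version is a hypothetical
# helper name; later releases are deliberately not covered.
is_amf_hotfix_version() {
    case "$1" in
        6.2.1.201|7.1.0.001) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumption: the RPM VERSION tag matches the dotted strings listed above.
# If rpm is missing or the package is not installed, treat it as not found.
installed=$(rpm -q --queryformat '%{VERSION}' VRTSamf 2>/dev/null) || installed=""
if is_amf_hotfix_version "$installed"; then
    echo "VRTSamf $installed is a listed hotfix version"
else
    echo "VRTSamf ${installed:-(not found)} is not a listed hotfix version"
fi
```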

Issue/Introduction

If IMF (Intelligent Monitoring Framework) is used to monitor Mount resources in VCS (Veritas Cluster Server), all nodes of the cluster panic after the SLES 12 SP1 kernel is upgraded to version 3.12.59-60.45.

Additional Information

ETrack: 3898808