System panic caused by VCS AMF driver in amf_uqget

Article ID: 100028953

Description

Error Message

A typical panic stack trace in Solaris:

CAT(vmcore.0/10U)> panic
panic on CPU 74
panic string:   BAD TRAP: type=31 rp=2a104ef1070 addr=0 mmu_fsr=0 occurred in module "amf" due to a NULL pointer dereference
==== panic user (LWP_SYS) thread: 0x3004dd44720  PID: 8468  on CPU: 74  affinity CPU: 74 ====
cmd: /opt/VRTSamf/bin/imfd
t_procp: 0x6014a932610
  p_as: 0x60148ea49c8  size: 9756672  RSS: 4186112
  hat: 0x3004cc35b00
    cnum: CPU0:5/1378 CPU2:1/1851 CPU4:1/1190 CPU8:1/358 CPU12:1/310 CPU32:12/347 CPU34:1/268 CPU36:1/220 CPU38:1/262 CPU40:1/154 CPU42:1/81 CPU44:1/91 CPU46:1/152 CPU64:13/23 CPU66:2/1900 CPU68:1/636 CPU70:1/566 CPU72:1/1293 CPU74:1/2276 CPU76:1/954 CPU78:1/1285 CPU96:15/127 CPU98:2/1595 CPU100:1/5124 CPU102:1/7123 CPU114:1/643 CPU122:1/685 CPU124:1/583
    cpusran: 0,2,3,4,5,9,12,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,96,97,98,99,100,101,102,114,122,124
  zone: global
t_stk: 0x2a104ef1ae0  sp: 0x191e971  t_stkbase: 0x2a104eec000
t_pri: 59(TS)ctf_field_read(thread.t_lbolt) failed: field "t_lbolt" not found in structure "kthread_t"
  t_tid: 1  pctcpu: 0.000000
t_lwp: 0x6014b283410  machpcb: 0x2a104ef1ae0
  mstate: LMS_SYSTEM  ms_prev: LMS_KFAULT
  ms_state_start: 0.0001285 seconds earlier
  ms_start: 1 days 16 hours 40 minutes 26.4118785 seconds earlier
psrset: 0  last CPU: 74
idle: 55022 ticks (9 minutes 10.22 seconds)
start: Fri Feb  8 13:55:51 2013
age: 146427 seconds (1 days 16 hours 40 minutes 27 seconds)
syscall: #54 ioctl(, 0xffbffa00) (sysent: genunix:ioctl+0x0)
tstate: TS_ONPROC - thread is being run on a processor
tflg:   T_PANIC - thread initiated a system panic
        T_DFLTSTK - stack is default size
tpflg:  TP_TWAIT - wait to be freed by lwp_wait
        TP_MSACCT - collect micro-state accounting information
tsched: TS_LOAD - thread is in memory
        TS_DONT_SWAP - thread/LWP should not be swapped
pflag:  SMSACCT - process is keeping micro-state accounting
        SMSFORK - child inherits micro-state accounting

pc:      unix:panicsys+0x48:   call     unix:setjmp

unix:panicsys+0x48(0x10a9098, 0x2a104ef0e18, 0x191f340, 0x1, , , 0x9900001606, , , , , , , , 0x10a9098, 0x2a104ef0e18)
unix:vpanic_common+0x78(0x10a9098, 0x2a104ef0e18, 0xabe, 0x2a104ef0911, 0x2a104ef1110, 0x1601)
unix:panic+0x1c(0x10a9098, 0x31, 0x2a104ef1070, 0x0, 0x0, 0x6013f1be210, 0x183a410)
unix:die+0x78(0x31, 0x2a104ef1070, 0x0, 0x0)
unix:trap+0x9e0(0x2a104ef1070, 0x0)
unix:ktl0+0x48()
-- trap data  type: 0x31 (data access MMU miss)  rp: 0x2a104ef1070  --
  addr: 0x0
pc:  0x7aa80f98 amf:amf_uqget+0x1e8:   ldsw     [%l0], %l0
npc: 0x7aa80f9c amf:amf_uqget+0x1ec:   sub        %l0, 0x1, %l0   ( dec   %l0 )
  global:                       %g1               0x60
        %g2                0x2  %g3               0x60
        %g4          0x1833000  %g5          0x1831000
        %g6              0x8e4  %g7      0x3004dd44720
  out:  %o0         0x7030f0a8  %o1         0x7030f060
        %o2                  0  %o3             0x66d8
        %o4                  0  %o5                0xc
        %sp      0x2a104ef0911  %o7         0x7aa80ee4
  loc:  %l0                  0  %l1         0x7030f058
        %l2      0x3000d88c1d8  %l3      0x2a104ef1344
        %l4         0xff33259c  %l5      0x2a104ef17f0
        %l6                  0  %l7          0x1977a88
  in:   %i0      0x2a104ef1570  %i1      0x2a104ef1344
        %i2                0xc  %i3      0x6014a95afa4
        %i4      0x60148d85740  %i5      0x6014a932610
        %fp      0x2a104ef09f1  %i7         0x7aa89318
amf:amf_uqget+0x1e8(0x2a104ef1570, 0x2a104ef1344, 0xc, 0x6014a95afa4, 0x60148d85740, 0x6014a932610)
amf:amfioctl+0x19d0(0x40046113, 0xffbffa64, 0x0, 0x4, 0x300028c1000, 0x9b8)
amf:amf_ioctl+0x7c(0x13700000000, 0x40046113, 0xffbffa64, 0x100003, 0x600e7c169a8, 0x2a104ef1adc)
specfs:spec_ioctl(0x60148d85740, 0x40046113, 0xffbffa64, 0x100003, , 0x2a104ef1adc) - frame recycled
genunix:fop_ioctl+0x2c(0x60148d85740, 0x40046113, 0xffbffa64, 0x100003, , 0x2a104ef1adc)
genunix:ioctl+0x184()
unix:syscall_trap32+0xcc()
-- switch to user thread's user stack --
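
To check whether another crash dump shows the same signature, the two identifying lines of the trace above (the panic string and the trap `pc:` line) can be compared programmatically. The following is a minimal sketch, not a Veritas tool: the function name `matches_amf_uqget_signature` and the regular expressions are assumptions written against the output format shown in this article, and other analyzer versions may format these lines differently.

```python
import re

# Assumed formats, taken from the trace in this article:
#   panic string: ... occurred in module "amf" due to a NULL pointer dereference
#   pc:  0x7aa80f98 amf:amf_uqget+0x1e8:   ldsw     [%l0], %l0
PANIC_RE = re.compile(r'occurred in module "(?P<module>[^"]+)" due to (?P<reason>.+)$')
TRAP_PC_RE = re.compile(r'^pc:\s+0x[0-9a-f]+\s+(?P<module>\w+):(?P<func>\w+)\+0x[0-9a-f]+:')


def matches_amf_uqget_signature(trace_text):
    """Return True if the trace panicked in module "amf" at amf_uqget."""
    panic_module = None
    trap_func = None
    for raw_line in trace_text.splitlines():
        line = raw_line.strip()
        panic = PANIC_RE.search(line)
        if panic:
            panic_module = panic.group("module")
        # Only the trap-data "pc:" line carries an address; the earlier
        # "pc: unix:panicsys+0x48" line and the "npc:" line do not match.
        trap_pc = TRAP_PC_RE.match(line)
        if trap_pc and trap_pc.group("module") == "amf":
            trap_func = trap_pc.group("func")
    return panic_module == "amf" and trap_func == "amf_uqget"


sample = '''panic string:   BAD TRAP: type=31 rp=2a104ef1070 addr=0 mmu_fsr=0 occurred in module "amf" due to a NULL pointer dereference
pc:      unix:panicsys+0x48:   call     unix:setjmp
pc:  0x7aa80f98 amf:amf_uqget+0x1e8:   ldsw     [%l0], %l0
npc: 0x7aa80f9c amf:amf_uqget+0x1ec:   sub        %l0, 0x1, %l0   ( dec   %l0 )'''

print(matches_amf_uqget_signature(sample))
```

A match only indicates that the panic occurred in the same function; the Cause section still has to be confirmed against the state of vxconfigd at the time of the panic.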

Cause

The libusnp_vxnotify.so library, which is used for disk group notifications, goes into an infinite loop when the vxconfigd daemon is unresponsive. This leaves AMF in an inconsistent state, and the AMF driver then panics the node with a NULL pointer dereference in amf_uqget.
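
The general failure pattern described above, a loop that waits indefinitely on an unresponsive daemon, is usually avoided by bounding the wait with a deadline. The sketch below illustrates that pattern only. It is not the code of libusnp_vxnotify.so and not the change Veritas made, which this article does not describe. The names `wait_for_daemon` and `probe` and the timeout values are hypothetical.

```python
import time


def wait_for_daemon(probe, timeout_s=5.0, interval_s=0.5,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll probe() until it returns True or timeout_s elapses.

    probe is a hypothetical callable that returns True once the daemon
    responds. Returns True on success and False on timeout, so the caller
    can report the failure instead of looping forever.
    """
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False  # give up; the daemon is still unresponsive
        sleep(interval_s)


# Usage with a stand-in probe that never succeeds and a fake clock,
# so the example finishes immediately instead of sleeping.
fake_time = {"now": 0.0}
attempts = []


def never_responds():
    attempts.append(1)
    return False


result = wait_for_daemon(
    never_responds,
    timeout_s=1.0,
    interval_s=0.5,
    clock=lambda: fake_time["now"],
    sleep=lambda s: fake_time.__setitem__("now", fake_time["now"] + s),
)
print(result, len(attempts))
```

The clock and sleep functions are passed as parameters so that the timeout behavior can be exercised without real delays.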

Resolution

Veritas has modified the AMF driver and the libusnp_vxnotify.so library to address this issue. Patches for both the VRTSamf and VRTSvcsag packages are required.

Issue/Introduction

The AMF driver panics the node when vxconfigd is unresponsive.

Additional Information

ETrack: 3090243