CRASH INFORMATION:
CPU 56 CSA F100091561146D00 at time of crash, error code for LEDs: 70000000
pvthread+153600 STACK:
[00008BF0].simple_unlockir+000070 ()
[07DA24A8]07DA24A8 ()
[07DB3B20]07DB3B20 ()
[07DA2FB0]07DA2FB0 ()
[000D464C]clock+0002CC (??)
[00177378]i_softmod+0004F8 ()
[00142770]flih_util+000258 ()
____ Exception (F00000002FF47600) ____
In AIX 7.2 TL5, IBM introduced a performance feature for the interrupt-disabled unlock path. The change asserts that the lock has no waiters when it is released, because correct usage requires that all threads contending for a given lock either run disabled (and spin rather than sleep) or all run enabled (intbase). If the lock holder is on another CPU, it eventually releases the lock and the disabled lock request succeeds. If strong error checking (e.g. errorcheckdetail 7) is enabled and the kernel detects incorrect locking semantics, it triggers an assertion (system panic).
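The rule described above can be illustrated with a small user-space model. This is only a sketch in Python, not AIX kernel code: the names ModelLock, LockDisciplineError, mixed_use_panics, "handler" and "worker" are hypothetical, and the strict flag merely stands in for strong error checking being enabled. The model covers a single case: a lock taken on the disabled path, a second thread that goes to sleep on it at intbase, and a disabled-path unlock that finds a waiter.

```python
class LockDisciplineError(AssertionError):
    """Stands in for the kernel assertion (a system panic on AIX)."""


class ModelLock:
    """Toy lock that tracks an owner and any sleeping (intbase) waiters."""

    def __init__(self, name, strict=True):
        self.name = name
        self.strict = strict      # models strong error checking being enabled
        self.owner = None
        self.sleepers = []

    def lock_disabled(self, thread):
        """Acquire on the interrupt-disabled path (uncontended case only)."""
        if self.owner is not None:
            raise RuntimeError("a disabled waiter would spin; not modeled here")
        self.owner = thread

    def lock_intbase(self, thread):
        """Acquire at intbase; returns False if the caller would go to sleep."""
        if self.owner is None:
            self.owner = thread
            return True
        self.sleepers.append(thread)
        return False

    def unlock_disabled(self):
        """Release on the disabled path, asserting that nobody is waiting."""
        if self.strict and self.sleepers:
            raise LockDisciplineError(
                f"{self.name}: disabled unlock with waiters {self.sleepers}"
            )
        self.owner = None  # the model does not wake sleepers


def mixed_use_panics(strict):
    """Replay mixed usage of one lock; True means the assertion fired."""
    lock = ModelLock("example_lock", strict=strict)
    lock.lock_disabled("handler")   # hypothetical disabled-path owner
    lock.lock_intbase("worker")     # hypothetical intbase thread, now asleep
    try:
        lock.unlock_disabled()
    except LockDisciplineError:
        return True
    return False
```

In this model the assertion fires only when the strict flag is set, which mirrors the statement that the panic occurs when a strong error check is enabled.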
Analysis of the core files showed that LLT used the lock with interrupts disabled on one path, while a different LLT thread tried to take the same lock at intbase.
LLT registered a timer with AIX, specifying interrupt priority INTTIMER. The timer handler was unlocking llt_poll_req_lock (the simple_unlockir frame in the crash stack above).
Another LLT thread, which polls the MAC address of the private link, was acquiring the same lock at INTBASE.
(60)> th pvthread+18D600
SLOT NAME STATE TID PRI RQ CPUID CL WCHAN
pvthread+18D600 6358 llt_poll SLEEP D60301 03C 80 257 llt_poll_req_lock slist_table+000A80
(80)> f 6358
pvthread+18D600 STACK:
[006CE9C0]slock+000580 (0000000000137D3C, 8000000000001032 [??])
[0000956C].simple_lock+00006C () <<<<<<<
[F1000915901131A0]F1000915901131A0 ()
[00014D70].hkey_legacy_gate+00004C ()
[07DA2620].llt_poll_procfunc+000080 ()
[00486230]procentry+000010 (??, ??, ??, ??)
[kdb_read_mem] no real storage @ FFFFFFFFFFF9220
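When comparing stacks like the ones above, it can help to pull out only the frames that resolve to lock routines. The following Python sketch does that for kdb-style frame lines. The helper names (FRAME_RE, LOCK_ROUTINES, lock_frames) are hypothetical, and the labels attached to simple_unlockir and simple_lock reflect how this article reads the two traces, not an official classification of AIX kernel services.

```python
import re

# Matches frames such as "[00008BF0].simple_unlockir+000070 ()".
# Frames that show only a raw address (no symbol+offset) do not match.
FRAME_RE = re.compile(r"^\[[0-9A-Fa-f]+\]\.?([A-Za-z_]\w*)\+[0-9A-Fa-f]+")

# Assumption: roles as interpreted in this article's analysis.
LOCK_ROUTINES = {
    "simple_unlockir": "disabled-path unlock",
    "simple_lock": "intbase lock",
}


def lock_frames(stack_text):
    """Return (symbol, role) for each frame that is a known lock routine."""
    found = []
    for line in stack_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match and match.group(1) in LOCK_ROUTINES:
            symbol = match.group(1)
            found.append((symbol, LOCK_ROUTINES[symbol]))
    return found


if __name__ == "__main__":
    sample = """\
[006CE9C0]slock+000580 (0000000000137D3C, 8000000000001032 [??])
[0000956C].simple_lock+00006C () <<<<<<<
[07DA2620].llt_poll_procfunc+000080 ()
"""
    for symbol, role in lock_frames(sample):
        print(symbol, "->", role)
```

Running the same helper over the crash stack and over the llt_poll stack gives one lock frame from each, which makes the mismatch between the two paths easy to see.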
LLT locking has been fixed in the VRTSllt 7.4.1.1101 hotfix.
A hotfix is now available for this issue in the current version(s) of the product(s) mentioned. Please contact Veritas Technical Support to obtain the hotfix.