Jan 19 17:45:06 systemA /scsi_vhci/disk@g5002361004752310 (sd0): Command Timeout on path mpt_sas23/disk@w5002361004752310,0: 198311c4af2b8c19
Jan 19 17:45:13 systemA /scsi_vhci/disk@g5002361004770660 (sd9): Command Timeout on path mpt_sas26/disk@w5002361004770660,0: 177040bd156f5935
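When many such messages are logged, it can help to tally them per device instance before engaging support. The following is a minimal sketch, assuming every message has exactly the shape of the two lines above. The field names (`device`, `instance`, `path`, `token`) and the helper `count_timeouts` are illustrative only, and the trailing hex value is captured without interpretation.

```python
import re
from collections import Counter

# Pattern modelled only on the two sample messages above; other message
# layouts are simply ignored. All group names are illustrative.
TIMEOUT_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"(?P<device>\S+)\s+\((?P<instance>\w+)\):\s+"
    r"Command Timeout on path\s+(?P<path>\S+):\s+(?P<token>[0-9a-fA-F]+)\s*$"
)

def parse_timeout(line):
    """Return the fields of a 'Command Timeout' message, or None if no match."""
    match = TIMEOUT_RE.match(line.strip())
    return match.groupdict() if match else None

def count_timeouts(lines):
    """Count 'Command Timeout' messages per device instance (e.g. sd0)."""
    counts = Counter()
    for line in lines:
        fields = parse_timeout(line)
        if fields is not None:
            counts[fields["instance"]] += 1
    return counts

SAMPLE = [
    "Jan 19 17:45:06 systemA /scsi_vhci/disk@g5002361004752310 (sd0): "
    "Command Timeout on path mpt_sas23/disk@w5002361004752310,0: 198311c4af2b8c19",
    "Jan 19 17:45:13 systemA /scsi_vhci/disk@g5002361004770660 (sd9): "
    "Command Timeout on path mpt_sas26/disk@w5002361004770660,0: 177040bd156f5935",
]

if __name__ == "__main__":
    for instance, count in sorted(count_timeouts(SAMPLE).items()):
        print(instance, count)
```

A steadily growing count for the same instance and path would be consistent with a command being retried repeatedly, but it does not prove the root cause on its own.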
It appears that Solaris 11.4 introduced a change in the handling of MHIOCSTATUS (cmd 4d04): a SCSI WRITE with zero length is now issued to the array to check access.
The vxconfigd stack shows that Veritas successfully passed ldi_handle cmd(4d04) along with the other arguments, and that Oracle then initiated a SCSI WRITE (0x2a) to handle cmd(4d04).
When dealing with a reserved device (for storage devices or arrays that are SPC-3 compliant), a TUR (Test Unit Ready) is issued first and is expected to succeed. The WRITE(10) with zero LBA is then issued to check the access rights, and this command is also expected to succeed. The command is retried until the expected response is received from the storage (i.e. it succeeds), so the thread can remain waiting (stuck) in biowait for quite some time.
Oracle was able to confirm that the NULL write I/Os were occurring on the SunFire F80 storage and that the issue was related to Oracle Bug 30748237, which was introduced in Solaris 11.4.16.4.0 and fixed in Solaris 11.4.21.69.0 and later. It is documented under the following alert:
Solaris 11.4 System I/O Failure Due To Solaris I/O Multipathing (MPxIO/scsi_vhci) Retry Counter Exhaustion For Asymmetric Storage (Doc ID 2652657.1)
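As a quick triage aid, a host's Solaris level can be compared against the range reported for Oracle Bug 30748237 (introduced in 11.4.16.4.0, fixed in 11.4.21.69.0 or later). The following is a minimal sketch, assuming the installed level has already been obtained as a dotted string of the same shape as those two versions. The helper names `parse_version` and `is_in_affected_range` are illustrative, and the result is only an indication, not a substitute for the Oracle alert.

```python
# Version boundaries taken from the text above.
INTRODUCED_IN = "11.4.16.4.0"
FIXED_IN = "11.4.21.69.0"

def parse_version(text):
    """Turn a dotted version string such as '11.4.16.4.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))

def is_in_affected_range(version):
    """True if version is at or after INTRODUCED_IN and before FIXED_IN."""
    current = parse_version(version)
    return parse_version(INTRODUCED_IN) <= current < parse_version(FIXED_IN)

if __name__ == "__main__":
    # Example inputs are arbitrary test strings, not a list of real releases.
    for candidate in ("11.4.15.5.0", "11.4.16.4.0", "11.4.21.69.0"):
        print(candidate, is_in_affected_range(candidate))
```

Comparing integer tuples avoids the pitfall of plain string comparison, where "11.4.9" would incorrectly sort after "11.4.21".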
If such a hang is encountered, Veritas recommends engaging Oracle to confirm whether the hang has the same root cause.