The following error messages indicate that the device paths were assigned different OS device numbers on the Linux platform.
MM DD HH:MM:SS server101 kernel: sd 4:0:0:3: Warning! Received an indication that the LUN assignments on this target have changed. The Linux SCSI layer does not automatically remap LUN assignments.
MM DD HH:MM:SS server101 kernel: sd 4:0:0:19: Warning! Received an indication that the LUN assignments on this target have changed. The Linux SCSI layer does not automatically remap LUN assignments.
On Solaris the following message is logged.
MM DD HH:MM:SS hosty scsi: [ID 243001 kern.info] /pci@1e,600000/pci@0/pci@9/pci@0/pci@8/SUNW,qlc@1/fp@0,0 (fcp1):
MM DD HH:MM:SS hosty FCP: Report Lun Has Changed target=b0500
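To check whether a host has logged either of the messages above, the system log can be searched for the two message strings. The following is a minimal sketch, not a supported tool: the function name find_lun_change_messages is hypothetical, and the default log path is an assumption (pass the file your syslog actually writes to). It also assumes a grep that accepts -E.

```shell
#!/bin/sh
# Hypothetical helper: print log lines that report changed LUN assignments.
# Usage: find_lun_change_messages [logfile]
# The default path below is an assumption; adjust it for your platform.
find_lun_change_messages() {
    logfile="${1:-/var/log/messages}"
    # First alternative: Linux SCSI layer warning.
    # Second alternative: Solaris FCP message.
    grep -E 'LUN assignments on this target have changed|Report Lun Has Changed' "$logfile"
}
```

For example, find_lun_change_messages /var/log/messages prints every matching line; no output means neither message was found in that file.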
One or more of the following symptoms may occur.
1. The DMP configuration is incorrect, as shown by the vxdisk list or vxdmpadm getsubpaths commands: paths to different LUNs are claimed under a single DMP device.
2. The vxconfigd process dumps core.
3. File system corruption messages are logged in the messages file.
4. The file system is disabled because serious corruption is detected.
When device paths are disconnected and later reconnected, the operating system releases their OS device numbers for reuse after a certain period. On Linux, for example, the length of this period is controlled by the configurable kernel parameter dev_loss_tmo. If the device paths are reconnected after this period has elapsed, the operating system is likely to assign different device numbers to them.
Currently, if only some of the paths (not all of them) are assigned different device numbers, the DCPA feature of DMP cannot handle the situation and the DMP configuration may become corrupt. If the corrupt configuration is uploaded to the DMP kernel driver, data can be written to the wrong disk, resulting in data corruption.
The issue can potentially affect all platforms, but it is more likely to occur on Linux.
The problem is fixed in the following patch releases.
Veritas Storage Foundation 5.1SP1RP4 on all platforms. (sfha-
Veritas Storage Foundation 6.0.3 Hot Fix 1 on all platforms. (vm-
The above patches can be downloaded from the Veritas Services and Operations Readiness Tools (SORT) website.
https://sort.Veritas.com/patch/matrix
Until the above patch is applied, the temporary workaround on Linux is to raise dev_loss_tmo from its default to a high value, which prevents device numbers from being reused when the fabric is lost and then restored.
Note that on Linux the maximum value of dev_loss_tmo is normally 600. As a workaround, create the file /etc/udev/rules.d/40-rport.rules with the following content line:
KERNEL=="rport-*", SUBSYSTEM=="fc_remote_ports", ACTION=="add", RUN+="/bin/sh -c 'echo 600 > /sys/class/fc_remote_ports/%k/dev_loss_tmo'"
The above will set dev_loss_tmo to 600 for all Fibre Channel HBA drivers. The maximum allowed value for dev_loss_tmo also depends on the actual Fibre Channel driver.
Check with the operating system vendor whether the FC driver supports a value greater than 600.
NOTE: The above workaround prevents the problem only if the disconnected device paths are reconnected within the dev_loss_tmo period. If they are reconnected after dev_loss_tmo has expired, the OS can still assign different device numbers and cause the problem.
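Once the udev rule is in place, it may be worth confirming that each remote port actually carries the intended timeout. The sketch below reads dev_loss_tmo for every rport entry under /sys/class/fc_remote_ports (the same path the rule writes to) and reports any port below a target value. The function name check_dev_loss_tmo and its two optional arguments are hypothetical, introduced only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: report FC remote ports whose dev_loss_tmo is below a target.
# Usage: check_dev_loss_tmo [target_seconds] [sysfs_dir]
# Returns 0 if every readable port meets the target, 1 otherwise.
check_dev_loss_tmo() {
    target="${1:-600}"
    dir="${2:-/sys/class/fc_remote_ports}"
    status=0
    for port in "$dir"/rport-*; do
        # Skip when the glob matched nothing or the attribute is not readable.
        [ -r "$port/dev_loss_tmo" ] || continue
        value=$(cat "$port/dev_loss_tmo")
        if [ "$value" -lt "$target" ]; then
            echo "${port##*/}: dev_loss_tmo=$value (below $target)"
            status=1
        fi
    done
    return "$status"
}
```

Running check_dev_loss_tmo 600 with no output and a zero exit status suggests the setting is in effect on all current ports. Because the rule matches only add events, ports that already existed when the rule file was created may keep their previous value until they are added again or the host is rebooted.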
Details of the above kernel parameter can be found in the following Red Hat document.