Disks are shown in the "error" state in the Volume Manager "vxdisk list" output:
# vxdisk -o alldgs list
DEVICE       TYPE            DISK         GROUP          STATUS
ibm_vscsi0_0 auto:LVM        -            -              LVM
ibm_vscsi0_1 auto:LVM        -            -              LVM
ibm_vscsi0_2 auto:LVM        -            -              LVM
ibm_vscsi0_3 auto:LVM        -            -              LVM
san_vc0_0    auto            -            -              error
san_vc0_1    auto            -            -              error
san_vc0_2    auto            -            -              error
san_vc1_0    auto            -            -              error
san_vc1_1    auto            -            -              error
san_vc1_2    auto            -            -              error
san_vc2_0    auto            -            -              error
san_vc2_1    auto            -            -              error
san_vc2_2    auto            -            -              error
san_vc3_0    auto:cdsdisk    sapGODdg03   sapGODdg       online nohotuse
san_vc3_1    auto:cdsdisk    sapGODdg04   sapGODdg       online nohotuse
san_vc3_2    auto:cdsdisk    -            (vxfencoorddg) online
san_vc3_3    auto:none       -            -              online invalid
-            -               sapGODdg01   sapGODdg       failed nohotuse was:san_vc1_0
-            -               sapGODdg02   sapGODdg       failed nohotuse was:san_vc1_1
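On a system with many devices, it can help to pull out only the device names that are in the error state from a listing like the one above. The following is a minimal sketch, assuming the STATUS column is the last field and reads exactly "error"; the function name error_devices is hypothetical and is not a VxVM command.

```shell
# Print the names of devices whose last field (STATUS) is exactly "error".
# Reads "vxdisk -o alldgs list" output on stdin.
# error_devices is a hypothetical helper name, not part of VxVM.
error_devices() {
    awk '$1 != "DEVICE" && $1 != "-" && $NF == "error" { print $1 }'
}

# Typical use on a live system:
#   vxdisk -o alldgs list | error_devices
```

The header line and the "failed" disk media records (which start with "-") are skipped so that only real device names are printed.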
Check for device errors and verify the configuration. Confirm that the operating system can see the device and read its OS label.
# vxdisk -o alldgs list | egrep "DEVICE|san_vc1_0"
DEVICE       TYPE            DISK         GROUP          STATUS
san_vc1_0    auto            -            -              error
-            -               sapGODdg01   sapGODdg       failed nohotuse was:san_vc1_0
As shown above, the device san_vc1_0 is in the "error" state, and the disk sapGODdg01 that previously resided on it (was:san_vc1_0) is in the "failed" state in Volume Manager.
# vxdisk list san_vc1_0
Device: san_vc1_0
devicetag: san_vc1_0
type: auto
flags: error private autoconfig
pubpaths: block=/dev/vx/dmp/san_vc1_0 char=/dev/vx/rdmp/san_vc1_0
guid: {945cc878-1dd1-11b2-8e63-0a297c632e49}
udid: IBM%5F2145%5F020063a08b20XX01%5F60050768018E822C80000000000001C9
site: -
errno: No such file or directory <<<====== Note This errno:
Multipathing information:
numpaths: 4
hdisk12 state=enabled
hdisk31 state=enabled
hdisk36 state=enabled
hdisk47 state=enabled
# lsdev -C | egrep "hdisk12|hdisk31|hdisk36|hdisk47"
hdisk12 Available 00-08-02 FC 2145
hdisk31 Available 00-08-02 FC 2145
hdisk36 Available 01-08-02 FC 2145
hdisk47 Available 01-08-02 FC 2145
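The path names in the "Multipathing information" section above can be turned into the egrep pattern used with lsdev without retyping them. This is only a sketch, assuming each path line has the form "<name> state=<state>" as in the output shown; path_pattern is a hypothetical helper name.

```shell
# Read "vxdisk list <device>" output on stdin and join the path names
# (first field of each "state=" line) into an egrep alternation,
# for example "hdisk12|hdisk31".
# path_pattern is a hypothetical helper name, not part of VxVM.
path_pattern() {
    awk '$2 ~ /^state=/ { printf "%s%s", sep, $1; sep = "|" }
         END { if (sep) print "" }'
}

# Typical use on a live system:
#   lsdev -C | egrep "$(vxdisk list san_vc1_0 | path_pattern)"
```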
As we can see, all paths to the device are enabled in Volume Manager and are in the Available state in the operating system. The clue for how to proceed is the "errno" field, which reports "No such file or directory".
So which file or directory does this refer to? It could be the dmpnode that vxconfigd creates for each device in /dev/vx/[r]dmp.
Let's check the files in those locations on the system:
# ls -l /dev/vx/rdmp
total 8
drwxr-xr-x 2 root system 256 Dec 11 07:11 .
drwxr-xr-x 6 root system 4096 Oct 17 14:58 ..
crw------- 1 root system 40, 2 Dec 11 07:11 ibm_vscsi0_0
crw------- 1 root system 40, 1 Dec 11 07:11 ibm_vscsi0_1
crw------- 1 root system 40, 3 Dec 11 07:11 ibm_vscsi0_2
crw------- 1 root system 40, 4 Dec 11 07:11 ibm_vscsi0_3
crw------- 1 root system 40, 7 Dec 11 07:11 san_vc3_0
crw------- 1 root system 40, 6 Dec 11 07:11 san_vc3_1
crw------- 1 root system 40, 5 Dec 11 07:11 san_vc3_2
crw------- 1 root system 40, 8 Dec 11 07:11 san_vc3_3
# ls -l /dev/vx/dmp
total 8
drwxr-xr-x 2 root system 256 Dec 11 07:11 .
drwxr-xr-x 6 root system 4096 Oct 17 14:58 ..
brw------- 1 root system 40, 2 Dec 11 07:11 ibm_vscsi0_0
brw------- 1 root system 40, 1 Dec 11 07:11 ibm_vscsi0_1
brw------- 1 root system 40, 3 Dec 11 07:11 ibm_vscsi0_2
brw------- 1 root system 40, 4 Dec 11 07:11 ibm_vscsi0_3
brw------- 1 root system 40, 7 Dec 11 07:11 san_vc3_0
brw------- 1 root system 40, 6 Dec 11 07:11 san_vc3_1
brw------- 1 root system 40, 5 Dec 11 07:11 san_vc3_2
brw------- 1 root system 40, 8 Dec 11 07:11 san_vc3_3
None of the san_vc0_*, san_vc1_* or san_vc2_* dmpnodes exist in /dev/vx/[r]dmp. In other words, every device in the error state is missing its dmpnodes in the OS.
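The comparison between the device list and the two directories can also be done mechanically. The sketch below assumes the device names arrive one per line on stdin and that the directories are the two locations named in this article; missing_dmpnodes is a hypothetical helper name, and it only tests that an entry exists, not its device type.

```shell
# Print each device name read from stdin that has no entry in the raw
# (character) directory or no entry in the block directory.
# The directory arguments default to the locations used in this article.
# missing_dmpnodes is a hypothetical helper name, not part of VxVM.
missing_dmpnodes() {
    rdmp_dir=${1:-/dev/vx/rdmp}
    dmp_dir=${2:-/dev/vx/dmp}
    while read -r dev; do
        [ -n "$dev" ] || continue
        if [ ! -e "$rdmp_dir/$dev" ] || [ ! -e "$dmp_dir/$dev" ]; then
            echo "$dev"
        fi
    done
}

# Typical use on a live system:
#   vxdisk -o alldgs list |
#       awk '$1 != "DEVICE" && $1 != "-" { print $1 }' |
#       missing_dmpnodes
```

If the output matches the set of devices in the error state, missing dmpnodes are the likely cause.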
The usual solution is to run "vxdctl enable" to rescan the device tree and rebuild the missing dmpnodes for the devices.
However, if "vxdctl enable" does not resolve the problem, something else is preventing the dmpnodes from being created as character/block devices in /dev/vx/[r]dmp.
To troubleshoot this, either restart vxconfigd in debug mode, or simply enable debug mode on the running daemon, and then run "vxdctl enable".
Here we set vxconfigd to debug level 9, logging to /var/tmp/vxconfigd.log:
# vxdctl debug 9 /var/tmp/vxconfigd.log
# vxdctl enable
Searching vxconfigd.log for the device name then shows the following messages (excerpt only):
12/11 09:30:03: VxVM vxconfigd DEBUG V-5-1-20300 assign_disk_local_name: Assign name san_vc1_0 with flag 0x2 to disk with devno 0x280011
12/11 09:30:03: VxVM vxconfigd DEBUG V-5-1-21656 ddl_set_alias_property: Associating alias hdisk12 of type 0 with DMP device san_vc1_0
12/11 09:30:03: VxVM vxconfigd DEBUG V-5-1-14467 Disk is /dev/rhdisk12, DMP node is san_vc1_0
12/11 09:30:03: VxVM vxconfigd DEBUG V-5-1-21656 ddl_set_alias_property: Associating alias hdisk31 of type 0 with DMP device san_vc1_0
12/11 09:30:03: VxVM vxconfigd DEBUG V-5-1-14467 Disk is /dev/rhdisk31, DMP node is san_vc1_0
12/11 09:30:03: VxVM vxconfigd DEBUG V-5-1-21656 ddl_set_alias_property: Associating alias hdisk36 of type 0 with DMP device san_vc1_0
12/11 09:30:03: VxVM vxconfigd DEBUG V-5-1-14467 Disk is /dev/rhdisk36, DMP node is san_vc1_0
12/11 09:30:03: VxVM vxconfigd DEBUG V-5-1-21656 ddl_set_alias_property: Associating alias hdisk47 of type 0 with DMP device san_vc1_0
12/11 09:30:03: VxVM vxconfigd DEBUG V-5-1-14467 Disk is /dev/rhdisk47, DMP node is san_vc1_0
12/11 09:30:03: VxVM vxconfigd DEBUG V-5-1-15012 dmp_make_mpnode(thread 1078): devno 0x280011 device tag = san_vc1_0
12/11 09:30:03: VxVM vxconfigd DEBUG V-5-1-15020 dmp_make_mpnode:raw pathname = /dev/vx/rdmp//san_vc1_0
12/11 09:30:03: VxVM vxconfigd DEBUG V-5-1-15019 dmp_make_mpnode:block pathname = /dev/vx/dmp//san_vc1_0
12/11 09:30:03: VxVM vxconfigd DEBUG V-5-1-0 mknod: Cannot make node /dev/vx/rdmp//san_vc1_0: No space left on device
12/11 09:30:03: VxVM vxconfigd DEBUG V-5-1-0 mknod: Cannot make node /dev/vx/dmp//san_vc1_0:No space left on device
As the above excerpt of the vxconfigd debug log shows, the error clearly identifies the problem: "Cannot make node ... No space left on device".
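At debug level 9 the log grows quickly, so it can be easier to extract just the mknod failures than to read the whole file. This is a minimal sketch that assumes the message format seen in the excerpt ("mknod: Cannot make node <path>: <reason>"); mknod_failures is a hypothetical helper name.

```shell
# Read a vxconfigd debug log on stdin and print one line per distinct
# mknod failure in the form "<node path>: <reason>".
# Assumes the "mknod: Cannot make node <path>:<reason>" message format
# seen in the excerpt; mknod_failures is a hypothetical helper name.
mknod_failures() {
    sed -n 's/.*mknod: Cannot make node \([^:]*\): *\(.*\)$/\1: \2/p' | sort -u
}

# Typical use on a live system:
#   mknod_failures < /var/tmp/vxconfigd.log
```

Once the log has been captured, debug logging is normally switched off again with "vxdctl debug 0"; confirm this against the documentation for your release.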
The next step is to check space utilization on the root filesystem, which is where the /dev directory resides.
# df -k
Filesystem    1024-blocks      Free %Used    Iused %Iused Mounted on
/dev/hd4           524288         0  100%     9916    88% /
As we can see, the root filesystem is 100% utilized, with 0 free blocks. This explains why dmpnodes could not be created for some devices while other devices already had theirs, and why some or all devices can appear in the "error" state in Volume Manager even though their underlying DMP paths are enabled. This is not a Volume Manager fault but an underlying OS condition that prevents Volume Manager from creating the required dmpnodes for the devices.
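The same check can be scripted so that a full root filesystem is caught before it causes missing dmpnodes. The sketch below assumes the AIX "df -k" column layout shown above (Free in the third field, mount point last); the function name root_fs_free_kb and the 10240 KB default threshold are illustrative choices, not recommended values.

```shell
# Read AIX-style "df -k" output on stdin and report free space on "/".
# Assumes Free is field 3, %Used is field 4, and the mount point is last.
# Returns non-zero if "/" is not found or has less than the minimum free.
# root_fs_free_kb and the 10240 KB default are illustrative assumptions.
root_fs_free_kb() {
    min_kb=${1:-10240}
    awk -v min="$min_kb" '
        $NF == "/" {
            found = 1
            free = $3
            printf "root filesystem: %d KB free (%s used)\n", free, $4
        }
        END {
            if (found && free + 0 >= min + 0) exit 0
            exit 1
        }
    '
}

# Typical use on a live system:
#   df -k / | root_fs_free_kb || echo "root filesystem is low on space"
```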
If the issue matches the scenario described above, it can be resolved with the following steps:
1.) Free up sufficient space in the root (/) filesystem so that the dmpnodes can be created in /dev/vx/[r]dmp. In any case, it is unwise to keep running the operating system with the root filesystem at 100% utilization.
2.) Run "vxdctl enable".
3.) If step 2 does not resolve the problem and devices are still seen in the error state, restart the vxconfigd daemon. (Please note: if the system is in a VCS configuration, it is wise to freeze the system before running "vxconfigd -k".)
# vxconfigd -k
4.) If restarting vxconfigd does not resolve the issue either, troubleshoot further, as more than one issue may be contributing to the problem. Contact Technical Support if the above steps do not resolve the problem.