How to restore a disk marked failed because of a link failure or I/O transport failure rather than a bad disk
Article ID: 100006454
Resolution
If vxdisk list shows a disk as failed because a link went offline or an I/O transport failure occurred, and not because the disk itself is bad, the following steps can help to recover the volume:
1. Restore the VERITAS Volume Manager (tm) information to the failed disk. First, identify the failed disks in the vxdisk list output:
# vxdisk list
- - testdg01 testdg failed was:c1t12d0s2
- - testdg02 testdg failed was:c1t13d0s2
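As a minimal sketch, the failed entries can be picked out of `vxdisk list` output with awk. The function name `list_failed_disks` is hypothetical, and the field positions assume the failed-line layout shown above (device, type, disk, group, status, then the `was:` field).

```shell
# Hypothetical helper: read `vxdisk list` output on stdin and print
# "<disk media name> <last known device>" for each failed disk.
# Assumes the failed-line layout shown in this article.
list_failed_disks() {
    awk '$5 == "failed" && $6 ~ /^was:/ {
        sub(/^was:/, "", $6)   # strip the "was:" prefix from the device name
        print $3, $6
    }'
}

# Example using the sample output from this article:
list_failed_disks <<'EOF'
- - testdg01 testdg failed was:c1t12d0s2
- - testdg02 testdg failed was:c1t13d0s2
EOF
```

On a live system the same function would be fed directly, for example `vxdisk list | list_failed_disks`.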
2. Use the vxdiskadm utility to remove and replace the failed drive:
# vxdiskadm
Choose item no. 4, 'Remove a disk for replacement'.
Note: The data in the public region will not be lost.
Before using the disk as a replacement, initialize it in another session:
# vxdisksetup -if accessname
Note: If the disk did not use the default format or attributes, initialize it with the same format and attributes as before.
Note: Reinitialize the disk only to restore Volume Manager information to the private region.
Then choose item no. 5, 'Replace a failed or removed disk'.
After running items 4 and 5, the vxdisk list output should be similar to the following output:
# vxdisk list
c1t12d0s2 sliced testdg01 testdg online
c1t13d0s2 sliced testdg02 testdg online
c1t14d0s2 sliced testdg03 testdg online
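Before moving on to volume recovery, it may help to confirm that no disk in the group is still reported as failed. The following is a sketch only: the function name `all_disks_online` is hypothetical, and it assumes the five-column layout shown above, with the disk group in column 4 and the status in column 5.

```shell
# Hypothetical helper: read `vxdisk list` output on stdin and succeed only
# if at least one disk of the given disk group is listed and every one of
# them has the status "online".
all_disks_online() {
    awk -v dg="$1" '
        $4 == dg { seen = 1; if ($5 != "online") bad = 1 }
        END { exit ((seen && !bad) ? 0 : 1) }
    '
}

# Example using the sample output from this article:
if all_disks_online testdg <<'EOF'
c1t12d0s2 sliced testdg01 testdg online
c1t13d0s2 sliced testdg02 testdg online
c1t14d0s2 sliced testdg03 testdg online
EOF
then
    echo "all testdg disks are online"
fi
```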
3. Recover the volume by performing the following:
# vxprint -htg testdg
dm testdg01 c1t12d0s2 sliced 2179 8920560
dm testdg02 c1t13d0s2 sliced 2179 8920560
dm testdg03 c1t14d0s2 sliced 2179 8920560
v testvol - DISABLED ACTIVE 204800 RAID -
pl testvol-01 testvol DISABLED RECOVER 204864 CONCAT - RW
sd testdg01-01 testvol-01 testdg01 0 102460 c1t12d0 ENA
sd testdg02-01 testvol-01 testdg02 0 102460 c1t13d0 ENA
sd testdg03-01 testvol-01 testdg03 0 102460 c1t14d0 ENA
# vxmend -g testdg -o force off testvol-01
# vxmend -g testdg on testvol-01
Note: This changes the plex state from RECOVER to STALE
# vxprint -htg testdg
dm testdg01 c1t12d0s2 sliced 2179 8920560
dm testdg02 c1t13d0s2 sliced 2179 8920560
dm testdg03 c1t14d0s2 sliced 2179 8920560
v testvol - DISABLED ACTIVE 204800 RAID -
pl testvol-01 testvol DISABLED STALE 204864 CONCAT - RW
sd testdg01-01 testvol-01 testdg01 0 102460 c1t12d0 ENA
sd testdg02-01 testvol-01 testdg02 0 102460 c1t13d0 ENA
sd testdg03-01 testvol-01 testdg03 0 102460 c1t14d0 ENA
# vxmend -g testdg fix clean testvol-01
Note: This puts the plex in the CLEAN state
# vxprint -htg testdg
dm testdg01 c1t12d0s2 sliced 2179 8920560
dm testdg02 c1t13d0s2 sliced 2179 8920560
dm testdg03 c1t14d0s2 sliced 2179 8920560
v testvol - DISABLED ACTIVE 204800 RAID -
pl testvol-01 testvol DISABLED CLEAN 204864 CONCAT - RW
sd testdg01-01 testvol-01 testdg01 0 102460 c1t12d0 ENA
sd testdg02-01 testvol-01 testdg02 0 102460 c1t13d0 ENA
sd testdg03-01 testvol-01 testdg03 0 102460 c1t14d0 ENA
# vxvol -g testdg start testvol
# vxprint -htg testdg
dm testdg01 c1t12d0s2 sliced 2179 8920560
dm testdg02 c1t13d0s2 sliced 2179 8920560
dm testdg03 c1t14d0s2 sliced 2179 8920560
v testvol - ENABLED ACTIVE 204800 RAID -
pl testvol-01 testvol ENABLED ACTIVE 204864 CONCAT - RW
sd testdg01-01 testvol-01 testdg01 0 102460 c1t12d0 ENA
sd testdg02-01 testvol-01 testdg02 0 102460 c1t13d0 ENA
sd testdg03-01 testvol-01 testdg03 0 102460 c1t14d0 ENA
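The state transitions in step 3 can be summarized in a small sketch. Both function names, `plex_states` and `next_step`, are hypothetical. `plex_states` assumes the `pl` record layout shown above, and `next_step` only prints the command that this article applies to each plex state; it does not run anything.

```shell
# Hypothetical helper: read `vxprint -ht` output on stdin and print
# "<plex> <kernel state> <state>" for each plex record (lines starting "pl").
plex_states() {
    awk '$1 == "pl" { print $2, $4, $5 }'
}

# Hypothetical helper: print the command(s) this article uses for a plex
# in the given state. Nothing is executed.
next_step() {
    dg=$1 vol=$2 plex=$3 state=$4
    case $state in
        RECOVER) echo "vxmend -g $dg -o force off $plex"
                 echo "vxmend -g $dg on $plex" ;;
        STALE)   echo "vxmend -g $dg fix clean $plex" ;;
        CLEAN)   echo "vxvol -g $dg start $vol" ;;
        *)       echo "# no step in this article applies to state $state" ;;
    esac
}

# Example using a plex record from this article:
echo 'pl testvol-01 testvol DISABLED RECOVER 204864 CONCAT - RW' | plex_states
next_step testdg testvol testvol-01 RECOVER
```

Printing the commands rather than running them keeps each state change a deliberate manual step, with a `vxprint -htg` check in between as shown above.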
4. Run a file system check and mount the file system. The commands below are for a 'vxfs' file system:
# fsck -F vxfs /dev/vx/rdsk/testdg/testvol
Note: The file system is clean - log replay is not required
# mount -F vxfs /dev/vx/dsk/testdg/testvol /testvol
# df -k /testvol
File system kbytes used avail capacity mounted on
/dev/vx/dsk/testdg/testvol
44429208 39734216 4658376 90% /testvol
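Step 4 can be wrapped so that the mount is attempted only when the check succeeds. This is a sketch under assumptions: the names `run`, `check_and_mount`, and `DRY_RUN` are hypothetical, and it assumes that a zero exit status from fsck means the file system is safe to mount.

```shell
# Hypothetical wrapper: with DRY_RUN=1, print a command instead of running it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: check a vxfs volume, then mount it only if the
# check returned a zero exit status.
check_and_mount() {
    dg=$1 vol=$2 mnt=$3
    run fsck -F vxfs "/dev/vx/rdsk/$dg/$vol" &&
    run mount -F vxfs "/dev/vx/dsk/$dg/$vol" "$mnt"
}

# Dry run with the names used in this article; prints the two commands.
DRY_RUN=1
check_and_mount testdg testvol /testvol
```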