From time to time, a vxvm file system or volume is not properly unmounted. One should look into what is causing this by examining the /var/log/messages file. In most cases, the cause is either a hard power-off of the server, which leaves the file system improperly unmounted, or some sort of hardware error.
When there is a problem with the file system, one could find one of several messages displayed on standard output, in the /var/log/messages log, and possibly in the /var/log/boot.log file:
UX:vxfs mount.vxfs: ERROR: V-3-21268: /dev/vx/dsk/DDG_Storage/LV_Storage is corrupted. needs checking.
UX:vxfs mount.vxfs: ERROR: V-3-21268: /dev/vx/dsk/DDG_Storage_Data/LV_Storage_Data is corrupted. needs checking.
kernel: vxfs msgcnt 12 offset 0x00001fd0 0 0 0 0
kernel: vxfs msgcnt 12 offset 0x00001fe0 0 0 0 0
kernel: vxfs msgcnt 12 offset 0x00001ff0 0 0 0 0
kernel: vxfs: msgcnt 13 mesg 003: V-2-3: vx_mapbad - vx_extfind - /dev/vx/dsk/dg1-Storage/Storage file system free extent bitmap in au 19090 marked bad
kernel: vxfs msgcnt 13 offset 0x00000000 0 0 0 0
kernel: vxfs msgcnt 13 offset 0x00000010 0 0 0 0
kernel: vxfs msgcnt 13 offset 0x00000020 0 0 0 0
kernel: vxfs msgcnt 13 offset 0x00000030 0 0 0 0
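To find out which volumes are affected, the messages above can be filtered out of the log. The following is a minimal sketch, not an official Veritas tool: the function name vxfs_flagged_devices is hypothetical, and the two patterns are simply the identifiers quoted in the messages above (V-3-21268 and vx_mapbad). Other VxFS error messages would need their own patterns.

```shell
# Print each /dev/vx/dsk/... device that the log flags as damaged, once per device.
# Log text is read from stdin, so any log file can be fed in.
# Assumption: only the two message types quoted in this article are matched.
vxfs_flagged_devices() {
    grep -E 'V-3-21268|vx_mapbad' |
        grep -oE '/dev/vx/dsk/[^ ]+' |
        LC_ALL=C sort -u
}
```

It could be run as vxfs_flagged_devices < /var/log/messages, and again against /var/log/boot.log if that file exists.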
There are two ways to resolve this. The safe method replays the journal to commit it to the file system. The not so safe method performs a full file system check on the partition. In some cases, the second method could lead to removing all of the volumes and starting over, which might mean losing data. If the first method does not work, fsck will tell you that a full file system check is required.
In summary, THE FIRST METHOD SHOULD BE USED to start this process. Only when told to do so by fsck should one perform the second method. In addition, one should seek guidance from the storage foundation team before performing a full file system check on a vxvm volume.
To perform the first method, follow these steps:
1. Stop all services:
service cron stop
service puredisk stop
service ldap stop
service vxatd stop
service syslog stop
2. Perform an fsck on the volume with the vxfs errors:
#/opt/VRTS/bin/ -t /dev/vx/rdsk//
So in our example above, it would look something like this:
/opt/VRTS/bin/fsck -t vxfs /dev/vx/rdsk/DDG_Storage_Data/LV_Storage_Data
This command will attempt to replay the journal to the file system to resolve the corruption. You will see the following message during this process:
# /opt/VRTS/bin/fsck -t vxfs /dev/vx/rdsk//
log replay in progress.
If the fsck run works, you will see the following message:
replay complete - marking super-block as CLEAN.
3. If this works, attempt to manually mount the file system.
4. If the file system can be mounted, verify that it is mounted by typing df -h.
5. If you were able to mount the file system, verify that the data is still there.
6. If the above example does not work, you will get an error such as the following:
log replay in progress
log replay failed to clean the file system
the file system is not clean, full fsck required
full file system check required, exiting...
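The first method can be sketched as a few shell helpers. This is only an illustration under stated assumptions: the names DRY_RUN, run, stop_services, raw_device and classify_fsck_output are hypothetical, the service list is the one from step 1, and the strings matched are the fsck messages as quoted in this article, which may differ between VxFS releases. By default nothing is executed; commands are only printed.

```shell
# Sketch of the journal-replay method. All helper names here are illustrative
# and are not part of any Veritas tool.

# With DRY_RUN=1 (the default here) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Stop the services listed in step 1, in the same order.
stop_services() {
    for svc in cron puredisk ldap vxatd syslog; do
        run service "$svc" stop
    done
}

# Map a block device path (/dev/vx/dsk/...) to its raw counterpart
# (/dev/vx/rdsk/...), which is the form passed to fsck in this article.
raw_device() {
    printf '%s\n' "$1" | sed 's|^/dev/vx/dsk/|/dev/vx/rdsk/|'
}

# Read fsck output on stdin and print one of:
# clean, full-fsck-required, unknown.
# Assumption: the message wording matches what is quoted in this article.
classify_fsck_output() {
    output=$(cat)
    case "$output" in
        *"full fsck required"*) echo "full-fsck-required" ;;
        *"replay complete"*)    echo "clean" ;;
        *)                      echo "unknown" ;;
    esac
}
```

In use, one would call stop_services, then pipe the output of /opt/VRTS/bin/fsck -t vxfs on the path returned by raw_device into classify_fsck_output. A result of full-fsck-required corresponds to the error shown in step 6.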
If you get this error message, or the one at boot like the screenshot above, you will have to perform a full file system check. As stated earlier, a full file system check is a risky task, but at this point it is about the only option you have. To perform it, type the following:
1. /opt/VRTS/bin/fsck -t vxfs -y -o full /dev/vx/rdsk//
2. The best thing to do at this stage is to seek guidance from storage foundation, but usually answering yes to the prompts will work to clean the file system. It should be noted that in some cases, after the changes have been made, you might be able to mount the volume, but the data might be gone.
/opt/VRTS/bin/fsck -t vxfs -y -o full /dev/vx/rdsk//
log replay in progress
pass0 - checking structural files
pass1 - checking inode sanity and blocks
pass2 - checking directory linkage
fileset 999 no lost+found, rebuild ? (ynq)y
pass3 - checking reference counts
pass4 - checking resource maps
fileset 999 au0 imap incorrect - fix (ynq)y
fileset 999 iau0 summary incorrect - fix? (ynq)y
OK to clear log? (ynq)y
flush fileset headers? (ynq)y
set state to CLEAN? (ynq)y
3. Once this is complete, mount the volume and confirm if the data is still there.
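The final check (is the volume mounted, and is anything left on it) can also be scripted. The sketch below is an assumption-laden illustration: is_mounted and visible_entries are hypothetical helper names, the mount table is read from /proc/mounts as on Linux, and an entry count only shows that the top-level directory is not empty. It does not prove that the data is intact.

```shell
# Return 0 if the given mount point appears in the mount table.
# The second argument defaults to /proc/mounts and exists mainly for testing.
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Count the entries directly under a directory, ignoring lost+found,
# which fsck may have rebuilt during the full check.
visible_entries() {
    find "$1" -mindepth 1 -maxdepth 1 ! -name 'lost+found' | wc -l | tr -d ' '
}
```

For a hypothetical mount point /Storage/data, is_mounted /Storage/data && visible_entries /Storage/data prints the number of top-level entries once the mount has succeeded. A count of 0 suggests the data is gone, as warned above.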