fsck -o nolog doesn't fix CFS file system with corrupt intent log

Article ID: 100015948

Description

Error Message

The filesystem is corrupt because the intent log contains invalid entries. As a result, a plain fsck fails while replaying the intent log.

# /opt/VRTS/bin/fsck /dev/vx/rdsk/cvmdg/avol01
log replay in progress
UX:vxfs fsck: ERROR: V-3-26112: bc_rgetblk bad request, devid/blknum = 4294967295/9223372036854775807, sz = 1024

Running fsck with the "-o full" option reports the same error.
 
# /opt/VRTS/bin/fsck -o full /dev/vx/rdsk/cvmdg/avol01
log replay in progress
UX:vxfs fsck: ERROR: V-3-26112: bc_rgetblk bad request, devid/blknum = 4294967295/9223372036854775807, sz = 1024
 
***** NOTE NOTE NOTE *****
If a filesystem was not cleanly unmounted, the last metadata updates made before the filesystem was disabled are held in the intent log. That metadata is a crucial part of the overall filesystem metadata. Running fsck with the "-o nolog" option skips the intent log replay. Without the replay, the filesystem remains inconsistent, so fsck reports the inconsistency and treats the filesystem as corrupt. fsck then tries to repair the inconsistency by removing files that it deems corrupt, and those files are lost permanently.

Before running fsck with the "-o nolog" option, mount the filesystem read-only and make a full backup first. You can also run "fsck -o nolog" with the "-n" option to assess the damage that would be done to the filesystem before running the actual "fsck -o nolog" that removes the files.
***** NOTE NOTE NOTE ******
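
The precautions in the note can be sketched as a short sequence of commands. This is only an illustration, not a vendor procedure: the helper name preview_cmds and the mount point /mnt/avol01_ro are hypothetical, the device paths are the ones used in this article, and the script only prints the commands rather than running them. The backup step itself is left to whatever tool is normally used at the site.

```shell
#!/bin/sh
# Sketch only: prints the commands for a non-destructive assessment.
# The fsck path and device names come from the examples in this article;
# the mount point passed below is a hypothetical name.
FSCK=/opt/VRTS/bin/fsck

preview_cmds() {
    rawdev=$1   # character device, e.g. /dev/vx/rdsk/cvmdg/avol01
    blkdev=$2   # block device, e.g. /dev/vx/dsk/cvmdg/avol01
    mnt=$3      # hypothetical read-only mount point

    # 1. Mount read-only so that a full backup can be taken first.
    echo "mount -t vxfs -o ro $blkdev $mnt"
    # 2. After the backup completes, unmount before checking.
    echo "umount $mnt"
    # 3. Dry run: -n answers "no" to every prompt, so nothing is changed.
    echo "$FSCK -n -o full,nolog $rawdev"
}

preview_cmds /dev/vx/rdsk/cvmdg/avol01 /dev/vx/dsk/cvmdg/avol01 /mnt/avol01_ro
```

Reviewing the output of the dry run before repeating the command without "-n" shows which directory blocks fsck intends to clear.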


Fixing the filesystem by running fsck with the "-o nolog" option:
 
# /opt/VRTS/bin/fsck -o full,nolog /dev/vx/rdsk/cvmdg/avol01
pass0 - checking structural files
pass1 - checking inode sanity and blocks
pass2 - checking directory linkage
Thu Aug  7 20:47:24 PDT 2014
fileset 999 directory 255055 block devid/blknum 0/85113 bad hash count 0 clear block? (ynq)y
fileset 999 directory 255064 block devid/blknum 0/85114 bad hash count 0 clear block? (ynq)y
fileset 999 directory 257071 block devid/blknum 0/85340 bad hash count 0 clear block? (ynq)y
fileset 999 directory 257080 block devid/blknum 0/85341 bad hash count 0 clear block? (ynq)y
fileset 999 directory 259087 block devid/blknum 0/85568 bad hash count 0 clear block? (ynq)y
.....
fileset 999 iau 28 summary incorrect - fix? (ynq)y
fileset 999 iau 29 summary incorrect - fix? (ynq)y
fileset 999 iau 30 summary incorrect - fix? (ynq)y
fileset 999 iau 31 summary incorrect - fix? (ynq)y
fileset 999 iau 32 summary incorrect - fix? (ynq)y
free block count incorrect 2078114 expected 1949204 fix? (ynq)y
free extent vector incorrect fix? (ynq)y
OK to clear log? (ynq)y
flush fileset headers? (ynq)y
set state to CLEAN? (ynq)y
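
A transcript like the one above can be long, so it may help to save it to a file and count the destructive prompts. The following is a minimal sketch under the assumption that the prompts are worded exactly as in the sample output; the function name summarize_fsck is hypothetical and the sample lines are copied from this article.

```shell
#!/bin/sh
# Sketch: summarize a saved fsck transcript (for example from a "-n" dry run).
# Assumes the prompt wording matches the sample output in this article.
summarize_fsck() {
    log=$1
    # Directory blocks that fsck proposes to clear.
    cleared=$(grep -cF 'clear block? (ynq)' "$log")
    # iau summaries that fsck proposes to fix.
    iau=$(grep -cF 'summary incorrect - fix? (ynq)' "$log")
    echo "directory blocks to clear: $cleared"
    echo "iau summaries to fix: $iau"
}

# Demonstration with three lines taken from the transcript above.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
fileset 999 directory 255055 block devid/blknum 0/85113 bad hash count 0 clear block? (ynq)y
fileset 999 directory 255064 block devid/blknum 0/85114 bad hash count 0 clear block? (ynq)y
fileset 999 iau 28 summary incorrect - fix? (ynq)y
EOF
summarize_fsck "$tmp"
rm -f "$tmp"
```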

The above fsck run should fix the filesystem inconsistency and clear the intent logs. However, due to a deficiency in fsck with the "-o nolog" option, the filesystem is not fixed properly. Attempting to mount the filesystem causes VxFS to mark it as corrupt again, and a full fsck is required again.


# mount -t vxfs -o cluster,rw /dev/vx/dsk/cvmdg/avol01 /avol01
UX:vxfs mount.vxfs: ERROR: V-3-26881: Cannot be mounted until it has been cleaned by fsck. Please run "fsck -t vxfs -y /dev/vx/rdsk/cvmdg/avol01" before mounting

/var/log/messages:
Aug  7 20:50:42 server101 kernel: vxfs: msgcnt 48 mesg 021: V-2-21: vx_fs_init - /dev/vx/dsk/cvmdg/avol01 file system validation failure
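
When scripting the mount attempt, the failure shown above can be recognized by its message ID. The sketch below assumes the mount error text has been captured and is passed on standard input; the function name mount_needs_fsck is hypothetical, and the message ID is the one quoted in this article.

```shell
#!/bin/sh
# Sketch: classify captured mount output by the message ID from this article.
mount_needs_fsck() {
    # Reads mount's error text on stdin; succeeds if V-3-26881 is present.
    grep -q 'V-3-26881'
}

# Demonstration using the start of the error line shown above.
sample='UX:vxfs mount.vxfs: ERROR: V-3-26881: Cannot be mounted until it has been cleaned by fsck.'
if printf '%s\n' "$sample" | mount_needs_fsck; then
    echo "mount refused: filesystem is still flagged for a full fsck"
fi
```

If this message appears again after a completed "fsck -o full,nolog" run, the situation matches the problem described in this article.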

Cause

The problem affects only filesystems with invalid intent log entries in a Cluster File System (CFS) environment. In a CFS environment, each cluster node has its own intent log, called the per-node intent log. fsck with the "-o nolog" option does not clear the per-node intent logs correctly, so the corrupt intent log entries are left in place.

Resolution

Currently, the only workaround is to zero out the intent logs manually. Contact Veritas technical support if you encounter the same situation, where "fsck -o nolog" fails to fix a filesystem.
