# /opt/VRTS/bin/fsck /dev/vx/rdsk/adg/xvol
log replay in progress <<< fsck hangs after displaying this message
Tracing the fsck process shows that fsck keeps reading the intent log. For example, on Linux, use strace to capture what fsck is doing.
# ps -eaf | grep fsck
root 3993 22569 19 18:16 pts/0 00:00:14 /opt/VRTS/bin/fsck /dev/vx/rdsk/adg/xvol
# strace -o strace.fsck.out -p 3993
The strace output file (strace.fsck.out) shows that fsck keeps issuing pread() calls to read the following locations.
pread(4, "\4\0\0\0\234j\16\0\1\0 \0\0\0\3\0\1\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1024, 1006592) = 1024
pread(4, "\4\0\0\0\236j\16\0\1\0\240\0\2\0\3\0\1\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1024, 1008640) = 1024
pread(4, "\4\0\0\0\212j\16\0\1\0 \0\2\0\3\0\1\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 32768, 991232) = 32768
The fourth (last) argument of pread() is the byte offset to be read from the device. Convert the byte offset to a file system block number by dividing it by the file system block size. The file system block size can be obtained from the fstyp output (which reads the superblock of the file system). In the above example, the file system block size is 1024 bytes.
# /opt/VRTS/bin/fstyp -v /dev/vx/rdsk/adg/xvol
vxfs
magic a501fcf5 version 13 ctime Mon 09 Jul 2018 03:03:04 PM AEST
volguid 5813d166-8335-11e8-8dfc-c660ef7874cc
logstart 0 logend 0
bsize 1024 size 1048576 dsize 1048576 ninode 0 nau 0 <<< bsize 1024 is the file system block size
....
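As a quick check of the conversion described above, the following minimal sketch divides a pread() byte offset by the block size. The helper name offset_to_block is hypothetical and used only for illustration; the offsets and the 1024-byte block size are taken from the strace and fstyp output in this article.

```shell
# offset_to_block OFFSET BSIZE
# Print the file system block number that contains byte OFFSET,
# using integer division by the file system block size.
offset_to_block() {
    echo $(( $1 / $2 ))
}

offset_to_block 1006592 1024   # prints 983
offset_to_block 991232 1024    # prints 968
```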
Use the following command to list the locations read by fsck.
# grep pread strace.fsck.out | awk '{print $(NF-2)}' | sed 's/)//' | sort -bn | uniq -c | awk '{print $NF/1024, $0}'
968 12811 991232 <<< 968 is the fs block location, 12811 is the number of pread() calls, 991232 is the byte offset
973 12811 996352
980 12810 1003520
983 12811 1006592
985 12811 1008640
986 12811 1009664
We can use ncheck to find out what kind of file system data the above file system blocks hold.
# /opt/VRTS/bin/ncheck -o block=968-986 /dev/vx/rdsk/adg/xvol
/dev/vx/rdsk/adg/xvol:
sectors(37) blocks(19)
----------------- -----------------
0/1936-0/1972 0/968-0/986
fileset fset mtch match devid/
name indx inode indx inode blocks name
---------- ---- ------ ---- ------ ------------- ------------------
STRUCTURAL 1 9 - 41 0/968-0/986
STRUCTURAL 1 41 - 9 0/968-0/986
This confirms that fsck is looping while processing the intent log.
One possible cause of the fsck loop is that the intent log IDs are not in a continuous sequence. We can dump the intent log using the VxFS file system debugger, fsdb. For example:
# echo rfmtlog | /opt/VRTS/bin/fsdb /dev/vx/rdsk/adg/xvol > rfmtlog.out
The fsdb rfmtlog output contains records like the following.
In the above example, the first number is lgen (log generation number) and the second one is the last Group Done ID. Each time the file system is mounted, lgen is increased by 1. If the file system was mounted recently and has not seen many transactions, the intent log may not be full, and it may still contain entries that belong to the previous file system mount. We can ignore entries with a lower lgen number.
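To see which lgen is the current one, the highest lgen value in the dump can be extracted. This is only a sketch: it assumes that record header lines in rfmtlog.out have the form "<address>: id N func ... lgen M", as in the grep output shown in this article, and the function name max_lgen is hypothetical.

```shell
# max_lgen FILE
# Print the highest lgen found on record header lines of an fsdb rfmtlog dump.
# Assumption: header lines look like
#   00ec54a0: id 974481 func 103 ser 0 lser 0 len 332 lgen 4
max_lgen() {
    awk '$2 == "id" && $(NF-1) == "lgen" && $NF + 0 > max { max = $NF + 0 }
         END { print max + 0 }' "$1"
}

# Usage:
#   max_lgen rfmtlog.out
```

Entries whose lgen is lower than the printed value belong to an earlier mount and can be ignored, as described above.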
We can print the last Group Done intent log entry using the following command.
# grep --no-group-separator -B 2 -A 1 'Group Done tranid: 944796' rfmtlog.out
00ec54a0: id 974481 func 103 ser 0 lser 0 len 332 lgen 4
Transaction type MULTIDONE
Group Done tranid: 944796 <<< last Group Done intent log ID
covering 352 bytes
In the above example, fsck needs to replay the intent log transactions starting from 944797. fsck expects the intent log IDs to increase continuously, without gaps, starting from 944797. If there is any gap in the intent log entries from 944797 onward, fsck will loop.
We can use the following script (checkoutseq_rfmtlog.sh) to find out if there are any gaps in the intent log entries.
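The script itself is not included in this text. As a stand-in, here is a minimal sketch of a gap check; it is not the original checkoutseq_rfmtlog.sh. It assumes that record header lines have the form "<address>: id N func ... lgen M", that the "id" field is the intent log ID which must be continuous, and that several records may share one ID (duplicates are collapsed). The function name find_id_gaps is hypothetical.

```shell
# find_id_gaps FILE START_ID LGEN
# Report missing intent log IDs at or after START_ID among records whose
# lgen equals LGEN. Prints nothing if the IDs are continuous.
find_id_gaps() {
    # Stage 1: pull the id from header lines of the requested lgen.
    awk -v start="$2" -v lgen="$3" '
        $2 == "id" && $(NF-1) == "lgen" && $NF + 0 == lgen + 0 && $3 + 0 >= start + 0 { print $3 }
    ' "$1" |
    # Stage 2: sort numerically and drop duplicate IDs.
    sort -n -u |
    # Stage 3: walk the sorted IDs and print any range that was skipped.
    awk -v expected="$2" '
        $1 + 0 != expected + 0 { printf "gap: %d-%d missing\n", expected, $1 - 1 }
        { expected = $1 + 1 }
    '
}

# Usage, with the values from the example above:
#   find_id_gaps rfmtlog.out 944797 4
```

Any "gap:" line in the output means the IDs after the last Group Done ID are not continuous, which matches the looping condition described above.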
The issue has to be investigated by Veritas Technical Support before we can say whether the intent log can be fixed. Please collect a metasave from the file system and contact Veritas Technical Support for assistance.
https://www.veritas.com/support/en_US/article.100023751
How to collect a metasave from a mounted Veritas (VxFS) file system