VxFS fsck will loop (hang) in replaying the intent log if the intent log ID's are not in a continuous sequence

book

Article ID: 100044158

calendar_today

Updated On:

Description

Error Message

# /opt/VRTS/bin/fsck /dev/vx/rdsk/adg/xvol
log replay in progress        <<< fsck will stuck after displaying this message

Tracing the fsck process and it is found that the fsck continues to read the intent log.   For example on Linux, use strace to capture what fsck is doing.

# ps -eaf | grep fsck
root      3993 22569 19 18:16 pts/0    00:00:14 /opt/VRTS/bin/fsck /dev/vx/rdsk/adg/xvol

# strace -o strace.fsck.out -p 3993

From the strace output file (strace.fsck.out), it is found that fsck continued to issue pread() to read the following locations.

pread(4, "\4\0\0\0\234j\16\0\1\0 \0\0\0\3\0\1\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1024, 1006592) = 1024
pread(4, "\4\0\0\0\236j\16\0\1\0\240\0\2\0\3\0\1\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1024, 1008640) = 1024
pread(4, "\4\0\0\0\212j\16\0\1\0 \0\2\0\3\0\1\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 32768, 991232) = 32768

The third argument in pread() is the byte offset to be read from the device.   Converting the byte offset to the filesystem block location by dividing the byte offset by the file system block size.   The file system block size can be obtained from the fstyp output (which reads the superblock of the filesystem).  In the above example, the filesystem has a file system block size of 1024 byte.

# /opt/VRTS/bin/fstyp -v /dev/vx/rdsk/adg/xvol
vxfs
magic a501fcf5  version 13  ctime Mon 09 Jul 2018 03:03:04 PM AEST
volguid 5813d166-8335-11e8-8dfc-c660ef7874cc
logstart 0  logend 0
bsize  1024 size  1048576 dsize  1048576  ninode 0  nau 0        <<< bsize 1024 is the file system block size
....

Using the following command to get the list of locations read by fsck.

# grep pread strace.fsck.out  | awk '{print $(NF-2)}' | sed 's/)//' | sort -bn | uniq -c | awk '{print $NF/1024, $0}'
968   12811 991232         <<< 968 is the fs block location, 12811 is the number of pread() calls, 991232 is the byte offset
973   12811 996352
980   12810 1003520
983   12811 1006592
985   12811 1008640
986   12811 1009664

We can check what kind of file system data the above file system blocks are using ncheck.

# /opt/VRTS/bin/ncheck -o block=968-986 /dev/vx/rdsk/adg/xvol
/dev/vx/rdsk/adg/xvol:
sectors(37)             blocks(19)
-----------------       -----------------
0/1936-0/1972           0/968-0/986
fileset    fset        mtch  match devid/
name       indx  inode indx  inode blocks        name
---------- ---- ------ ---- ------ ------------- ------------------
STRUCTURAL    1      9    -     41 0/968-0/986          <<< intent log
STRUCTURAL    1     41    -      9 0/968-0/986  

So it is confirmed that fsck is looping in processing the intent log.

Cause

One possible cause of the looping of fsck is that the intent log ID's are not in continuous sequence.  We can dump the intent log out using the VxFS file system debugger fsdb.  For example,

# echo rfmtlog | /opt/VRTS/bin/fsdb /dev/vx/rdsk/adg/xvol > rfmtlog.out

The fsdb rfmtlog output contains records like the following.

00000000: id 944326  func 1  ser 2  lser 3  len 224  fser 1  lfser 1  lgen 4
      Partial log entry
 
00000100: id 944326  func 2  ser 3  lser 3  len 66  lgen 4
Transaction type DATRAN
directory fset 999  ilist 0  inode 104557  dev/bno 0/42234  blen 1024  boff 292
previous  d_ino 105161  d_reclen 748  d_namlen 4  d_hashnext 0000
added  d_ino 105162  d_reclen 732  d_namlen 4  d_hashnext 0000
               0  6  0  4
....
00059c60: id 944714  func 103  ser 0  lser 0  len 396  lgen 4
Transaction type MULTIDONE
Group Done tranid: 919417     <<< Group Done intent log ID
 covering 416 bytes
 
fsck will need to replay the intent log starting from the log entry following the last Group Done transaction.   We can find out the last Group Done ID by using the following command.

# grep --no-group-separator  -B 2 'Group Done tranid' rfmtlog.out | egrep 'lgen|Group Done' | awk '{print $NF}' | xargs -n 2 | sort -bn -k 1,2 | tail -1
4 944796              <<< 4 is lgen and 944796 is the last Group Done intent log ID

In the above example, the first number is lgen (log generation number) and the second one is the last Group Done ID.    Each time the file system is mounted, the lgen (log generation number) will be increased by 1.  In case where the filesystem was mounted recently and there were not many transactions, the intent log may not be full and we may still find the intent log entries belong to the previous file system mount.  We can ignore those entries with a lower lgen number.

We can print the last Group Done intent log entry using the following command.

# grep --no-group-separator  -B 2 -A 1 'Group Done tranid: 944796' rfmtlog.out
00ec54a0: id 974481  func 103  ser 0  lser 0  len 332  lgen 4
Transaction type MULTIDONE
Group Done tranid: 944796        <<< last Group Done intent log ID
 covering 352 bytes

In the above example, fsck will need to replay the intent log transaction starting from 944797.  fsck expects the intent log ID's to be increased continously without gap starting from 944797.   If there is any gap between the intent log entries starting from 944797, fsck will loop.

We can use the following script (checkoutseq_rfmtlog.sh) to find out if there ae any gaps in the intent log entries.

# cat checkoutseq_rfmtlog.sh

#!/bin/bash
#set -x
USAGE="checkoutseq "
if (($# < 1))
then
        echo "$USAGE"
        exit 1
fi
TMPFILE=$(mktemp)     # to get around the "PIPE while" issue in bash
((FIRST = 1))
((PREV = 0))
grep ': id ' $1 | awk '{print $3, $NF}' | uniq | while read id lgen
do
        if ((FIRST == 1))
        then
                echo $id $lgen
                ((PREV = id))
                PREV_LGEN=$lgen
                ((FIRST = 0))
                continue
        fi
        if [ "$lgen" != "$PREV_LGEN" ] || (( (id - PREV) != 1 ))
        then
                echo $PREV $PREV_LGEN
                echo
                echo $id $lgen
        fi
        ((PREV = id))
        PREV_LGEN=$lgen
        echo "$PREV $PREV_LGEN" > $TMPFILE
done
cat $TMPFILE

For example,
 
# checkoutseq_rfmtlog.sh rfmtlog.out.bad
944326 4
944799 4
                      <<<< breaks here; intent log ID 944800 is missing and it is greater than last Group Done ID 944796
944801 4
944801 4
                      <<<< breaks here again; intent log iD goes backward by 1 !!!
944800 4
974481 4
                      <<<< this break is okay; 974481 is the last log entry written by VxFS before overwriting the old log entry with ID 942963
942963 4
944326 4       <<<< This is the physical end of the intent log.  The log entries started to be written from the physical beginning of the log again (see the first log ID above which is also 944326).

Resolution

The issue has to be investigated by Infoscale Technical Support before we can say if the intent log can be fixed. 

Please collect a metasave from the fileystem and contact Infoscale Technote Support for assistance.

Issue/Introduction

Veritas File System (VxFS) fsck (file system check tool) will loop (hang) in replaying the intent log if the intent log ID's are not in a continuous sequence.