I/O is blocked while vxresize is growing a file system


Article ID: 100025257


Description

Error Message

The following information was collected from the system:


$ sed -e '15078,15079!d' cmdlog
# 0, 46530812, Mon Jul 11 16:19:21 2011
/usr/lib/vxvm/bin/vxresize -g NPPAP_SDG1 logs_lv +20g
 
$ cat fsdb_histlog_logs

type 10007  "resize"  
time 1310368768 221763  >>>
kernel VxFS 5.0_REV-development (AIX 6.1)
oblks 335544320  nblks 356515840
devid 0
 
Convert the timestamp from the resize record to a readable date using the adb debugger:
$ echo "0t1310368768=Y" | adb
                2011 Jul 11 16:19:28
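The same conversion can be cross-checked outside adb. The following is a minimal Python sketch, not part of the original diagnosis. It assumes the host's local time zone is UTC+9, which is inferred only from the adb output above and is not stated in the collected data. The names `LOCAL_TZ` and `epoch_to_local` are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Assumption: local time on the host is UTC+9 (inferred from the adb output,
# not stated anywhere in the collected logs).
LOCAL_TZ = timezone(timedelta(hours=9))


def epoch_to_local(epoch_seconds, tz=LOCAL_TZ):
    """Convert seconds since the Unix epoch to an aware local datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=tz)


# "time" field of the resize record in fsdb_histlog_logs.
resize_logged = epoch_to_local(1310368768)

# Timestamp of the vxresize entry in cmdlog (Mon Jul 11 16:19:21 2011).
command_logged = datetime(2011, 7, 11, 16, 19, 21, tzinfo=LOCAL_TZ)

# Same layout as the adb "=Y" output.
formatted = resize_logged.strftime("%Y %b %d %H:%M:%S")

# Seconds between the cmdlog entry and the resize record.
gap_seconds = (resize_logged - command_logged).total_seconds()

if __name__ == "__main__":
    print(formatted)
    print(gap_seconds)
```

Under that time zone assumption, the resize record is stamped a few seconds after the cmdlog entry for the vxresize command, which ties the two log entries to the same operation.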

Cause

Why is there a 40+ second "I/O pause" when growing this file system?
When a file system is grown, it is remounted as part of the operation,
        i.e. a "mount -o remount" is performed under the covers of the grow.
A remount is like an "EBUSY umount" in that it flushes all dirty pages to disk for the entire file system and then removes all file data held in memory for files in this file system. In other words, on remount VxFS flushes and invalidates all pages associated with files in this file system.
This can take some time. The file system is frozen while the remount is in progress and is thawed when the remount completes.
 

If this is CFS, the remount can take longer because the flushing and invalidating are performed cluster-wide (on all nodes that have this file system cluster-mounted).
Two things are occurring here:
- The file system is frozen while the pages are flushed and invalidated on all nodes on which it is mounted. All I/O is blocked while the file system is frozen (note: both read I/O and write I/O are blocked).
- The application files are no longer in memory, so after the grow, further read or write I/O is required to bring file data back into memory.
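The two effects described above can be illustrated with a toy model. The following Python sketch is not VxFS code and does not reflect its internals. The class `ToyFileSystem` and all of its methods are hypothetical names introduced only to show the sequence freeze, flush and invalidate, thaw, and its effect on a reader.

```python
import threading


class ToyFileSystem:
    """Toy model of a freeze/thaw cycle. Hypothetical, not VxFS internals."""

    def __init__(self, cached_pages):
        self.cache = set(cached_pages)   # pages currently held in memory
        self.dirty = set()               # pages modified but not yet on disk
        self._thawed = threading.Event()
        self._thawed.set()               # file system starts thawed

    def write(self, page):
        self._thawed.wait()              # writes block while frozen
        self.cache.add(page)
        self.dirty.add(page)

    def read(self, page):
        self._thawed.wait()              # reads block while frozen
        hit = page in self.cache
        self.cache.add(page)             # a miss brings the page back into memory
        return hit

    def freeze(self):
        self._thawed.clear()

    def flush_and_invalidate(self):
        flushed = len(self.dirty)
        self.dirty.clear()               # stand-in for writing dirty pages to disk
        self.cache.clear()               # drop all cached file data
        return flushed

    def thaw(self):
        self._thawed.set()


def demo():
    fs = ToyFileSystem(cached_pages={"a", "b"})
    fs.write("c")
    hit_before = fs.read("a")            # page is cached before the remount

    results = []
    fs.freeze()
    reader = threading.Thread(target=lambda: results.append(fs.read("a")))
    reader.start()
    reader.join(timeout=0.2)
    blocked_while_frozen = reader.is_alive()   # reader is still waiting

    flushed = fs.flush_and_invalidate()
    fs.thaw()
    reader.join()                        # reader completes only after the thaw
    return hit_before, blocked_while_frozen, flushed, results[0]


if __name__ == "__main__":
    print(demo())
```

In this model the read issued during the freeze does not complete until the thaw, and it then misses the cache because the page was invalidated. This corresponds to the two points listed above.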

Resolution

We strongly suspect that the "I/O pause" occurs while the grow operation has the file system frozen for the required remount, and that the file system stays frozen for longer because there are many pages to flush and invalidate. 40 seconds is an entirely feasible length of time for a remount. We therefore suggested that the customer perform this kind of operation when the active I/O load on the system is low.


Applies To

SFHA CFS 5.1 SP1RP1 for AIX 6.1 TL5.

The following output for the /logs file system was gathered after the vxresize command completed.

$ cat df_k | grep logs
/dev/vx/dsk/NPPAP_SDG1/logs_lv   356515840  97608778   73%   160994     1% /logs
$ cat df_os_logs | grep  files
/logs              (/dev/vx/dsk/NPPAP_SDG1/logs_lv):  195152944 blocks 26010880 files
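The figures in the resize record and in the df output can be checked against each other. The following Python sketch is a cross-check and was not part of the original analysis. It assumes that oblks and nblks are counted in 1 KiB units. That assumption is based only on nblks being equal to the total reported by df -k, and the variable names are illustrative.

```python
KIB = 1024
GIB = 1024 ** 3

# Values copied from the resize record in fsdb_histlog_logs.
oblks = 335544320   # size before the grow
nblks = 356515840   # size after the grow

# Values copied from the df -k line for /logs (total and free, in KB).
df_total_kb = 356515840
df_free_kb = 97608778

# Assumption: oblks/nblks are in 1 KiB units, since nblks equals the df -k total.
old_size_gib = oblks * KIB / GIB
new_size_gib = nblks * KIB / GIB
grown_gib = (nblks - oblks) * KIB / GIB

# Percentage of the file system in use after the grow.
percent_used = round(100 * (df_total_kb - df_free_kb) / df_total_kb)

if __name__ == "__main__":
    print(old_size_gib, new_size_gib, grown_gib, percent_used)
```

Under that unit assumption the file system grew from 320 GiB to 340 GiB, which agrees with the +20g argument passed to vxresize, and the computed usage agrees with the 73% shown by df.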
 

The system has 320 GB of memory.
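The amount of memory matters because it bounds how much file data can be cached, and therefore how many pages a remount may have to flush and invalidate. The following Python sketch shows the arithmetic only. The 4 KiB page size, the 100 GiB of cached file data, and reading 320 GB as 320 GiB are assumptions chosen for illustration. None of them was measured on the customer system.

```python
GIB = 1024 ** 3
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def pages_for(cached_bytes, page_size=PAGE_SIZE):
    """Number of pages needed to hold cached_bytes of file data."""
    return cached_bytes // page_size


def implied_pages_per_second(cached_bytes, freeze_seconds, page_size=PAGE_SIZE):
    """Page processing rate implied by a freeze of freeze_seconds."""
    return pages_for(cached_bytes, page_size) / freeze_seconds


# Upper bound on cached pages if all memory held file data
# (assumption: "320 GB" is read as 320 GiB).
max_pages = pages_for(320 * GIB)

# Hypothetical example: 100 GiB of file data for this file system in memory.
hypothetical_cached = 100 * GIB
pages = pages_for(hypothetical_cached)
rate = implied_pages_per_second(hypothetical_cached, 40)

if __name__ == "__main__":
    print(max_pages, pages, rate)
```

With these illustrative inputs a 40 second freeze corresponds to tens of millions of pages being processed, which is consistent with the view that a pause of this length is plausible on a system with this much memory.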

Issue/Introduction

According to the customer's observation, I/O to the /logs CFS file system was blocked for 40+ seconds after they executed the vxresize command on the CVM master node to grow that file system. I/O resumed after that time.