VxFS consumes a lot of CPU resources due to excessive sync(2) operations

Article ID: 100015716

Description

Error Message

For example, on the Linux platform, the "top" command may show the VxFS kernel threads (vx_worklist_thr) using a lot of CPU resources:

 

top - 12:14:18 up 6 min, 5 users, load average: 4.18, 2.18, 0.93
Tasks: 344 total, 16 running, 327 sleeping, 0 stopped, 1 zombie
%Cpu(s): 0.5 us, 45.8 sy, 0.0 ni, 53.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.3 st
KiB Mem : 8011292 total, 3342152 free, 620652 used, 4048488 buff/cache
KiB Swap: 3145724 total, 3145724 free, 0 used. 6578576 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 
965 root 20 0 0 0 0 R 10.4 0.0 0:02.99 vx_worklist_thr <<< %CPU may show a high number when the problem occurs 
968 root 20 0 0 0 0 R 10.0 0.0 0:02.47 vx_worklist_thr 
964 root 20 0 0 0 0 R 7.6 0.0 0:02.85 vx_worklist_thr 
969 root 20 0 0 0 0 R 7.2 0.0 0:02.32 vx_worklist_thr 
967 root 20 0 0 0 0 R 7.0 0.0 0:02.22 vx_worklist_thr 
974 root 20 0 0 0 0 R 7.0 0.0 0:02.50 vx_worklist_thr 
970 root 20 0 0 0 0 R 6.8 0.0 0:02.76 vx_worklist_thr 
975 root 20 0 0 0 0 R 6.6 0.0 0:02.94 vx_worklist_thr 
963 root 20 0 0 0 0 R 6.4 0.0 0:02.56 vx_worklist_thr 
966 root 20 0 0 0 0 R 6.4 0.0 0:01.96 vx_worklist_thr 
972 root 20 0 0 0 0 R 6.4 0.0 0:02.84 vx_worklist_thr 
976 root 20 0 0 0 0 R 5.2 0.0 0:02.38 vx_worklist_thr 
971 root 20 0 0 0 0 R 5.0 0.0 0:02.24 vx_worklist_thr 
973 root 20 0 0 0 0 R 4.4 0.0 0:02.17 vx_worklist_thr 
3736 root 20 0 116924 3612 1680 S 2.2 0.0 0:01.04 bash
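
To quantify the symptom, the per-thread figures can be added up. The following is a minimal sketch, assuming the default "top" batch-mode column layout shown above (%CPU in column 9, COMMAND in column 12). The function name sum_vx_cpu is illustrative only and is not part of VxFS.

```shell
# Sum the %CPU column for all vx_worklist_thr threads in "top -b" output.
# Assumes the default column order:
#   PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
sum_vx_cpu() {
    awk '$12 ~ /^vx_worklist_thr/ { sum += $9 } END { printf "%.1f\n", sum }'
}

# Example (not run here): take one batch-mode snapshot and total the VxFS threads.
#   top -b -n 1 | sum_vx_cpu
```

For the sample output above, the 14 vx_worklist_thr threads add up to 96.4 %CPU.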

 

Cause

To process a sync() system call, VxFS scans through the inode cache, which is a costly operation. If a user application issues excessive sync() calls while VxFS file systems are mounted, the VxFS sync processing thread can consume a lot of CPU.
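
One possible way to check whether a given application issues many sync() calls is to trace it with strace and count the sync(2) entries in the log. The sketch below assumes strace is installed. The function name count_sync_calls, the log path, and the placeholder some_command are illustrative and do not come from the original article.

```shell
# Count sync(2) calls in strace output read from stdin.
# Matches "sync(" only when it is not part of a longer name, so fsync(2),
# syncfs(2) and sync_file_range(2) lines are not counted.
count_sync_calls() {
    # grep -c prints 0 and exits non-zero when nothing matches; ignore that status.
    grep -cE '(^|[^[:alnum:]_])sync\(' || true
}

# Example (not run here): trace a command, following child processes, and
# count its sync(2) calls. "some_command" is a placeholder.
#   strace -f -e trace=sync -o /tmp/sync_trace.log some_command
#   count_sync_calls < /tmp/sync_trace.log
```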


Resolution

The code has been modified to combine all the sync() requests issued in the last 60 seconds into a single request.
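
The effect of this kind of coalescing can be illustrated with a small model. This is only a sketch under stated assumptions: the article does not describe how the hotfix batches requests internally, so the rule used here (a request arriving less than WINDOW seconds after the start of the current batch is merged into that batch) is an assumption, and the function name count_coalesced_syncs is hypothetical.

```shell
# Model: given a window in seconds and a list of ascending request timestamps
# (in seconds), print how many inode-cache scans would actually run if every
# request arriving within the window of the current batch is merged into it.
# This is an illustrative model, not the VxFS implementation.
count_coalesced_syncs() {
    ccs_window=$1
    shift
    ccs_batch_start=""
    ccs_scans=0
    for ccs_t in "$@"; do
        if [ -z "$ccs_batch_start" ] || [ $((ccs_t - ccs_batch_start)) -ge "$ccs_window" ]; then
            # Start a new batch: this request triggers a real scan.
            ccs_batch_start=$ccs_t
            ccs_scans=$((ccs_scans + 1))
        fi
    done
    echo "$ccs_scans"
}

# Seven requests at t = 0, 1, 2, 59, 60, 61 and 130 seconds, 60-second window:
count_coalesced_syncs 60 0 1 2 59 60 61 130   # prints 3
```

In this model, seven sync() requests result in three scans instead of seven.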

The fix is present in the following hotfixes for the Linux platform:


fs-sles12_x86_64-HotFix-6.2.1.003
fs-sles11_x86_64-HotFix-6.2.1.003
fs-rhel7_x86_64-HotFix-6.2.1.003
fs-rhel6_x86_64-HotFix-6.2.1.003

Please contact Veritas Technical Support if the above hotfixes are required.

The fix will be ported to future official VxFS patch releases on all platforms.

Workaround
=========
Until the patch is applied, avoid running programs that issue a lot of sync(2) operations in a short period of time.

For example, on Linux, the "blockdev --rereadpt" command can be used instead of the "partprobe" command. "blockdev --rereadpt" operates only on the specified OS device, so it does not issue any sync(2) system call.

Suppose a disk partition table is modified by running fdisk, and the kernel fails to re-read it:

# fdisk /dev/vdo
.....
Command (m for help): d
Partition number (1-8): 8

Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Warning: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table.
The new table will be used at the next reboot or after you run partprobe(8) or kpartx(8)
Syncing disks.

# egrep 'vdo' /proc/partitions
252 224 4194304 vdo
252 227 4190208 vdo3
252 232 4190208 vdo8


Instead of running partprobe, "blockdev --rereadpt" can be used.

# blockdev --rereadpt /dev/vdo

# egrep 'VxDMP3|vdo' /proc/partitions
252 224 4194304 vdo
252 227 4190208 vdo3


 

Issue/Introduction

When a lot of sync(2) operations are issued within a short period of time, Veritas File System (VxFS) may consume a lot of CPU resources.

Additional Information

ETrack: 3633683