ls or cp commands hang on cluster filesystem

Article ID: 100009120

Description

Error Message

The thread of the ls command process shows the following stack, which indicates that it is trying to acquire the lock on a DB file inode in order to read its attributes.

(0)> f 613
pvthread+026500 STACK:
[000D50D0]e_block_thread+000290 ()
[000A472C]delay+00012C (??)
[00014FAC].kernel_replace_gate_cstack+00002C ()
[0496FB58].vx_delay+000010 ()
[04987BB0].vx_get_ownership_try+000210 ()
[04A723F0].vx_getattr2+000368 ()
[04A75348].vx_do_getattr@AF102_71+000010 ()
[04A753AC].vx_getattr@AF103_72+00003C ()
[0497A2A4].vx_getattr_skey+000034 ()
[0057EAC0]vnop_getattr_flags+000040 (??, ??, ??)
[0059CC68]statx+000208 (??, ??, ??, ??)
[00003850]ovlya_addr_sc_flih_main+000130 ()
[D0132394]lstat64x+000014 (??, ??)
[10003A48]gstat+000128 (??, ??, ??)
[100032A8]readdirs+000128 (??)
[10001CD4]pdirectory+000074 (??, ??, ??)
[100026DC]main+0004FC (??, ??)
[10000168]__start+000068 ()
[kdb_read_mem] no real storage @ FFFFFFFFFFF97D0

 

 

At the same time, we can see ODM is trying to resize the DB file.

 

(0)> f 530
pvthread+021200 STACK:
[000D50D0]e_block_thread+000290 ()
[000D5D28]e_sleep_thread+0000E8 (??, ??, ??)
[00014F50].kernel_add_gate_cstack+000030 ()
[04994B90].vx_event_wait+0000CC ()
[049835E0].vx_async_waitmsg+00001C ()
[0498357C].vx_msg_send+0001A8 ()
[049DB47C].vx_extset_msg+00037C ()
[04B6E488].vx_cfs_extset+000040 ()
[04AD4D54].vx_extset+0002D4 ()
[04AD7BD4].vx_setext+000190 ()
[04B1BE38].vx_uioctl+0004AC ()
[04A6E720].vx_ioctl+0000D0 ()
[0497A0C0].vx_ioctl_skey+00004C ()
[00586070]vnop_ioctl+000050 (??, ??, ??, ??, ??, ??)
[00014F50].kernel_add_gate_cstack+000030 ()
[04A6E888].vx_vop_ioctl+00001C ()
[04B713D4].vx_odm_resize@AF15_6+00015C ()
[04B71024].vx_odm_resize+000030 ()
[F1000000C04435C4].odm_vx_resize+000040 ()
[F1000000C043EAAC].odm_resize+0000E8 ()
[F1000000C042D48C].vxodmioctl+00018C ()
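
If the kdb thread stacks have been saved to a text file, a quick check for the two signatures shown above might look like the following. This is only a sketch: the function name has_hang_signature and the idea of a saved dump file are assumptions made for illustration, and finding both symbols is a hint that warrants closer analysis, not proof of this issue.

```sh
#!/bin/sh
# Sketch: succeed only if a saved kdb stack dump contains both signatures
# from this article. has_hang_signature is a hypothetical helper name.
has_hang_signature() {
    dump_file=$1
    # A reader (for example ls) blocked while acquiring inode ownership
    grep -q 'vx_get_ownership_try' "$dump_file" || return 1
    # An ODM resize request in progress at the same time
    grep -q 'vx_odm_resize' "$dump_file" || return 1
    return 0
}
```

Example use, with a hypothetical file name: has_hang_signature /tmp/kdb_stacks.txt && echo "possible ODM resize stall"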

Cause

When the ODM resize operation (odm_resize in the stack above) looks for extents to grow a file on a filesystem that is badly fragmented and nearly full, the search can take a very long time to finish.

This can cause all other filesystem commands, such as ls or cp, to wait a long time before they finish. Applications trying to grow a file in the corresponding filesystem are delayed as well.

The filesystem then appears to be hung.

Resolution

The solution is to make sure the filesystem has enough space and is not fragmented.

Run extent defragmentation (fsadm -e) on the filesystem regularly.
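
A regular defragmentation run could be wrapped in a small script such as the sketch below. It assumes the VxFS version of fsadm is the first fsadm found in PATH; some platforms also require a filesystem-type option, so check the fsadm_vxfs manual page for your release. The function name defrag_extents and the DRY_RUN variable are hypothetical and exist only in this example.

```sh
#!/bin/sh
# Sketch: report, then reduce, extent fragmentation on one VxFS mount point.
# Assumption: the VxFS fsadm is the first fsadm in PATH.
# Set DRY_RUN=1 to print the commands instead of running them.
defrag_extents() {
    mount_point=$1
    run=${DRY_RUN:+echo}
    # -E: report extent fragmentation before changing anything
    $run fsadm -E "$mount_point" || return 1
    # -e: reorganize (defragment) extents, as recommended above
    $run fsadm -e "$mount_point"
}
```

Defragmentation generates extra I/O, so such a job is usually scheduled outside peak hours.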

 

 

Applies To

Cluster filesystems that are badly fragmented and are used by applications that auto-grow files in the filesystem.

Issue/Introduction

The ls -l or cp commands hang on a CFS (cluster filesystem). This issue occurs only under the following conditions:

1. The filesystem is very badly fragmented.
2. The filesystem is nearly full (above 80% usage).
3. A large file is being created, or an application is growing a file in the filesystem, for example Oracle auto-growing its datafiles.
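
The usage condition can be checked with a short filter over POSIX-format df output, sketched below. The function name over_threshold is hypothetical, and the filter assumes mount points contain no spaces. It reports only the usage condition and says nothing about fragmentation.

```sh
#!/bin/sh
# Sketch: print mount points whose usage is above a limit (default 80%).
# Reads `df -P` output on stdin; over_threshold is a hypothetical helper name.
over_threshold() {
    limit=${1:-80}
    awk -v limit="$limit" 'NR > 1 {
        pct = $5                  # Capacity column, for example "90%"
        sub(/%/, "", pct)         # strip the percent sign
        if (pct + 0 > limit + 0) print $6   # Mounted-on column
    }'
}
```

Example use: df -P | over_threshold 80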

Additional Information

ETrack: 2871901