Systems running Veritas File System (VxFS) 6.0.1/6.0.3 may panic due to kernel stack overflow


Article ID: 100010912


Description

Error Message

The system dump shows the following stack trace:


#0 [ffff883ff8fdc110] machine_kexec at ffffffff81035c0b
#1 [ffff883ff8fdc170] crash_kexec at ffffffff810c0dd2
#2 [ffff883ff8fdc240] oops_end at ffffffff81511680
#3 [ffff883ff8fdc270] no_context at ffffffff81046bfb
#4 [ffff883ff8fdc2c0] __bad_area_nosemaphore at ffffffff81046e85
#5 [ffff883ff8fdc310] bad_area at ffffffff81046fae
#6 [ffff883ff8fdc340] __do_page_fault at ffffffff81047760
#7 [ffff883ff8fdc460] do_page_fault at ffffffff815135ce
#8 [ffff883ff8fdc490] page_fault at ffffffff81510985
[exception RIP: print_context_stack+173]
RIP: ffffffff8100f4dd RSP: ffff883ff8fdc548 RFLAGS: 00010006
RAX: 00000010ffffffff RBX: ffff883ff8fdc6d0 RCX: 0000000000002755
RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000046
RBP: ffff883ff8fdc5a8 R8: 000000000002072c R9: 00000000fffffffb
R10: 0000000000000001 R11: 000000000000000c R12: ffff883ff8fdc648
R13: ffff883ff8fdc000 R14: ffffffff81600460 R15: 0000000000000000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#9 [ffff883ff8fdc540] print_context_stack at ffffffff8100f4d1
#10 [ffff883ff8fdc5b0] dump_trace at ffffffff8100e4a0
#11 [ffff883ff8fdc650] show_trace_log_lvl at ffffffff8100f245
#12 [ffff883ff8fdc680] show_trace at ffffffff8100f275
#13 [ffff883ff8fdc690] dump_stack at ffffffff8150d3ca
#14 [ffff883ff8fdc6d0] warn_slowpath_common at ffffffff8106e2e7
#15 [ffff883ff8fdc710] warn_slowpath_null at ffffffff8106e33a
#16 [ffff883ff8fdc720] hrtick_start_fair at ffffffff810575eb
#17 [ffff883ff8fdc750] pick_next_task_fair at ffffffff81064a00
#18 [ffff883ff8fdc7a0] schedule at ffffffff8150d908
#19 [ffff883ff8fdc860] __cond_resched at ffffffff81064d6a
#20 [ffff883ff8fdc880] _cond_resched at ffffffff8150e550
#21 [ffff883ff8fdc890] vx_nalloc_getpage_lnx at ffffffffa041afd5 [vxfs]
#22 [ffff883ff8fdca80] vx_nalloc_getpage at ffffffffa03467a3 [vxfs]
#23 [ffff883ff8fdcbf0] vx_do_getpage at ffffffffa034816b [vxfs]
#24 [ffff883ff8fdcdd0] vx_do_read_ahead at ffffffffa03f705e [vxfs]
#25 [ffff883ff8fdceb0] vx_read_ahead at ffffffffa038ed8a [vxfs]
#26 [ffff883ff8fdcfc0] vx_do_getpage at ffffffffa0347732 [vxfs]
#27 [ffff883ff8fdd1a0] vx_getpage1 at ffffffffa034865d [vxfs]
#28 [ffff883ff8fdd2f0] vx_fault at ffffffffa03d4788 [vxfs]
#29 [ffff883ff8fdd400] __do_fault at ffffffff81143194
#30 [ffff883ff8fdd490] handle_pte_fault at ffffffff81143767
#31 [ffff883ff8fdd570] handle_mm_fault at ffffffff811443fa
#32 [ffff883ff8fdd5e0] __get_user_pages at ffffffff811445fa
#33 [ffff883ff8fdd670] get_user_pages at ffffffff81144999
#34 [ffff883ff8fdd690] vx_dio_physio at ffffffffa041d812 [vxfs]
#35 [ffff883ff8fdd800] vx_dio_rdwri at ffffffffa02ed08e [vxfs]
#36 [ffff883ff8fdda20] vx_write_direct at ffffffffa044f490 [vxfs]
#37 [ffff883ff8fddaf0] vx_write1 at ffffffffa04524bf [vxfs]
#38 [ffff883ff8fddc30] vx_write_common_slow at ffffffffa0453e4b [vxfs]
#39 [ffff883ff8fddd30] vx_write_common at ffffffffa0454ea8 [vxfs]
#40 [ffff883ff8fdde00] vx_write at ffffffffa03dc3ac [vxfs]
#41 [ffff883ff8fddef0] vfs_write at ffffffff81181078
#42 [ffff883ff8fddf30] sys_pwrite64 at ffffffff81181a32
#43 [ffff883ff8fddf80] system_call_fastpath at ffffffff8100b072

 

Cause

This issue is tracked under the etrack incident number listed in the Additional Information section.

During direct I/O writes, user-space pages must be pre-faulted through the __get_user_pages() code path. This code path is very deep and can consume a large amount of kernel stack space. The resulting stack overflow corrupts the thread_info structure, which is located at the lowest address of the kernel stack.

When the kernel later references the corrupted thread_info structure through the scheduler, the system panics.
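
The stack depth described above can be estimated from the backtrace itself. The following is a minimal Python sketch, not a Veritas tool. It assumes that the bracketed value on each frame line is the stack address of that frame, and that kernel stacks are 8 KiB and aligned to their size (typical of x86_64 kernels of this era, but an assumption here). The names parse_frames and stack_usage are hypothetical helpers introduced for illustration.

```python
import re

# Assumption: 8 KiB, size-aligned kernel stacks (x86_64, 2.6.32-era kernels).
THREAD_SIZE = 8192

# Matches backtrace frame lines such as:
#   #21 [ffff883ff8fdc890] vx_nalloc_getpage_lnx at ffffffffa041afd5 [vxfs]
FRAME_RE = re.compile(r"#(\d+) \[([0-9a-f]{16})\] (\S+) at")


def parse_frames(text):
    """Return (index, stack_address, symbol) for each frame line in text."""
    return [(int(n), int(addr, 16), sym) for n, addr, sym in FRAME_RE.findall(text)]


def stack_usage(frames, thread_size=THREAD_SIZE):
    """Estimate how much of the kernel stack the backtrace occupies."""
    outermost = max(addr for _, addr, _ in frames)
    # The stack is size-aligned, so masking gives its lowest address,
    # which is where thread_info is stored.
    base = outermost & ~(thread_size - 1)
    top = base + thread_size
    lowest = min(addr for _, addr, _ in frames)
    return {"base": base, "used": top - lowest, "free": lowest - base}


# Three frames taken from the stack trace in this article.
sample = """
#0 [ffff883ff8fdc110] machine_kexec at ffffffff81035c0b
#21 [ffff883ff8fdc890] vx_nalloc_getpage_lnx at ffffffffa041afd5 [vxfs]
#43 [ffff883ff8fddf80] system_call_fastpath at ffffffff8100b072
"""

usage = stack_usage(parse_frames(sample))
print(hex(usage["base"]), usage["used"], usage["free"])
```

Under these assumptions, the computed base is ffff883ff8fdc000, which matches the value in R13 in the register dump, and the trace spans 7920 of the 8192 bytes. Only 272 bytes separate the deepest frame from the bottom of the stack, where thread_info resides.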

 


 

 

Resolution

Veritas has made changes to reduce the kernel stack consumption in this code path to avoid stack overflow.

This issue is fixed in the private hotfix patch VxFS 6.0.3.103.

On RHEL 6, the issue is fixed from the VxFS 6.0.3.102 patch onwards.

Please contact Veritas Technical Support to obtain this private hotfix patch.
 

 


Applies To

RHEL 5/6 and SLES 10/11 systems running SFHA version 6.0.1 or 6.0.3.

 

Issue/Introduction

Systems running Veritas File System (VxFS) version 6.0.1 or 6.0.3 may panic due to a kernel stack overflow, with the stack trace shown in the Error Message section.

Additional Information

ETrack: 3331419