The system panicked with a VxFS or Veritas Oracle Disk Manager (ODM) kernel function in the stack.
In one customer case the system panicked after the following operations:
The panic shows the following kernel stack.
vx_dio_physio
vx_dio_rdwri
fdd_write_end
fdd_rw
fdd_odm_rw
odm_vx_io
odm_io_start
odm_io_req
odm_io
odm_io_stat
odm_ioctl_ctl
odm_ioctl_ctl_unlocked
vfs_ioctl
do_vfs_ioctl
sys_ioctl
system_call_fastpath
In another customer case, the system panic shows the following kernel stack.
vx_dev_strategy
vx_snap_strategy
vx_io_startnowait
vx_nalloc_getpage_lnx
vx_do_getpage
vx_do_read_ahead
vx_read_ahead
vx_do_getpage
vx_getpage1
vx_fault
__do_fault
handle_pte_fault
handle_mm_fault
__get_user_pages
get_user_pages
vx_dio_physio
vx_dio_rdwri
vx_write_direct
vx_write1
vx_write_common_slow
vx_write_common
vx_write
vfs_write
sys_pwrite64
system_call_fastpath
On RHEL (Red Hat Enterprise Linux) 6 systems up to and including RHEL 6.6, the kernel thread stack is limited to 8KB. Starting with RHEL 6.7, the limit is 16KB. Each time a kernel thread calls a kernel function, the call consumes some of the thread's stack space, which is released when the function returns to its caller. When the combined stack space used by all the functions in the call chain exceeds the limit, the stack overflows and the system panics.
In some situations, a kernel thread executing VxFS kernel functions may already have used a large portion of its stack space. When this thread then calls the lower layer functions, only a small amount of stack space is left. If the lower layer functions need more than this remaining space, the stack overflows.
Upgrade RHEL to 6.7 or later (on RHEL 6) or to 7.1 or later (on RHEL 7). These releases extend the kernel stack size to 16KB, which eliminates almost all common stack overflow issues.
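To check whether a host already meets the release level recommended above, something like the following sketch may help. The function name meets_recommended_rhel is hypothetical, the version is parsed from /etc/redhat-release, and treating any major release newer than 7 as acceptable is an assumption rather than something this article states.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given RHEL version string (for
# example "6.6" or "7.2") is 6.7 or later in the 6.x line, or 7.1 or later
# in the 7.x line. Major releases above 7 are ASSUMED to be acceptable.
meets_recommended_rhel() {
    major=${1%%.*}
    case "$1" in
        *.*) minor=${1#*.}; minor=${minor%%.*} ;;
        *)   minor=0 ;;
    esac
    case "$major" in
        6) [ "$minor" -ge 7 ] ;;
        7) [ "$minor" -ge 1 ] ;;
        *) [ "$major" -gt 7 ] ;;
    esac
}

# Report on the local system, if a Red Hat style release file is present.
if [ -r /etc/redhat-release ]; then
    ver=$(sed -n 's/.*release \([0-9][0-9.]*\).*/\1/p' /etc/redhat-release)
    if [ -n "$ver" ] && meets_recommended_rhel "$ver"; then
        echo "Release $ver meets the recommended minimum."
    else
        echo "Release ${ver:-unknown} is below the recommended minimum."
    fi
fi
```

The check only compares release numbers. It does not measure the actual kernel stack size of the running kernel.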
If you cannot yet upgrade to one of these RHEL versions, ensure that the latest Veritas Storage Foundation (SF) patches are installed, especially for SF 6.0.5. Known incidents in SF 6.0.5 increase kernel stack usage and make a stack overflow more likely. Also change the Linux kernel I/O scheduler to deadline. Refer to the following related article for details.
Article 000024448 - Linux Completely Fair Queuing (CFQ) I/O Scheduler configured on a system running SF may cause system panic due to kernel task stack overflow
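As a rough illustration of the scheduler change, the sketch below reads the active I/O scheduler for one block device from sysfs. The device name sda and the helper name active_scheduler are assumptions for illustration only; the related article remains the authoritative procedure.

```shell
#!/bin/sh
# Hypothetical helper: print the active scheduler from a sysfs scheduler
# line such as "noop anticipatory deadline [cfq]". The kernel marks the
# active scheduler with square brackets.
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# ASSUMPTION: sda is a placeholder; use the disks that back your volumes.
DEV=${DEV:-sda}
SCHED_FILE="/sys/block/$DEV/queue/scheduler"

if [ -r "$SCHED_FILE" ]; then
    echo "$DEV: $(active_scheduler "$(cat "$SCHED_FILE")")"
    # To switch at run time (as root, not persistent across reboots):
    #   echo deadline > "$SCHED_FILE"
    # On RHEL 6 and 7 the setting is commonly made persistent by adding
    # elevator=deadline to the kernel boot parameters.
fi
```

The run-time change applies to one device at a time, so it would need to be repeated for every relevant disk.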
Two VxFS kernel module parameters can be used to resolve the two panics described in this article. When these parameters are configured, VxFS hands the I/O off to another thread before submitting it to VxVM (Volume Manager) if there is not sufficient stack space left. These are not run-time parameters: they can be set only at module load time, so they take effect only after the VxFS module is unloaded and reloaded, or after the system is rebooted.
The following two module parameters need to be configured for this solution: vxfs_io_proxy_vxvm and vxfs_io_proxy_level.
Set the above VxFS kernel parameters using the vxfs.conf file as follows:
1. Create a vxfs.conf file inside the /etc/modprobe.d directory.
touch /etc/modprobe.d/vxfs.conf
2. Copy the following lines into the vxfs.conf file.
options vxfs vxfs_io_proxy_vxvm=1
options vxfs vxfs_io_proxy_level=6144
The change will take effect when the system is rebooted, or when the VxFS module is unloaded and reloaded.
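The two steps above can also be scripted. The following is a minimal sketch, assuming it runs as root and that no other settings in an existing vxfs.conf need to be preserved, because the file is overwritten. The function name write_vxfs_conf is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write the two options lines from step 2 to a
# modprobe configuration file (default: /etc/modprobe.d/vxfs.conf).
# WARNING: this overwrites any existing content of the target file.
write_vxfs_conf() {
    conf=${1:-/etc/modprobe.d/vxfs.conf}
    cat > "$conf" <<'EOF'
options vxfs vxfs_io_proxy_vxvm=1
options vxfs vxfs_io_proxy_level=6144
EOF
}

# Usage (as root):
#   write_vxfs_conf
#   cat /etc/modprobe.d/vxfs.conf
#
# After the module has been reloaded or the system rebooted:
#   lsmod | grep vxfs      # confirm the vxfs module is loaded
#   modinfo -p vxfs        # list the parameters the module accepts
# Whether the current values are also readable under
# /sys/module/vxfs/parameters/ depends on how the module exports them
# (an assumption, not confirmed by this article).
```

Passing a different path as the first argument, for example a temporary file, lets you inspect the generated content before placing it in /etc/modprobe.d.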
VxFS (Veritas File System) is involved in a kernel panic caused by a stack overflow. VxFS detects the stack overflow immediately after the submitted I/O returns from the lower layer. The overflow does not occur in VxFS itself, but somewhere in the lower layers.