1. Locate the vmcore file under /var/crash.
ls -al /var/crash/127.0.0.1-2009-03-11-09\:27/
total 886280
drwx------ 2 root root 4096 Mar 11 09:32 .
drwxr-xr-x 5 netdump netdump 4096 Mar 11 09:31 ..
-rw------- 1 root root 2147127296 Mar 11 09:32 vmcore
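When several dumps have accumulated, the search in step 1 can be scripted. The following is a minimal sketch, assuming each dump keeps the default file name vmcore as in the listing above. The function name find_vmcores is hypothetical.

```shell
#!/bin/sh
# find_vmcores DIR: print every regular file named "vmcore" below DIR.
# DIR defaults to /var/crash. Hypothetical helper; assumes the dump file
# has not been renamed.
find_vmcores() {
    find "${1:-/var/crash}" -type f -name vmcore 2>/dev/null | sort
}
```

Running find_vmcores /var/crash as root should print one path per dump directory.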
Note: The symbols are included in the debug kernel, usually in the kernel-debuginfo rpm.
In Linux, the System.map file is a symbol table used by the kernel. A symbol table is a look-up between symbol names and their addresses in memory. A symbol name may be the name of a variable or the name of a function. The System.map is required when the address of a symbol name, or the symbol name of an address, is needed. It is especially useful for debugging kernel panics and kernel oops. The kernel does the address-to-name translation itself when CONFIG_KALLSYMS is enabled so that tools like ksymoops are not required.
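To illustrate the look-up described above, here is a small sketch that searches a System.map file in both directions. It assumes the usual three-column layout (address, symbol type, symbol name). The function names sym_to_addr and addr_to_sym are hypothetical, and the address look-up only finds exact symbol start addresses, not addresses inside a function.

```shell
#!/bin/sh
# Each System.map line is assumed to hold: address, symbol type, symbol name.

# sym_to_addr MAPFILE NAME: print the address recorded for symbol NAME.
sym_to_addr() {
    # Appending "" forces a string comparison in awk.
    awk -v name="$2" '($3 "") == (name "") { print $1 }' "$1"
}

# addr_to_sym MAPFILE ADDRESS: print the symbol that starts exactly at ADDRESS.
addr_to_sym() {
    awk -v addr="$2" '($1 "") == (addr "") { print $3 }' "$1"
}
```

For example, sym_to_addr System.map do_page_fault would print the start address of do_page_fault for that kernel build.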
2. Find out which kernel the customer is running. From the VRTSexplorer output, you can read the kernel release from the uname_a file:
$ awk '{print $3}' uname_a
2.6.9-67.ELsmp
Note: You can also check /etc/redhat-release, but this file may not be updated when a patched kernel is installed.
$ cat /etc/redhat-release
Red Hat Enterprise Linux AS release 4 (Nahant Update 6)
3. Locate a system with the same architecture (32-bit or 64-bit), or use the original system, to review the core. The crash RPM is required to review the core file. Use the following command to verify that crash is installed.
# rpm -aq | grep crash
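The rpm query only reports a crash that was installed as a package. As a complementary check, the sketch below reports which of the tools used in this procedure are not found in PATH. The function name missing_tools is hypothetical.

```shell
#!/bin/sh
# missing_tools TOOL...: print the name of each TOOL that is not found in PATH.
# Prints nothing when every tool is available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

For this procedure the call would be missing_tools crash rpm2cpio cpio, where empty output means all three are available.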
4. Locate the debug kernel. For RHEL you can get more information from the following site.
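Example (for a RHEL 6 kernel; use the package that matches your own kernel release):
This is the debug kernel rpm. The vmlinux that corresponds to the kernel can be extracted from it.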
kernel-debuginfo-2.6.32-696.1.1.el6.x86_64.rpm
Download the correct debug kernel files from the vendor directly. Be sure to match the correct kernel architecture and select the exact debug kernel for your kernel version.
5. Once the debug kernel rpm has been downloaded, extract it. The following commands extract the rpm contents into the current directory rather than installing the package.
To list the path of vmlinux inside the rpm:
# rpm2cpio <>.rpm | cpio -it | grep vmlinux
To extract the files from the rpm:
# rpm2cpio <>.rpm | cpio -idmv
The debug vmlinux will be at the following path, relative to the directory where you extracted the rpm.
./usr/lib/debug/lib/modules/2.6.9-67.ELsmp/vmlinux
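Before starting crash, it can help to confirm that the extraction produced a vmlinux for the release found in step 2. The sketch below assumes the directory layout shown above. The function names vmlinux_path and have_vmlinux are hypothetical.

```shell
#!/bin/sh
# vmlinux_path ROOT RELEASE: print where the debug vmlinux for RELEASE is
# expected below ROOT, the directory the rpm was extracted into.
vmlinux_path() {
    # ${1%/} drops one trailing slash so the result has no "//".
    printf '%s/usr/lib/debug/lib/modules/%s/vmlinux\n' "${1%/}" "$2"
}

# have_vmlinux ROOT RELEASE: succeed only if that file exists.
have_vmlinux() {
    [ -f "$(vmlinux_path "$1" "$2")" ]
}
```

For example, have_vmlinux . 2.6.9-67.ELsmp || echo "debug kernel does not match" flags an rpm that was downloaded for a different kernel release.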
6. Now that the debug kernel has been extracted, run the crash utility to review the core.
Crash takes the following arguments:
crash [mapfile] namelist dumpfile
mapfile = System.map (optional; only needed when the namelist is not the exact kernel that produced the dump)
namelist = debug kernel (vmlinux)
dumpfile = vmcore
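The example in this step passes only the namelist and the dumpfile, so the System.map argument can be left out when the debug vmlinux matches the crashed kernel. The sketch below prints the command line for either case so it can be reviewed before it is run. The function name crash_args is hypothetical, and the printed line is for display only (paths containing spaces are not quoted).

```shell
#!/bin/sh
# crash_args NAMELIST DUMPFILE [MAPFILE]: print the crash command line.
# The optional System.map is placed first, as in the synopsis above.
crash_args() {
    if [ -n "${3:-}" ]; then
        printf 'crash %s %s %s\n' "$3" "$1" "$2"
    else
        printf 'crash %s %s\n' "$1" "$2"
    fi
}
```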
Example:
# crash /root/tmp2/usr/lib/debug/lib/modules/2.6.9-67.ELsmp/vmlinux vmcore_case_id_281575592
crash 4.0-2.30
Copyright(C) 2002, 2003, 2004, 2005, 2006 Red Hat, Inc.
Copyright(C) 2004, 2005, 2006 IBM Corporation
Copyright(C) 1999-2006 Hewlett-Packard Co
Copyright(C) 2005 Fujitsu Limited
Copyright(C) 2005 NEC Corporation
Copyright(C) 1999, 2002 Silicon Graphics, Inc.
Copyright(C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
KERNEL: /root/tmp2/usr/lib/debug/lib/modules/2.6.9-67.ELsmp/vmlinux
DUMPFILE: vmcore_case_id_281575592
CPUS: 4
DATE: Mon Apr 20 12:04:06 2009
UPTIME: 32 days, 02:45:47
LOADAVERAGE: 2.00, 2.00, 2.00
TASKS: 673
NODENAME: exxxx
RELEASE: 2.6.9-67.ELsmp
VERSION: #1 SMP Wed Nov 7 13:56:44 EST 2007
MACHINE: x86_64 (2833 Mhz)
MEMORY: 32.7GB
PANIC: "Oops: 0000 [1] SMP " (check log for details)
PID: 12446
COMMAND: "MpxPeriodicCall"
TASK: 10810d46030 [THREAD_INFO: 1080a6f2000]
CPU: 1
STATE: TASK_RUNNING (PANIC)
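Fields from this summary, such as RELEASE and PANIC, are often copied into case notes. The sketch below pulls a single field out of saved output, assuming the summary has been written to a file (for example by redirecting the output of the crash sys command, which prints the system fields). The function name summary_field is hypothetical, and the key is assumed to contain no regular-expression metacharacters.

```shell
#!/bin/sh
# summary_field FILE KEY: print the value that follows "KEY:" in saved crash
# output. Leading blanks before KEY and blanks after the colon are ignored.
# Assumption: KEY contains no regular-expression metacharacters.
summary_field() {
    sed -n "s/^[[:space:]]*$2:[[:space:]]*//p" "$1"
}
```

A quick sanity check is to compare summary_field sys.out RELEASE (sys.out being a hypothetical file name) with the release found in step 2.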
The following pertinent information can be collected with the crash utility.
a. Gather a backtrace of the panic thread.
# echo bt | crash -s vmlinux vmcore
Example:
# echo bt | crash -s /data/debugs/2.6.9-42.ELsmp/vmlinux vmcore
PID: 11469 TASK: 1005c906030 CPU: 2 COMMAND: "umount"
#0[10050407b10] start_disk_dump at ffffffffa0a3136d
#1[10050407b40] try_crashdump at ffffffff8014bd01
#2[10050407b50] do_page_fault at ffffffff80124143
#3[10050407bb0] vx_free at ffffffffa031bc1d
#4[10050407c30] error_exit at ffffffff80110d91
[exception RIP: vxg_api_range_unlockwf+7]
RIP:ffffffffa0bdaefb RSP: 0000010050407ce8 RFLAGS: 00010246
RAX:ffffffffa0bf2920 RBX: 0000000000000000 RCX:000001000000d000
RDX:00000100573bad10 RSI: 0000000000000000 RDI:0000000000000000
RBP:000001006477b000 R8: 0000000300000000 R9:0000000300000000
R10:0000010053f5c648 R11: 0000000000000000 R12:0000000000000000
R13:0000000000000001 R14: 000001007506b800 R15:00000000ffffffff
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#5[10050407d00] vx_glm_range_unlock at ffffffffa02cc729
#6[10050407d10] vx_glmrange_rangeunlock at ffffffffa02c9cf8
#7[10050407d20] vx_imntunlock at ffffffffa026633a
#8[10050407d30] vx_unmount at ffffffffa035b120
#9[10050407d80] generic_shutdown_super at ffffffff8017eb51
#10[10050407da0] kill_block_super at ffffffff8017f979
#11[10050407dc0] vx_kill_sb at ffffffffa0359c50
#12[10050407de0] atomic_dec_and_lock at ffffffff801ec9b5
#13[10050407e00] deactivate_super at ffffffff8017ea72
#14[10050407e20] umount_tree at ffffffff80193e59
#15[10050407e50] error_exit at ffffffff80110d91
#16[10050407e60] vfs_quota_off at ffffffff801a7a4a
#17[10050407e90] __down_write at ffffffff8030ad00
#18[10050407ed0] sys_umount at ffffffff80194251
#19[10050407ef0] sys_newstat at ffffffff80181d4c
#20[10050407f50] error_exit at ffffffff80110d91
RIP: 0000003dbebc6999 RSP: 0000007fbffff490 RFLAGS: 00010206
RAX: 00000000000000a6 RBX: ffffffff8011026a RCX:0000002a955ad218
RDX: 0000000000000008 RSI: 0000000000000002 RDI:0000000000610e90
RBP: 0000000000610e60 R8: 000000000000006e R9:0000003dbed2d8c0
R10: 0000000000000000 R11: 0000000000000202 R12:0000000000610e90
R13: 0000000000610eb0 R14: 0000000000000001 R15:0000007fbffff5b0
ORIG_RAX: 00000000000000a6 CS: 0033 SS: 002b
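For a quick summary of a long backtrace, the function name of each frame can be extracted from saved bt output. This is a minimal sketch that assumes every frame line starts with "#" and a frame number and contains the word "at" between the function name and the address, as in the example above. The function name bt_functions is hypothetical.

```shell
#!/bin/sh
# bt_functions FILE: print the function name of each stack frame, in order.
# A frame line looks like "#3[10050407bb0] vx_free at ffffffffa031bc1d";
# the function name is the word just before "at". Register dumps and the
# exception line do not start with "#" and are skipped.
bt_functions() {
    awk '/^[ \t]*#[0-9]+/ {
        for (i = 2; i <= NF; i++)
            if ($i == "at") { print $(i - 1); break }
    }' "$1"
}
```

Used as bt_functions bt.out, where bt.out is a hypothetical file holding the output of the bt command, this prints one function per line starting from the innermost frame.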
b. Gather a backtrace of all processes and redirect it to a file called tlist.out
# echo "foreach bt" | crash -s /data/debugs/2.6.9-42.ELsmp/vmlinux vmcore > tlist.out
c. Gather the processes running at the time of the crash.
# echo "ps -t" | crash -s /data/debugs/2.6.9-42.ELsmp/vmlinux vmcore > ps_t.out
d. Gather the kernel message buffer from before the crash. The crash command that prints it is log.
# echo "log" | crash -s /data/debugs/2.6.9-42.ELsmp/vmlinux vmcore > msgbuf.out
e. Gather a list of file systems mounted at the time of the crash.
# echo "mount" | crash -s /data/debugs/2.6.9-42.ELsmp/vmlinux vmcore > mounts.out
f. Gather a list of loaded modules.
# echo "mod" | crash -s /data/debugs/2.6.9-42.ELsmp/vmlinux vmcore > modules.out
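The collection steps above can be wrapped in one small script. This is a sketch under stated assumptions: the function name collect_all, the CRASH variable, and the bt.out file name are hypothetical, and the crash log command is used to dump the kernel message buffer. Each command starts crash separately, so a large dump is loaded once per command.

```shell
#!/bin/sh
# crash binary to run; hypothetical override hook for testing.
CRASH=${CRASH:-crash}

# collect_all VMLINUX VMCORE OUTDIR: run each collection command and save
# its output in OUTDIR.
collect_all() {
    vmlinux=$1
    vmcore=$2
    outdir=$3
    mkdir -p "$outdir" || return 1
    # "crash command|output file" pairs. bt.out is an assumed name; the
    # other file names follow steps b to f. "log" prints the message buffer.
    for pair in "bt|bt.out" "foreach bt|tlist.out" "ps -t|ps_t.out" \
                "log|msgbuf.out" "mount|mounts.out" "mod|modules.out"; do
        cmd=${pair%%|*}   # text before the "|"
        out=${pair#*|}    # text after the "|"
        echo "$cmd" | "$CRASH" -s "$vmlinux" "$vmcore" > "$outdir/$out"
    done
}
```

For example, collect_all /data/debugs/2.6.9-42.ELsmp/vmlinux vmcore ./crash_out writes the six output files into ./crash_out.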