Server reboots by itself due to kernel mismatch

Article ID: 100011168

Description

Error Message

Nov 10 22:04:27 kernel: sr 370:0:0:0: Attached scsi generic sg137 type 5
Nov 10 22:06:06 kernel: hub 4-0:1.0: over-current change on port 2
Nov 10 22:06:06 kernel: hub 1-0:1.0: over-current change on port 6
Nov 10 22:06:14 kernel: usb 1-6.4: USB disconnect, address 112
Nov 10 22:34:33 syslogd 1.4.1: restart.
Nov 10 22:34:33 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Nov 10 22:34:33 kernel: Linux version 2.6.18-274.17.1.el5 (mockbuild@hs20-bc2-5.build.redhat.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-51)) #1 SMP Wed Jan 4 22:45:44 EST 2012
Nov 10 22:34:33 kernel: Command line: ro root=/dev/vg00/rootvol01 rhgb quiet crashkernel=256M@16M
Nov 10 22:34:33 kernel: BIOS-provided physical RAM map:
 

Cause

The VxVM and VxFS kernel modules did not match the OS kernel.
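As a rough way to spot such a mismatch, the running kernel release can be compared with the kernel release a module was built for (the first word of its `vermagic` string). The sketch below is illustrative only: the function names are hypothetical, and the module names `vxio` (VxVM) and `vxfs` (VxFS) are assumptions about what is installed. A difference is a hint to investigate rather than proof of a fault, since the installer may deliberately select compatible modules built for another kernel.

```shell
# kernel_matches RUNNING_KERNEL VERMAGIC
# Succeeds when the first word of VERMAGIC equals RUNNING_KERNEL.
kernel_matches() {
    built_for=${2%% *}          # keep everything before the first space
    [ "$1" = "$built_for" ]
}

# check_vx_modules (hypothetical helper, not part of the product)
# Reports any listed module whose vermagic differs from the running kernel.
# The module names vxio and vxfs are assumptions.
check_vx_modules() {
    running=$(uname -r)
    for mod in vxio vxfs; do
        vermagic=$(modinfo -F vermagic "$mod" 2>/dev/null) || continue
        kernel_matches "$running" "$vermagic" ||
            echo "$mod: built for '$vermagic', running kernel is $running"
    done
}
```

Running `check_vx_modules` on the affected host prints one line per module that needs a closer look, and nothing if the releases agree.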

Looking at the kernel log, the server appears to have had a hardware problem with either its USB hub or the ports on the hub. The pattern below repeats over and over in the log:

hub 4-0:1.0: over-current change on port 2
hub 1-0:1.0: over-current change on port 6
usb 1-6.4: USB disconnect, address 111
usb 1-6: reset high speed USB device using ehci_hcd and address 3
usb 1-6.4: new high speed USB device using ehci_hcd and address 112
usb 1-6.4: configuration #1 chosen from 1 choice

usb-storage: device found at 112
usb-storage: waiting for device to settle before scanning
input: USB HID v1.11 Device [OEM Mass Storage plus] on usb-0000:00:1d.7-6.4
usb-storage: device scan complete
 
So, the kernel was configuring and un-configuring these devices over and over again.
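To gauge how often the pattern repeats, the over-current messages can be tallied per hub and port. This is a minimal sketch, assuming the kernel messages are in a syslog-style file such as /var/log/messages; the function name `count_overcurrent` is hypothetical.

```shell
# count_overcurrent LOGFILE
# Prints "<count> hub <id> port <n>" for each hub/port that logged
# an over-current change in LOGFILE.
count_overcurrent() {
    sed -n 's/.*\(hub [^ ]*\): over-current change on port \([0-9][0-9]*\).*/\1 port \2/p' "$1" |
        sort | uniq -c |
        awk '{ print $1, $2, $3, $4, $5 }'   # drop the padding uniq -c adds
}
```

For example, `count_overcurrent /var/log/messages` lists every affected port along with the number of events logged for it.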
 
The cache object that the USB code is working with is already freed:
 
SLAB              MEMORY            TOTAL ALLOCATED FREE
ffff811708559dc0 ffff811df8abc000      8          4     4
FREE / [ALLOCATED]
   ffff811df8abc000 (shared cache)
 [ffff811df8abc200]
   ffff811df8abc400 (cpu 2 cache)
 [ffff811df8abc600]
 [ffff811df8abc800]
   ffff811df8abca00 (shared cache)
   ffff811df8abcc00 (shared cache) <<-------------------- object is FREE
 [ffff811df8abce00]
 
All of the cache objects above, except for one, hold struct vxbuf objects, so the fact that the contended buffer holds a vxbuf is not unusual.
 

Resolution

Uninstall VxVM and VxFS, then reinstall them. First, list the installed packages:

# rpm -aq | grep vx
VRTSvxvm-5.1.132.000-SP1RP2_RHEL5
VRTSvxfen-5.1.132.000-SP1RP2_RHEL5
VRTSvxfs-5.1.132.000-SP1RP2_RHEL5

Uninstall VxVM:
# rpm -e --nodeps VRTSvxvm-5.1.132.000-SP1RP2_RHEL5
stopping vxrelocd
stopping vxattachd
stopping vxcached
stopping vxconfigbackupd
stopping vxnotify
stopping vxconfigd
removing /dev/vx
removing /etc/vx/reconfig.d

Uninstall VxFS:
# rpm -e --nodeps VRTSvxfs-5.1.132.000-SP1RP2_RHEL5

Confirm that the packages have been removed:
# rpm -aq | grep vx
#

Reinstall VxVM:
# rpm -ivh --nodeps VRTSvxvm-5.1.132.000-SP1RP2_RHEL5.x86_64.rpm
Preparing... ########################################### [100%]
warning: cannot get shared lock on /var/lib/rpm/Packages
warning: cannot get shared lock on /var/lib/rpm/Packages
warning: cannot get shared lock on /var/lib/rpm/Packages
stopping vxnotify
1:VRTSvxvm ########################################### [100%]
Installing file /etc/init.d/vxvm-boot
creating VxVM device nodes under /dev
Installing keys for APMs
# rpm -aq | grep vx
VRTSvxvm-5.1.132.000-SP1RP2_RHEL5

Reinstall VxFS:
# rpm -ivh --nodeps VRTSvxfs-5.1.132.000-SP1RP2_RHEL5.x86_64.rpm
Preparing... ########################################### [100%]
1:VRTSvxfs ########################################### [100%]
warning: cannot get shared lock on /var/lib/rpm/Packages
Patching file "/etc/modprobe.conf"
Installing module init file "/etc/init.d/vxfs"
Warning: No modules found for 2.6.18-194.el5, using compatible modules for 2.6.18-128.el5.
# rpm -aq | grep vx
VRTSvxvm-5.1.132.000-SP1RP2_RHEL5
VRTSvxfen-5.1.132.000-SP1RP2_RHEL5
VRTSvxfs-5.1.132.000-SP1RP2_RHEL5

The install-db file was found under /etc/vx/reconfig.d/state.d. Remove it, then reboot:
# rm /etc/vx/reconfig.d/state.d/install-db
# reboot

The server came back up and worked normally.
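Before rebooting, it may be worth confirming that the install-db file is really gone. The helper below is a small sketch; the function name `vx_install_db_present` is hypothetical, and the default directory is the one named in the step above.

```shell
# vx_install_db_present [STATE_DIR]
# Succeeds if an install-db file exists in STATE_DIR
# (default: /etc/vx/reconfig.d/state.d).
vx_install_db_present() {
    [ -e "${1:-/etc/vx/reconfig.d/state.d}/install-db" ]
}
```

A typical use would be `vx_install_db_present && echo "install-db still present"`, run just before the reboot.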

 


Applies To

bash-3.2# more redhat-release
Red Hat Enterprise Linux Server release 5.7 (Tikanga)
VRTSvxvm-5.1.132.000-SP1RP2_RHEL5
VRTSvxfs-5.1.132.000-SP1RP2_RHEL5

Issue/Introduction

The server rebooted by itself on 11/10/2013 at around 22:30 CST.