No errors are reported, but multiple LLT heartbeat threads (llt_hb) are observed in the uninterruptible sleep (D) state:
# ps auxH | awk '$8 ~ /^D/{print}'
root 2488 0.0 0.0 0 0 ? D 04:00 0:00 [llt_hb/0]
root 2490 0.0 0.0 0 0 ? D 04:00 0:00 [llt_hb/1]
root 2491 0.0 0.0 0 0 ? D 04:00 0:00 [llt_hb/2]
root 2492 0.0 0.0 0 0 ? D 04:00 0:00 [llt_hb/3]
root 2493 0.0 0.0 0 0 ? D 04:00 0:00 [llt_hb/4]
root 2494 0.0 0.0 0 0 ? D 04:00 0:00 [llt_hb/5]
root 2495 0.0 0.0 0 0 ? D 04:00 0:00 [llt_hb/6]
root 2496 0.0 0.0 0 0 ? D 04:00 0:00 [llt_hb/7]
root 2497 0.0 0.0 0 0 ? D 04:00 0:00 [llt_hb/8]
This is the result of a change required for VMware vMotion / snapshots. The LLT heartbeat thread was not getting sufficient CPU cycles to run, resulting in LLT packet send/receive failures. This would eventually result in nodes being ejected from the cluster due to missed heartbeats.
One heartbeat thread is created per CPU to increase the efficiency of the heartbeat.
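To see how many heartbeat threads are currently in the D state, the ps filter shown above can be narrowed to llt_hb entries and counted. The following is a minimal sketch only: the function name count_llt_hb_dstate is hypothetical, and it assumes the `ps auxH` column layout shown in the sample output (STAT in column 8, thread name in the last column).

```shell
#!/bin/sh
# Hypothetical helper: count llt_hb kernel threads whose STAT starts with D.
# Reads `ps auxH`-style lines on stdin (column 8 = STAT, last column = name).
count_llt_hb_dstate() {
    awk '$8 ~ /^D/ && $NF ~ /^\[llt_hb\// { n++ } END { print n + 0 }'
}

# Usage on a live system:
#   ps auxH | count_llt_hb_dstate
#   grep -c '^processor' /proc/cpuinfo    # CPU count, for comparison
```

The count can then be compared by eye with the CPU count; the sketch deliberately does not assert that the two numbers must match.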
Because this feature is only required in VMware environments, it can be disabled elsewhere.
a. Verify the feature is enabled:
# lltconfig -H query
Current LLT miscellaneous values:
sleepalloc = 0
hbthread = 1
b. Disable the feature:
# lltconfig -H hbthread:0
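Steps a and b can be combined so that the feature is only disabled when it is actually enabled. This is a sketch, not a vendor-supplied script: the function names are hypothetical, and it assumes the `lltconfig -H query` output has the `hbthread = N` form shown in step a. The only lltconfig invocations used are the two shown in the steps.

```shell
#!/bin/sh
# Hypothetical helper: extract the hbthread value from
# `lltconfig -H query` output supplied on stdin.
hbthread_value() {
    awk -F'=' '$1 ~ /hbthread/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Hypothetical helper: disable the heartbeat threads only if enabled.
disable_hbthread_if_enabled() {
    if [ "$(lltconfig -H query | hbthread_value)" = "1" ]; then
        lltconfig -H hbthread:0
    fi
}

# Usage (as root on a cluster node):
#   disable_hbthread_if_enabled
```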
To enable or disable the feature permanently, add the corresponding line to the /etc/llttab file.
To enable:
# vi /etc/llttab
set-misc hbthread:1
To disable:
# vi /etc/llttab
set-misc hbthread:0
NOTE: The change takes effect at the next cluster start.
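Instead of editing the file by hand, the /etc/llttab entry can be set with a small script that avoids leaving duplicate hbthread lines. This is a minimal sketch under assumptions: the function name set_llttab_hbthread is hypothetical, it assumes one directive per line in /etc/llttab, and it keeps a backup copy with a .bak suffix before rewriting the file.

```shell
#!/bin/sh
# Hypothetical helper: set "set-misc hbthread:<0|1>" in an llttab-style file.
# Usage: set_llttab_hbthread FILE VALUE
set_llttab_hbthread() {
    file=$1
    value=$2
    case $value in
        0|1) ;;
        *) echo "value must be 0 or 1" >&2; return 1 ;;
    esac
    # Keep a backup of the original file before changing it.
    cp -p "$file" "$file.bak" || return 1
    # Drop any existing hbthread entry, then append the requested one.
    awk '!/^[ \t]*set-misc[ \t]+hbthread:/' "$file.bak" > "$file" || return 1
    echo "set-misc hbthread:$value" >> "$file"
}

# Usage (as root):
#   set_llttab_hbthread /etc/llttab 0
```

As with the manual edit, the new value is only read at the next cluster start.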
# uptime
05:29:04 up 18 min, 1 user, load average: 7.17, 3.19, 1.29
# grep processor /proc/cpuinfo
processor : 0
processor : 1
processor : 2
processor : 3
processor : 4
processor : 5
processor : 6
processor : 7
# uptime
05:29:18 up 18 min, 1 user, load average: 7.43, 3.44, 1.40
# uptime
05:29:25 up 18 min, 1 user, load average: 7.52, 3.59, 1.47
# uptime
05:29:27 up 18 min, 1 user, load average: 7.52, 3.59, 1.47
# uptime
05:34:57 up 23 min, 1 user, load average: 8.00, 6.54, 3.43
# uptime
05:34:59 up 23 min, 1 user, load average: 8.00, 6.54, 3.43