Excerpt from the /var/log/messages file on the leader node:
Jun 24 09:34:20 nodeA kernel: LLT INFO V-14-1-10205 link 1 (ens161) node 1 in trouble
Jun 24 09:34:22 nodeA kernel: LLT INFO V-14-1-10205 link 0 (ens256) node 1 in trouble
Jun 24 09:34:27 nodeA kernel: GAB INFO V-15-1-20036 Port b[VxFen (refcount 2)] gen fee60c membership 0
Jun 24 09:34:27 nodeA kernel: GAB INFO V-15-1-20036 Port a[GAB_Control (refcount 1)] gen fee609 membership 0
Jun 24 09:34:27 nodeA kernel: GAB INFO V-15-1-20036 Port d[GAB_LEGACY_CLIENT (refcount 0)] gen fee608 membership 0
Jun 24 09:34:27 nodeA kernel: VXFEN INFO V-11-1-80 RACER Node is: 0
Jun 24 09:34:27 nodeA kernel: VXFEN INFO V-11-1-100 Current LBOLT: 4306245438
Jun 24 09:34:27 nodeA kernel: VXFEN INFO V-11-1-87 Initiating VxFen Race
Jun 24 09:34:27 nodeA kernel: VXFEN INFO V-11-1-111 VxFen Pre-Race Delay: 0
Jun 24 09:34:27 nodeA kernel: VXFEN INFO V-11-1-119 LEADER Node : 0 is in current sub-cluster
Jun 24 09:34:27 nodeA kernel: VXFEN INFO V-11-1-88 RACER Node won the VxFen race
Jun 24 09:34:27 nodeA kernel: VXFEN INFO V-11-1-112 VxFen Post-Race Delay: 0
Jun 24 09:34:27 nodeA kernel: VXFEN INFO V-11-1-90 Sending WON_RACE
Jun 24 09:34:27 nodeA kernel: VXFEN INFO V-11-1-67 call to VM ioctl VOL_CLEAR_PR returned non-zero
Jun 24 09:34:27 nodeA kernel: VXFEN INFO V-11-1-84 Completed Fencing Operation.
When a network partition happens, one node in each sub-cluster is elected as the racer node, while the other nodes are designated as spectator nodes.
Because majority-based fencing does not use coordination points, sub-clusters do not engage in an actual race to decide the winner after a split-brain scenario.
The sub-cluster that holds the majority of the nodes survives, while the nodes in the remaining sub-clusters are taken offline.
The following considerations decide which sub-cluster has the majority, and therefore survives, in the event of a network partition:
Odd number of cluster nodes in the current membership: One sub-cluster gets the majority upon a network split.
Even number of cluster nodes in the current membership:
  In an even network split, both sub-clusters have an equal number of nodes. The partition that contains the leader node is treated as the majority, and that partition survives.
  In an uneven network split, where one sub-cluster has more nodes than the others, the larger sub-cluster has the majority and survives.
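The rules above can be sketched as a small shell function. This is only an illustration of the decision logic as described in this article, not the fencing driver's implementation. The function name `survives` and its arguments are hypothetical, and the sketch assumes the cluster splits into exactly two sub-clusters.

```shell
#!/bin/sh
# survives TOTAL SIZE HAS_LEADER
#   TOTAL      - number of nodes in the membership before the split
#   SIZE       - number of nodes in the sub-cluster being evaluated
#   HAS_LEADER - "yes" if the leader node is in this sub-cluster
# Prints "survive" or "offline". Hypothetical helper for illustration only.
survives() {
    total=$1
    size=$2
    has_leader=$3
    majority=$(( total / 2 + 1 ))   # shell arithmetic uses integer division

    if [ "$size" -ge "$majority" ]; then
        # This sub-cluster holds the majority outright.
        echo survive
    elif [ $(( size * 2 )) -eq "$total" ] && [ "$has_leader" = yes ]; then
        # Even split: the partition containing the leader node wins the tie.
        echo survive
    else
        echo offline
    fi
}

survives 3 2 no    # odd cluster, 2 of 3 nodes           -> survive
survives 2 1 yes   # even split, partition with leader   -> survive
survives 2 1 no    # even split, partition without leader -> offline
survives 4 3 no    # uneven split of a 4-node cluster    -> survive
```

The second and third calls correspond to the 2-node cluster used in this article: after an even split, only the partition that contains the leader node survives.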
The following configuration and steps can be used to reproduce the designed behaviour in a 2-node cluster that uses majority-based I/O fencing:
The configuration consists of two nodes in a cluster.
[root@nodeA ~]# hastatus -sum
-- SYSTEM STATE
-- System               State                Frozen

A  nodea                RUNNING              0
A  nodeb                RUNNING              0

-- GROUP STATE
-- Group           System               Probed     AutoDisabled    State

B  SG1             nodea                Y          N               ONLINE
B  SG1             nodeb                Y          N               OFFLINE
B  cvm             nodea                Y          N               ONLINE
B  cvm             nodeb                Y          N               ONLINE
In this configuration the leader node is "nodea", because it has the lowest node ID:
[root@nodeA ~]# cat /etc/llthosts
0 nodea
1 nodeb
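As a rough sketch, the lowest node ID can be picked out of an llthosts-style file with standard tools. The function name `leader_node` is hypothetical, and the sketch assumes every node listed in the file was a cluster member before the split; the sample file simply mirrors the output above.

```shell
#!/bin/sh
# leader_node FILE
# Print the name of the node with the lowest node ID in an llthosts-style
# file ("<id> <name>" per line). Hypothetical helper for illustration only.
leader_node() {
    # Sort numerically on the ID column, then print the first complete entry.
    sort -n -k1,1 "$1" | awk 'NF >= 2 { print $2; exit }'
}

# Sample data mirroring the /etc/llthosts content, deliberately unsorted.
sample=$(mktemp)
printf '1 nodeb\n0 nodea\n' > "$sample"
leader_node "$sample"    # prints: nodea
rm -f "$sample"
```

On a cluster node the same function could be pointed at /etc/llthosts directly.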
Sample main.cf:
[root@nodeA ~]# cat /etc/VRTSvcs/conf/config/main.cf
include "OracleASMTypes.cf"
include "types.cf"
include "CFSTypes.cf"
include "CRSResource.cf"
include "CSSD.cf"
include "CVMTypes.cf"
include "Db2udbTypes.cf"
include "MultiPrivNIC.cf"
include "OracleTypes.cf"
include "PrivNIC.cf"
include "SybaseTypes.cf"
cluster ia74clus (
UserNames = { admin = gNOgNInKOjOOmWOiNL }
Administrators = { admin }
UseFence = SCSI3
HacliUserLevel = COMMANDROOT
)
system nodea (
)
system nodeb (
)
group SG1 (
SystemList = { nodea = 0, nodeb = 1 }
AutoStartList = { nodea, nodeb }
)
Application app (
StartProgram = "/usr/bin/perl /root/application/start.pl"
StopProgram = "/usr/bin/perl /root/application/stop.pl"
MonitorProgram = "/usr/bin/perl /root/application/monitor.pl"
)
// resource dependency tree
//
// group SG1
// {
// Application app
// }
group cvm (
SystemList = { nodea = 0, nodeb = 1 }
AutoFailOver = 0
Parallel = 1
AutoStartList = { nodea, nodeb }
)
CFSfsckd vxfsckd (
)
CVMCluster cvm_clus (
CVMClustName = ia74clus
CVMNodeId = { nodea = 0, nodeb = 1 }
CVMTransport = gab
CVMTimeout = 200
)
CVMVxconfigd cvm_vxconfigd (
Critical = 0
CVMVxconfigdArgs = { syslog }
)
ProcessOnOnly vxattachd (
Critical = 0
PathName = "/bin/sh"
Arguments = "- /usr/lib/vxvm/bin/vxattachd root"
RestartLimit = 3
)
cvm_clus requires cvm_vxconfigd
vxfsckd requires cvm_clus
// resource dependency tree
//
// group cvm
// {
// ProcessOnOnly vxattachd
// CFSfsckd vxfsckd
// {
// CVMCluster cvm_clus
// {
// CVMVxconfigd cvm_vxconfigd
// }
// }
// }
Application Perl Scripts:
A sleep delay has been added to the Perl script that stops the application resource.
# ls /root/application/*.pl
/root/application/monitor.pl /root/application/start.pl /root/application/stop.pl
# cat /root/application/stop.pl
#!/usr/bin/perl
sleep(100);
$str = "rm -rf /tmp/sampleapp";
#rm -rf /tmp/sampleapp # add any steps, if required
#exit 0
system($str);
# cat /root/application/monitor.pl
#!/bin/sh
APPLICATION_IS_ONLINE=110
APPLICATION_IS_OFFLINE=100
if [ -f /tmp/sampleapp ] ; then # add any steps, if required
exit $APPLICATION_IS_ONLINE
else
exit $APPLICATION_IS_OFFLINE
fi
# cat /root/application/start.pl
#!/usr/bin/perl
system("touch /tmp/sampleapp");
Steps:
1. Stop the cluster on the leader node
[root@nodeA ~]# hastop -local
2. Disconnect the LLT links on the leader node
[root@nodeA ~]# lltstat -nvv active
LLT node information:
    Node       State    Link    Status  Address
  * 0 nodea    OPEN
                        ens256  UP      00:50:56:05:DF:BC
                        ens161  UP      00:50:56:05:DF:BD
    1 nodeb    OPEN
                        ens256  UP      00:50:56:05:E0:41
                        ens161  UP      00:50:56:05:A3:1D
[root@nodeA ~]# lltconfig -u ens161 ; lltconfig -u ens256
NOTE: It is recommended that majority-based I/O fencing be implemented in clusters with an odd number of servers.
In a split-brain scenario, arbitration is based on which sub-cluster holds the majority of the nodes.
Majority-based fencing:
If N is defined as the total number of nodes (current membership) in the cluster, then the majority is equal to N/2 + 1, where N/2 is rounded down to a whole number.
Leader node:
The node with the lowest node ID in the cluster (before the split brain happened) is called the leader node.
The leader node acts as the tie-breaker when an even split leaves both sub-clusters with the same number of nodes.
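As a worked example of the majority formula, the sketch below prints the majority for a few cluster sizes. The function name `majority` is hypothetical, and N/2 is taken as integer division, which is what shell arithmetic does.

```shell
#!/bin/sh
# majority N: number of nodes a sub-cluster needs to hold the majority
# in a cluster of N nodes. Hypothetical helper for illustration only.
majority() {
    echo $(( $1 / 2 + 1 ))   # integer division, e.g. 3 / 2 = 1
}

for n in 2 3 4 5; do
    echo "N=$n majority=$(majority "$n")"
done
# N=2 majority=2
# N=3 majority=2
# N=4 majority=3
# N=5 majority=3
```

With an even N, an even split leaves each sub-cluster one node short of the majority, which is why the leader node is needed to break the tie and why an odd number of nodes is recommended.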