From engine logs:
2012/04/28 19:27:45 VCS NOTICE V-16-1-10300 Initiating Offline of Resource IP_test1 (Owner: Unspecified, Group: test1) on System rh5u6n02
2012/04/28 19:27:46 VCS INFO V-16-1-10305 Resource IP_test1 (Owner: Unspecified, Group: test1) is offline on rh5u6n02 (VCS initiated)
2012/04/28 19:27:46 VCS NOTICE V-16-1-10446 Group test1 is offline on system rh5u6n02
2012/04/28 19:28:40 VCS ERROR V-16-2-13067 (rh5u6n02) Agent is calling clean for resource(IP_test2) because the resource became OFFLINE unexpectedly, on its own.
The base bonded NIC (bond0) and the VIPs are configured with different netmasks.
ifconfig -a output:
bond0 Link encap:Ethernet HWaddr 00:50:56:8D:01:DD
inet addr:x.x.x.12 Bcast:10.208.19.255 Mask:255.255.252.0 <<<<<<<<
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:16153410 errors:0 dropped:0 overruns:0 frame:0
TX packets:161380 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:3240547042 (3.0 GiB) TX bytes:24031736 (22.9 MiB)
bond0:0 Link encap:Ethernet HWaddr 00:50:56:8D:01:DD
inet addr:x.x.x.89 Bcast:0.0.0.0 Mask:255.255.255.0 <<<<<<<<
bond0:1 Link encap:Ethernet HWaddr 00:50:56:8D:01:DD
inet addr:x.x.x.90 Bcast:0.0.0.0 Mask:255.255.255.0 <<<<<<<<
bond0:2 Link encap:Ethernet HWaddr 00:50:56:8D:01:DD
inet addr:x.x.x.91 Bcast:0.0.0.0 Mask:255.255.255.0 <<<<<<<<
From debug logs:
2012/04/28 21:37:27 VCS DBG_1 V-16-50-0 IP:IP_test1:monitor:device bond0 address x.x.x.89 netmask 255.255.255.0
IP.C:ip_monitor[200]
2012/04/28 21:37:27 VCS DBG_1 V-16-50-0 IP:IP_test1:monitor:Number of Interfaces: 6
IP.C:ip_monitor[228]
2012/04/28 21:37:27 VCS DBG_5 V-16-50-0 IP:IP_test1:monitor:Gathering status of device bond0
IP.C:ip_monitor[249]
2012/04/28 21:37:27 VCS DBG_5 V-16-50-0 IP:IP_test1:monitor:Interface bond0 address does not match <<<<<<<
Description:
On this setup, the bond0 interface hosts two networks: x.x.x.12/22 (on bond0) and x.x.x.89/24 (on bond0:0).
Because bond0:0 holds the first IP configured in the x.x.x.89/24 network, it becomes the primary IP for that network. The VIPs on bond0:1 and
bond0:2 become secondary IPs for the same network. By design, when Linux removes the primary IP of a
network from an interface, it automatically removes all secondary IPs of that network that are plumbed on the same device.
So when the customer took the IP resource on bond0:0 offline, the operating system removed the secondary VIPs automatically. The agent then saw the other IP resources go offline unexpectedly and reported them as FAULTED.
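The primary/secondary behaviour described above can be sketched in a few lines of Python. This is a simplified model for illustration, not the kernel's or the VCS IP agent's code: the function names are hypothetical, and the 10.0.3.N addresses are made-up stand-ins for the masked x.x.x.N values.

```python
import ipaddress


def classify(addresses):
    """Map each label to (role, primary_label).

    Simplified model: the first address seen in a subnet (same network
    address and prefix length) is primary; later ones are secondary.
    """
    primary_for = {}  # network -> label of its primary address
    roles = {}
    for label, cidr in addresses:
        network = ipaddress.ip_interface(cidr).network
        if network in primary_for:
            roles[label] = ("secondary", primary_for[network])
        else:
            primary_for[network] = label
            roles[label] = ("primary", label)
    return roles


def remove(addresses, label):
    """Return the labels left after removing `label`.

    Removing a primary also removes every secondary that depends on it.
    """
    roles = classify(addresses)
    role, _ = roles[label]
    if role == "primary":
        doomed = {name for name, (_, primary) in roles.items() if primary == label}
    else:
        doomed = {label}
    return [name for name, _ in addresses if name not in doomed]


# Hypothetical addresses standing in for the masked x.x.x.N values.
mismatched = [
    ("bond0", "10.0.3.12/22"),
    ("bond0:0", "10.0.3.89/24"),
    ("bond0:1", "10.0.3.90/24"),
    ("bond0:2", "10.0.3.91/24"),
]
corrected = [("bond0", "10.0.3.12/24")] + mismatched[1:]

print(remove(mismatched, "bond0:0"))  # bond0:1 and bond0:2 disappear too
print(remove(corrected, "bond0:0"))   # only bond0:0 is removed
```

With the mismatched netmasks, bond0:0 is the primary of its own /24 network, so removing it takes the other VIPs with it. Once bond0 shares the VIPs' netmask, bond0 itself is the primary and each VIP can be removed independently.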
The resolution is to correct the netmask of the bond0 interface so that bond0 and the VIPs are on the same network. Use the following steps:
1. Take all IP resources offline.
2. Either correct the bond0 configuration (possibly within the ifcfg-
3. Restart network services so that bond0 picks up the corrected configuration.
4. Once the NIC resource reports bond0 as ONLINE, bring all IP resources online.
With this change, bond0 hosts the primary IP and all IP resources come online as secondary IPs for the same network, so taking any VIP offline does not affect the other VIPs.
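Before bringing the IP resources back online, it may help to confirm that every VIP falls in the same network as the base address. The following is a minimal sketch of such a check; the helper name is hypothetical and the 10.0.3.N addresses are made-up stand-ins for the masked values. The netmasks would come from the ifconfig output and the IP resource definitions.

```python
import ipaddress


def vips_outside_base_network(base, vips):
    """Return the VIPs whose network differs from the base address's network.

    Each argument is written as "address/netmask", using the dotted
    netmask form that ifconfig prints.
    """
    base_net = ipaddress.ip_interface(base).network
    return [vip for vip in vips if ipaddress.ip_interface(vip).network != base_net]


# Hypothetical addresses; the VIPs all use a 255.255.255.0 netmask.
vip_list = [
    "10.0.3.89/255.255.255.0",
    "10.0.3.90/255.255.255.0",
    "10.0.3.91/255.255.255.0",
]

# bond0 with the original /22 netmask: every VIP is flagged.
print(vips_outside_base_network("10.0.3.12/255.255.252.0", vip_list))

# bond0 with the corrected netmask: nothing is flagged.
print(vips_outside_base_network("10.0.3.12/255.255.255.0", vip_list))
```

An empty result means the base address will act as the primary IP for the network and the VIPs will be added as secondaries.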
Applies To
RHEL 5 Update 6 with SFHA 5.1 SP1 GA