There is no direct error within VCS/SFW HA.
However, the vSphere console shows the problem at the time of failover: the VM representing node 1 shuts down and the VMDKs transition to node 2. The error condition occurs when a VMDK fails to disconnect from the previously active node, which leaves the storage attached to both node 1 and node 2.
When attempting to restart node 1, the restart fails because of a resource conflict. The attached VMDK has already been claimed by node 2 and cannot be shared by two nodes simultaneously. The vSphere administrator must delete the VMDK from node 1 so that the node 1 VM can be restarted, after which node 2 can be successfully failed back. The VMDK is supposed to be detached automatically, with no manual intervention.
The VMwareDisks agent has two optional attributes that allow VMDKs to be detached automatically within a specific time frame. According to the Bundled Agent documentation, when the VMwareDisks agent is installed and configured, the "Turn on vSphere HA" check box is selected by default.
That check box configures the IsVMHAEnabled attribute, which is supposed to be set to 1 (true) by default:
VMwareDisks agent optional attributes

| Optional attribute | Description |
| --- | --- |
| IsVMHAEnabled | Defines whether vSphere HA is enabled. The default is 1. |
| VMRegisterWait | |
In practice, however, IsVMHAEnabled remains set to 0 (false) after configuration.
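The current value can also be checked from the command line rather than the GUI. The following is a minimal Python sketch, assuming the standard VCS `hares` command is on the PATH of a cluster node and that `hares -list` prints the resource name in the first column of each line. The helper names (`run`, `parse_resource_names`, `find_disabled`) are hypothetical and exist only for illustration.

```python
import subprocess

ATTRIBUTE = "IsVMHAEnabled"


def run(cmd):
    """Run a VCS command and return its standard output as text."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


def parse_resource_names(list_output):
    """Extract unique resource names from `hares -list` output.

    Assumption: each line starts with the resource name, followed by the
    system name, so a resource appears once per cluster node.
    """
    names = []
    for line in list_output.splitlines():
        fields = line.split()
        if fields and fields[0] not in names:
            names.append(fields[0])
    return names


def find_disabled(run_cmd=run):
    """Return VMwareDisks resources whose IsVMHAEnabled value is not 1."""
    listing = run_cmd(["hares", "-list", "Type=VMwareDisks"])
    disabled = []
    for name in parse_resource_names(listing):
        value = run_cmd(["hares", "-value", name, ATTRIBUTE]).strip()
        if value != "1":
            disabled.append(name)
    return disabled
```

Calling `find_disabled()` on a cluster node would list the resources that still need the attribute enabled. The `run_cmd` parameter lets the logic be exercised with a stub when no cluster is available.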
To resolve this, it is necessary to enable the IsVMHAEnabled attribute for all VMwareDisks resources.
1. Open the Cluster Management Console (VCS Java GUI).
2. Locate the VMwareDisks resource, right-click it, and choose View > Properties View.
3. Scroll down to the IsVMHAEnabled attribute.
4. Confirm that it is set to TRUE. If it is not, select the edit icon and select the check box to enable it (as shown in Figure 1).
Figure 1
5. Repeat steps 2-4 for every VMwareDisks resource in the cluster.
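When a cluster contains many VMwareDisks resources, the same change could be scripted instead of repeated in the GUI. The sketch below is one possible approach, not a documented procedure: it assumes the standard VCS commands `haconf` and `hares` are available on a cluster node and that IsVMHAEnabled can be modified as an ordinary resource attribute. The function names and the example resource names are hypothetical.

```python
import subprocess


def build_enable_commands(resources):
    """Build the VCS command sequence that sets IsVMHAEnabled to 1.

    The configuration is opened read-write once, each resource is
    modified, and the configuration is then saved and closed.
    """
    if not resources:
        return []
    commands = [["haconf", "-makerw"]]
    for name in resources:
        commands.append(["hares", "-modify", name, "IsVMHAEnabled", "1"])
    commands.append(["haconf", "-dump", "-makero"])
    return commands


def apply_commands(commands, dry_run=True):
    """Return the commands as strings; execute them only if dry_run is False."""
    rendered = [" ".join(cmd) for cmd in commands]
    if not dry_run:
        for cmd in commands:
            # check=True stops at the first command that fails.
            subprocess.run(cmd, check=True)
    return rendered
```

For example, `apply_commands(build_enable_commands(["vmdk_res1", "vmdk_res2"]))` only returns the command lines for review. Passing `dry_run=False` would run them, which should be done only after confirming the resource names and during a suitable maintenance window.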
Applies To
Windows Server 2008 or 2012 R2
SFW HA or VCS 6.0.1 or later
VCS VMwareDisks agent
vSphere 5.x on ESX