Global Cluster heartbeat times out every 3 minutes and logs Event ID 18313 in the Application Event Log (TCP/IP connection via connector is hung)


Article ID: 100011169



Description

Error Message

%VCS_HOME%\log\engine_a.txt

VCS ERROR V-16-3-18313 TCP/IP connection via connector from cluster <clustername> to cluster <clustername> is hung; intentionally disconnecting. Auto-reconnect will occur however you may wish to examine the wac resource on system
VCS ERROR V-16-3-18311 <clustername> lost connection to cluster <clustername>
VCS INFO V-16-3-18302 Cluster <clustername> received connection from cluster <clustername>

Application Event log:

ERROR    18313(0x05df4789) Wac 
TCP/IP connection via connector from cluster <clustername> to cluster <clustername> is hung; intentionally disconnecting. Auto-reconnect will occur however you may wish to examine the wac resource on system <WAC IP>

Resolution

To stop these messages from occurring, increase the heartbeat timeout and then restart the WAC resource on both clusters:

1. Open the Java Console

2. Go to Edit --> Configure Heartbeats

3. Select Existing Heartbeat and click the Configure button

4. Change Timeout to 60 seconds (the default is 30 seconds), as shown in Figure 1

Figure 1

 

5. Right-click the WAC resource (located in the ClusterService group) and take it offline, as shown in Figure 2

Figure 2

 

6. Right-click the WAC resource again and bring it back online on the same node where it was just taken offline (as shown in Figure 3):

Figure 3
 

Note: Steps 5 and 6 must be performed on BOTH clusters.


7. After the WAC resource has been restarted on both clusters, confirm that the errors are no longer logged every 3 minutes
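
The check in step 7 can also be done by scanning a copy of engine_a.txt for the heartbeat messages instead of watching the log by hand. The following is a minimal Python sketch, not a VCS tool. It assumes each log line contains the "VCS <severity> <UMI>" prefix shown under Error Message; the function names and the idea of comparing counts before and after the change are illustrative assumptions.

```python
import re
from collections import Counter

# Assumed line prefix, taken from the Error Message samples in this article,
# e.g. "VCS ERROR V-16-3-18313 TCP/IP connection via connector ..."
UMI_PATTERN = re.compile(r"VCS\s+\w+\s+(V-\d+-\d+-\d+)")

# UMI codes quoted in this article: hung connection, lost connection, reconnect.
HEARTBEAT_UMIS = ("V-16-3-18313", "V-16-3-18311", "V-16-3-18302")


def count_heartbeat_messages(lines):
    """Count how often each heartbeat-related UMI code appears in the given lines."""
    counts = Counter()
    for line in lines:
        match = UMI_PATTERN.search(line)
        if match and match.group(1) in HEARTBEAT_UMIS:
            counts[match.group(1)] += 1
    return counts


def count_heartbeat_messages_in_file(path):
    """Read a log file (hypothetical helper) and count its heartbeat messages."""
    # errors="replace" keeps the scan going if the log has undecodable bytes.
    with open(path, "r", errors="replace") as handle:
        return count_heartbeat_messages(handle)
```

One way to use it: save a copy of the log before the change and another some time after restarting the WAC resource on both clusters, then compare the V-16-3-18313 counts for the two periods. If the count stops growing after the restart, the timeout change has taken effect.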

Issue/Introduction

The Global Cluster heartbeat reports a timeout every 3 minutes. The errors shown under Error Message are written to engine_a.txt and to the Application Event Log.

Additional Information

ETrack: 3361420 UMI: V-16-3-18313