Poor replication performance after upgrading to SF 6.0.3 due to an increased number of Rlink disconnects caused by ack timeout on update message

Article ID: 100011019

Description

Error Message

Excerpt of the messages file from the primary.

Oct  1 16:29:55 testp102 vxio: [ID 218356 kern.warning] WARNING: VxVM VVR vxio V-5-0-267 Rlink rlk_testp1r9_rvg-isis disconnecting due to ack timeout on update message
Oct  1 16:32:20 testp102 vxio: [ID 218356 kern.warning] WARNING: VxVM VVR vxio V-5-0-267 Rlink rlk_testp1r9_rvg-isis disconnecting due to ack timeout on update message
Oct  1 16:34:39 testp102 vxio: [ID 218356 kern.warning] WARNING: VxVM VVR vxio V-5-0-267 Rlink rlk_testp1r9_rvg-isis disconnecting due to ack timeout on update message
Oct  1 16:37:26 testp102 vxio: [ID 218356 kern.warning] WARNING: VxVM VVR vxio V-5-0-267 Rlink rlk_testp1r9_rvg-isis disconnecting due to ack timeout on update message
Oct  1 16:39:51 testp102 vxio: [ID 218356 kern.warning] WARNING: VxVM VVR vxio V-5-0-267 Rlink rlk_testp1r9_rvg-isis disconnecting due to ack timeout on update message
Oct  1 16:42:27 testp102 vxio: [ID 218356 kern.warning] WARNING: VxVM VVR vxio V-5-0-267 Rlink rlk_testp1r9_rvg-isis disconnecting due to ack timeout on update message
Oct  1 16:45:05 testp102 vxio: [ID 218356 kern.warning] WARNING: VxVM VVR vxio V-5-0-267 Rlink rlk_testp1r9_rvg-isis disconnecting due to ack timeout on update message
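
To gauge how often the Rlink drops, the ack-timeout warnings can be counted straight from the messages file. The following is a minimal sketch, not a supported tool: the function names are hypothetical, and the default path /var/adm/messages is an assumption, so pass the actual log file as the first argument.

```shell
#!/bin/sh
# Sketch: summarize Rlink ack-timeout disconnects (message ID V-5-0-267).
# The default log path is an assumption; pass the real messages file as $1.

# Print the total number of V-5-0-267 lines in the log.
count_ack_timeouts() {
    grep -c 'V-5-0-267' "${1:-/var/adm/messages}"
}

# Print a per-hour breakdown, one line per hour: "<count> <month> <day> <hour>:00".
ack_timeouts_per_hour() {
    awk '/V-5-0-267/ { split($3, t, ":"); print $1, $2, t[1] ":00" }' \
        "${1:-/var/adm/messages}" | sort | uniq -c
}
```

Running both functions against the primary's log before and after the upgrade gives a rough comparison of the disconnect rate.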
 

Excerpt of the messages file from the secondary.

Oct 1 16:39:51 testp109 vxio: [ID 415634 kern.notice] NOTICE: VxVM VVR vxio V-5-0-854 Disconnecting rlink rlk_testp1r2_rvg-isis due to loss of TCP connection.
Oct 1 16:39:51 testp109 vxio: [ID 467919 kern.warning] WARNING: VxVM VVR vxio V-5-0-266 Rlink rlk_testp1r2_rvg-isis disconnected from remote
Oct 1 16:39:54 testp109 vxio: [ID 467569 kern.warning] WARNING: VxVM VVR vxio V-5-0-1406 Node 10.70.4.74 disconnected from node 10.109.232.32
Oct 1 16:39:57 testp109 vxio: [ID 584567 kern.notice] NOTICE: VxVM VVR vxio V-5-0-1402 Connected from node 10.70.4.74 to node 10.109.232.32
Oct 1 16:39:57 testp109 vxio: [ID 490591 kern.notice] NOTICE: VxVM VVR vxio V-5-0-265 Rlink rlk_testp1r2_rvg-isis connected to remote
Oct 1 16:39:57 testp109 vxio: [ID 960634 kern.notice] NOTICE: VxVM VVR vxio V-5-0-1449 Disabling checksum for rlink rlk_testp1r2_rvg-isis
Oct 1 16:39:58 testp109 vxio: [ID 874180 kern.notice] NOTICE: VxVM VVR vxio V-5-0-1864 Secondary logging is disabled for RVG rvg-isis due to lower RVG version 30

Cause

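After the upgrade to SF 6.0.3, Secondary logging is disabled when the RVG version is below 40, as message V-5-0-1864 in the secondary excerpt shows (RVG version 30). Without Secondary logging, replication performance degrades.
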
About secondary logging enabled asynchronous mode
Secondary logging is an advanced feature that improves replication throughput. It uses the Secondary SRL to stage incoming data before the data is written to the data volume. Writes to the data volume must preserve write-order fidelity, which limits replication throughput. Staging the data in the Secondary SRL lets the Secondary acknowledge each write as soon as it reaches the SRL, so the Primary can push more data. The Secondary then applies the data to the data volume in parallel while maintaining write-order fidelity.
The requirements for automatically enabling support for Secondary logging are as follows:
  • The Primary SRL and the Secondary SRL volumes must be of the same size. The SRL can be resized using the vradmin resizesrl command.
  • The RVG version on both the Primary and the Secondary must be 40 or above. The vxprint -Vl command displays the RVG version. The vxrvg upgrade rvg command can be used to upgrade the RVG version. The RVG version can be upgraded only when the RLINK is up to date.
  • The diskgroup version on both the Primary and the Secondary must be 170 or above. The vxdg list diskgroup command displays the diskgroup version. The vxdg upgrade diskgroup command can be used to upgrade the diskgroup version. Upgrading the diskgroup using this command automatically upgrades the RVG versions of all the RVGs in the diskgroup.
If any of these requirements is not met, the Secondary logging feature is automatically disabled and traditional replication is used, which may provide lower replication throughput.
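
The three requirements above can be expressed as a small pre-check. This is only a sketch under stated assumptions: the helper names are hypothetical (they are not VxVM commands), and the version numbers and SRL sizes are expected to be read manually from vxdg list and vxprint output on each site and passed in as arguments.

```shell
#!/bin/sh
# Sketch: test the documented Secondary logging prerequisites against values
# the caller has already collected. Helper names are hypothetical.

# usage: check_versions DG_VERSION RVG_VERSION   (run once per site)
check_versions() {
    rc=0
    if [ "$1" -lt 170 ]; then
        echo "disk group version $1 is below 170" >&2
        rc=1
    fi
    if [ "$2" -lt 40 ]; then
        echo "RVG version $2 is below 40" >&2
        rc=1
    fi
    return "$rc"
}

# usage: check_srl_sizes PRIMARY_SRL_SIZE SECONDARY_SRL_SIZE   (same unit)
check_srl_sizes() {
    if [ "$1" != "$2" ]; then
        echo "SRL sizes differ: primary=$1 secondary=$2" >&2
        return 1
    fi
    return 0
}
```

For example, with the RVG version 30 reported in the secondary excerpt and a hypothetical disk group version of 170, check_versions 170 30 returns non-zero and reports that the RVG version is below 40.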

Resolution

 

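Upgrade both the disk group version and the RVG version on the Primary and the Secondary. Upgrading the disk group also upgrades the RVG version of every RVG in the disk group.

# vxdg upgrade diskgroup

Then verify both versions:
# vxdg list diskgroup | grep version
# vxprint -g diskgroup -l | grep version

If the RVG version is still below 40, upgrade the RVG:
# vxrvg -g diskgroup upgrade rvg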
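
After the upgrade, one way to confirm the result is to look at the most recent "Secondary logging is disabled" notice (V-5-0-1864) in the secondary's messages file. The sketch below assumes the notice is logged when the Rlink reconnects with the feature disabled, as the secondary excerpt suggests. The function name is hypothetical and the default log path is an assumption.

```shell
#!/bin/sh
# Sketch: show the most recent "Secondary logging is disabled" notice
# (message ID V-5-0-1864). The default log path is an assumption.
last_secondary_logging_disabled() {
    grep 'V-5-0-1864' "${1:-/var/adm/messages}" | tail -n 1
}
```

If the timestamp of the last notice predates the upgrade and the Rlink has reconnected since, Secondary logging is probably no longer being disabled.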

Issue/Introduction

Poor replication performance was noticed after upgrading to SF 6.0.3. The performance impact was due to frequent VVR Rlink disconnects throughout the day. The syslog/messages logs show a large number of Rlink disconnects due to ack timeout after the SF upgrade. After upgrading to 6.0.3, VVR Secondary logging is disabled if the RVG version is below 40, which appears to be the cause of the problem.