This document attempts to explain how to tune the SRL "vol_max_rdback_sz" (RDBACK) tunable.
In shared environments such as CVM (shared diskgroups) the SRL is accessed extensively, so the RDBACK pool may need to be increased to a higher than normal value to ensure the replication link can cope with the increase in workload at peak operational times.
IMPORTANT: In a shared environment, VVR always “READS BACK” from the SRL when replicating in asynchronous mode.
Figure 1.0

In asynchronous mode, when the Secondary or the network bandwidth cannot keep up with the incoming write rate, the Primary kernel memory buffers fill up.
For VVR to continue to provide memory for incoming writes and continue its processing, it must free the memory held by writes that have been written to the Primary data volume but not yet sent to the Secondary.
When VVR is ready to send the unsent writes whose memory was freed, it must first “READ BACK” those writes from the SRL.
In synchronous mode the data is always available in memory, while in asynchronous mode VVR may have to “READ BACK” the data from the SRL FREQUENTLY. Synchronous replication, however, can significantly decrease application performance by adding the network round trip to the latency of each write request.
In asynchronous mode, replication performance might suffer whenever a read back is required, because of the delay of the additional read operation.
KEY POINT:
VVR does not need to “READ BACK” from the SRL if the “NETWORK BANDWIDTH” is sufficient and the Secondary always keeps up with the incoming write rate, or if the Secondary only falls behind for short periods during which the accumulated writes are small enough to fit in the VVR kernel buffer.
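The condition in the key point above can be illustrated with some rough arithmetic. The sketch below is not a VVR command: the function name `readback_check` and all four input figures are hypothetical, and the model (backlog = rate difference x burst length) is a simplification that ignores per-write overhead.

```shell
#!/bin/sh
# Hypothetical helper, not a VVR command: rough check of whether the writes
# that accumulate while the Secondary lags would still fit in memory.
#   $1 = incoming write rate (KB/s)     $2 = rate sent to Secondary (KB/s)
#   $3 = length of the burst (seconds)  $4 = memory available for writes (KB)
readback_check() {
    awk -v w="$1" -v s="$2" -v t="$3" -v b="$4" 'BEGIN {
        backlog = (w - s) * t            # KB written but not yet sent
        if (backlog < 0) backlog = 0     # Secondary keeps up: no backlog
        verdict = (backlog > b) ? "readback-likely" : "fits-in-memory"
        printf "backlog=%.0fKB %s\n", backlog, verdict
    }'
}

# Example figures (assumed): 20000 KB/s written, 15000 KB/s sent,
# a 60 second burst, 1048576 KB of buffer.
readback_check 20000 15000 60 1048576
```

With these assumed figures the backlog is 300000 KB, which fits in the buffer; stretching the same burst to 600 seconds would exceed it, and VVR would then be expected to read back from the SRL.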
If VVR reads back from the SRL frequently, striping the SRL volume over several dedicated disks (not used by data volumes) could improve performance, unless striping is already done at the array level.
To determine whether VVR is reading back from the SRL, use the “vxstat” command. In the output, note the number of read operations on the SRL.
vrstat command
The vrstat command prints statistical information for the volumes in Replicated Volume Groups (RVGs) and RLINKs, and for all hosts in a Replicated Data Set (RDS). Information is displayed across the RDS setup on all the hosts, and not for a specific host.
By default, the command displays statistics at intervals of 10 seconds. This interval can be changed by setting the VRAS_STATS_FREQUENCY environment variable to the required value in the /etc/vx/vras/vras_env file.
If no rvg argument is specified, the command displays information for the RLINKs, storage replicator logs (SRLs), data volumes or memory tunables across all the RDSs on the local host, depending on the option that is specified.
If no option or argument is specified, the vrstat command displays the consolidated status for the RLINKs, SRLs, data volumes and memory tunables of all RDSs on the local host.
The "-M" option with the vrstat command can be used to display detailed information for the memory tunables on every host in an RDS. The output from this option is similar to that from the vxmemstat command.
Scenario:
In this instance replication stalls during a peak operating window. As a result, the SRL starts to fill up because the SRL updates are not being replicated to the Secondary site. Note the waiting state for the RDBCK pool associated with RDBCK-datadg_rvg.
Sample output:
# vrstat -M
Fri Aug 2 22:20:57 2013
Replicated Data Set prod_rvg:
Fri Aug 2 22:20:59 2013
Replicated Data Set datadg_rvg:
Memory-pool Statistics:
Host    Pool              DG  Min   Max      In       Allocated  Max      Waiting
                              Size  Size     Use                 Used
------  ----------------  --  ----  -------  -------  ---------  -------  -------
Barney  WRSHIP            -   1024  65536    0        1024       0        no
Barney  RDBCK-prodg_rvg   -   1024  1048576  660      1024       972      no
Barney  NMCOM-prodg_rvg   -   1024  262144   972      1024       1012     no
Barney  RDBCK-datadg_rvg  -   1024  1048576  1048320  1048580    1048320  yes  <<<< WAITING
Barney  NMCOM-datadg_rvg  -   1024  262144   262140   262140     262140   no
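On a host with many RVGs, the waiting pools can be picked out of the output with a small filter. The following is a minimal sketch that assumes the whitespace-separated column layout of the sample above, with Waiting as the ninth field and the DG column never empty; `waiting_pools` is a hypothetical helper name, not part of VVR.

```shell
#!/bin/sh
# List memory pools whose Waiting column (9th field) is "yes".
# Assumes the whitespace-separated column layout of the sample output above.
waiting_pools() {
    awk '$9 == "yes" { printf "%s %s in_use=%s max=%s\n", $1, $2, $6, $5 }'
}

# Demo on two rows copied from the sample; on a live Primary one would
# pipe the real output instead:  vrstat -M | waiting_pools
waiting_pools <<'EOF'
Barney RDBCK-prodg_rvg - 1024 1048576 660 1024 972 no
Barney RDBCK-datadg_rvg - 1024 1048576 1048320 1048580 1048320 yes
EOF
```

For the two demo rows this prints only the RDBCK-datadg_rvg pool, with its In Use and Max Size values. Header and separator lines are skipped automatically because their ninth field is never "yes".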
The tuning process may require multiple amendments to the "vol_max_rdback_sz" (RDBACK) tunable until it reaches a level that copes with the peak operating load.
The RDBACK (vol_max_rdback_sz) pool is displayed and changed using the vxtune command. In this example the pool had already been increased from the default of 134217728 to 1073741824 (1024M), as shown below:
# vxtune vol_max_rdback_sz
Tunable Current Value Default Value Reboot
--------------------------------- --------------- ------------- ------
vol_max_rdback_sz 1073741824 134217728 N
# vxtune vol_max_rdback_sz=2147483648
Because the RDBACK pool was still in a waiting state at 1073741824, the "vol_max_rdback_sz" value had to be increased further. After the change, vxtune reports the new value:
# vxtune vol_max_rdback_sz
Tunable Current Value Default Value Reboot
--------------------------------- --------------- ------------- ------
vol_max_rdback_sz 2147483648 134217728 N
# vrstat -M
Sun Aug 4 22:20:01 2013
Replicated Data Set prod_rvg:
Sun Aug 4 22:20:01 2013
Replicated Data Set datadg_rvg:
Memory-pool Statistics:
Host    Pool              DG  Min   Max      In       Allocated  Max      Waiting
                              Size  Size     Use                 Used
------  ----------------  --  ----  -------  -------  ---------  -------  -------
Barney  WRSHIP            -   1024  65536    0        1024       0        no
Barney  RDBCK-prodg_rvg   -   1024  1048576  660      1024       972      no
Barney  NMCOM-prodg_rvg   -   1024  262144   972      1024       1012     no
Barney  RDBCK-datadg_rvg  -   1024  2097152  1194960  1222456    1198080  no   <<<< NOT WAITING
Barney  NMCOM-datadg_rvg  -   1024  262144   262140   262140     262140   no
After increasing the RDBACK pool to 2147483648 (2048M), the issue appears to be resolved, as the RLINKs no longer go into a stalled state.
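When comparing the two commands, note that the vxtune values and the vrstat Max Size column use different units. Judging from the sample numbers, vxtune reports bytes and the Max Size column reports KB; that unit reading is an inference from this output. The hypothetical helper `bytes_to_kb` below shows the conversion.

```shell
#!/bin/sh
# Convert a vol_max_rdback_sz value in bytes to KB for comparison with
# the Max Size column (assumed here to be in KB, based on the sample numbers).
bytes_to_kb() {
    echo $(( $1 / 1024 ))
}

bytes_to_kb 1073741824   # 1048576, the Max Size shown while the pool was waiting
bytes_to_kb 2147483648   # 2097152, the Max Size shown after the increase
```

A Max Size that still shows the old figure after running vxtune would suggest the new value has not taken effect for that RVG.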
Additionally, on the Primary, the "-e" argument can be used with the vxrlink command to gather more detail about the replication throughput.
# vxrlink -g
Sample output:
# vxrlink -g datadg -i 10 -e stats rlk_adc
Fri Aug 2 22:24:25 2013
Messages :
--------
Number of blocks sent : 2834216
Compressed msgs : 25447
Compressed data(bytes) : 44041954
Uncompressed data(bytes) : 270866944
Compression Ratio : 6.15
Bandwidth Savings : 83.74%
Errors :
------
No memory available : 0
No message slots available : 0
No memory available in nmcom pool on Secondary : 0
Timeout : 11905
Missing packet : 668
Missing message : 154
Stream : 0
Checksum : 0
Unable to deliver due to transaction : 5
Messages :
--------
Number of blocks sent : 47701
Compressed msgs : 52
Compressed data(bytes) : 343355
Uncompressed data(bytes) : 1148416
Compression Ratio : 3.34
Bandwidth Savings : 70.10%
Errors :
------
No memory available : 0
No message slots available : 0
No memory available in nmcom pool on Secondary : 0
Timeout : 143
Missing packet : 8
Missing message : 0
Stream : 0
Checksum : 0
Unable to deliver due to transaction : 0
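The Compression Ratio and Bandwidth Savings figures in the output above can be reproduced from the compressed and uncompressed byte counts. The formulas in this sketch (ratio = uncompressed / compressed, savings = 1 - compressed / uncompressed) are inferred from the sample numbers rather than from VVR documentation, and `compression_stats` is a hypothetical helper name.

```shell
#!/bin/sh
# Recompute compression ratio and bandwidth savings from byte counts.
#   $1 = compressed data (bytes)   $2 = uncompressed data (bytes)
# Formulas are inferred from the sample output, not taken from VVR itself.
compression_stats() {
    awk -v c="$1" -v u="$2" 'BEGIN {
        printf "ratio=%.2f savings=%.2f%%\n", u / c, (1 - c / u) * 100
    }'
}

compression_stats 44041954 270866944   # first interval in the sample
compression_stats 343355 1148416       # second interval in the sample
```

Both calls give the values printed by vxrlink (6.15 and 83.74%, then 3.34 and 70.10%), which supports the inferred formulas for this sample.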