Rlink remains as "inconsistent cant_sync" after the SRL is completely drained and replication is started with a checkpoint


Article ID: 100021371


Description

Error Message

vxprint -lP
flags: write enabled attached inconsistent cant_sync connected asynchronous

 

Resolution

Because of Etrack 1728587, an rlink may remain flagged "inconsistent cant_sync" after the SRL is completely drained when replication is started with a checkpoint. The SRL is completely drained when the rlink becomes up to date. The problem may occur if the rlink is disconnected before the SRL is completely drained. The rlink can be disconnected by a network outage, by running "vxrlink det" to detach the secondary rlink, or by running "ifconfig down" to bring down the network interface.

The following is an example of the problem.

Rlink is started with a checkpoint.

# vradmin -g "diskgroup" -c "checkpoint" startrep "rvg"

After the SRL is completely drained, the rlink status becomes up-to-date.

# vxrlink -g "diskgroup" status "rlink_name"
VxVM VVR vxrlink INFO V-5-1-4467 Rlink rlk_name is up to date

The rlink is still marked with the "inconsistent cant_sync" flags.

# vxprint -g vvr1dg -lP rlk_alaw2bge3_rvg1
Rlink:    rlk_alaw2bge3_rvg1
info:    timeout=500 packet_size=8400 rid=0.1877
       latency_high_mark=10000 latency_low_mark=9950
       bandwidth_limit=none
state:    state=ACTIVE
       synchronous=off latencyprot=off srlprot=autodcm
assoc:    rvg=rvg1
       remote_host=alaw2bge3 IP_addr=192.168.33.2 port=4145
       remote_dg=vvr2dg
       remote_dg_dgid=1166620333.99.alaw2
       remote_rvg_version=30
       remote_rlink=rlk_alaw1bge3_rvg1
       remote_rlink_rid=0.2017
       local_host=alaw1bge3 IP_addr=192.168.33.1 port=4145
protocol: UDP/IP
checkpoint:ckptA
flags: write enabled attached inconsistent cant_sync connected asynchronous     <<< inconsistent cant_sync
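
When several rlinks have to be checked, reading the flags line by eye is error-prone. The following is a minimal sketch of a filter that reads "vxprint -lP" output on standard input and prints the consistency state found on each flags line. The function name rlink_flag_state is hypothetical and is not part of VxVM; the sketch assumes the flags line has the layout shown in the output above.

```shell
# Hypothetical helper (not shipped with VxVM): summarize the consistency
# flags found in `vxprint -lP` output supplied on standard input.
rlink_flag_state() {
    awk '
        /^flags:/ {
            state = "unknown"
            cant = ""
            # Compare whole tokens so that "inconsistent" is never
            # mistaken for "consistent".
            for (i = 2; i <= NF; i++) {
                if ($i == "inconsistent") state = "inconsistent"
                else if ($i == "consistent") state = "consistent"
                else if ($i == "cant_sync") cant = " cant_sync"
            }
            print state cant
        }
    '
}
```

For example, "vxprint -g vvr1dg -lP rlk_alaw2bge3_rvg1 | rlink_flag_state" would print "inconsistent cant_sync" for the rlink shown above. If the input contains no flags line, the filter prints nothing.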



The following procedure can be used to fix the rlink flags.

1. Make sure that the data volumes are actually synchronized between the VVR Primary and the VVR Secondary. The rlink may be inconsistent for a reason other than the Etrack incident described above, so verify that the data volumes are synchronized before continuing with the rest of this procedure. To check whether the data volumes in the RVG are synchronized, use the vradmin verifydata option, which creates Space Optimized Snapshots of the data volumes for the verification. Refer to the VVR Administrator's Guide for details on how to use the vradmin verifydata option.

2. Create a checkpoint

# vxrvg -g vvr1dg -c ckptA checkstart rvg1

Initially, the amount of data in the checkpoint is smaller than that in the SRL.

# vxrvg -g vvr1dg cplist rvg1
Name     MBytes   %Log     Started/Completed
----     ------   ------   -----------------
ckptA    0        0        Started              <<< checkpoint has 0MB

VxVM VVR vxrlink INFO V-5-1-4640 Rlink rlk_alaw2bge3_rvg1 has 10 outstanding writes, occupying 15 Kbytes (0%) on the SRL    <<< SRL has 15KB

Wait until the checkpoint contains more data than the SRL.

alaw1# vxrvg -g vvr1dg cplist rvg1
Name     MBytes   %Log     Started/Completed
----     ------   ------   -----------------
ckptA    2        1        Started              <<< checkpoint has 2MB

VxVM VVR vxrlink INFO V-5-1-4640 Rlink rlk_alaw2bge3_rvg1 has 6 outstanding writes, occupying 16 Kbytes (0%) on the SRL   <<< SRL has 16KB
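
The size comparison above mixes units: the checkpoint listing reports megabytes, while the V-5-1-4640 message reports kilobytes. The following is a minimal sketch of a filter that makes the comparison explicit. The function name ckpt_vs_srl is hypothetical and is not part of VxVM; the sketch assumes that the checkpoint listing and the V-5-1-4640 message are both supplied on standard input in the layouts shown above, and that 1 MB is 1024 KB.

```shell
# Hypothetical helper (not shipped with VxVM): compare the size of a
# checkpoint with the amount of data outstanding on the SRL.
# Usage: ckpt_vs_srl CHECKPOINT_NAME < captured_output
ckpt_vs_srl() {
    awk -v name="$1" '
        # Row of the checkpoint listing: name, MBytes, %Log, state.
        $1 == name { ckpt_kb = $2 * 1024; have_ckpt = 1 }
        # V-5-1-4640 message: "... occupying N Kbytes ...".
        /occupying [0-9]+ Kbytes/ {
            for (i = 1; i < NF; i++)
                if ($i == "occupying") { srl_kb = $(i + 1) + 0; have_srl = 1 }
        }
        END {
            if (!have_ckpt || !have_srl) { print "incomplete"; exit }
            result = (ckpt_kb > srl_kb) ? "ready" : "wait"
            print result
        }
    '
}
```

With the first set of output shown in this article (0 MB in the checkpoint, 15 KB on the SRL) the filter prints "wait"; with the second set (2 MB against 16 KB) it prints "ready". Because the listing reports whole megabytes, the comparison is coarse and is meant only as a guide for when to end the checkpoint.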

3. When the checkpoint contains more data than the SRL, run a checkend.

# vxrvg -g vvr1dg checkend rvg1

# vxrvg -g vvr1dg cplist rvg1
Name     MBytes   %Log     Started/Completed
----     ------   ------   -----------------
ckptA    4        2        Completed            <<< checkpoint has 4MB

VxVM VVR vxrlink INFO V-5-1-4640 Rlink rlk_alaw2bge3_rvg1 has 42 outstanding writes, occupying 2277 Kbytes (1%) on the SRL   <<< SRL has 2MB

4. Use the checkpoint to restart replication so that the rlink can reach a consistent state.

# vradmin -g vvr1dg -f stoprep rvg1

# vradmin -g vvr1dg -c ckptA startrep rvg1


5. After the SRL is drained, the rlink will become consistent provided that there is no further interruption to the replication.

# vxprint -g vvr1dg -lP rlk_alaw2bge3_rvg1
Rlink:    rlk_alaw2bge3_rvg1
info:    timeout=500 packet_size=8400 rid=0.1877
       latency_high_mark=10000 latency_low_mark=9950
       bandwidth_limit=none
state:    state=ACTIVE
       synchronous=off latencyprot=off srlprot=autodcm
assoc:    rvg=rvg1
       remote_host=alaw2bge3 IP_addr=192.168.33.2 port=4145
       remote_dg=vvr2dg
       remote_dg_dgid=1166620333.99.alaw2
       remote_rvg_version=30
       remote_rlink=rlk_alaw1bge3_rvg1
       remote_rlink_rid=0.2017
       local_host=alaw1bge3 IP_addr=192.168.33.1 port=4145
protocol: UDP/IP
checkpoint:ckptA
flags:    write enabled attached consistent connected asynchronous
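
The checkpoint and restart commands used in the procedure above can be collected into a reviewable list for another disk group, RVG, and checkpoint name. The following is a minimal sketch that only prints the commands and does not run them, so they can be checked and executed one at a time with the waits described above. The function name print_recovery_commands is hypothetical; the command forms are the ones used in this article.

```shell
# Hypothetical helper: print (do not execute) the command sequence from
# this article for a given disk group, RVG, and checkpoint name.
# Usage: print_recovery_commands DISKGROUP RVG CHECKPOINT
print_recovery_commands() {
    dg=$1 rvg=$2 ckpt=$3
    # Start the checkpoint, then list it while waiting for it to grow.
    printf 'vxrvg -g %s -c %s checkstart %s\n' "$dg" "$ckpt" "$rvg"
    printf 'vxrvg -g %s cplist %s\n' "$dg" "$rvg"
    # End the checkpoint once it contains more data than the SRL.
    printf 'vxrvg -g %s checkend %s\n' "$dg" "$rvg"
    # Restart replication from the completed checkpoint.
    printf 'vradmin -g %s -f stoprep %s\n' "$dg" "$rvg"
    printf 'vradmin -g %s -c %s startrep %s\n' "$dg" "$ckpt" "$rvg"
}
```

Running "print_recovery_commands vvr1dg rvg1 ckptA" prints the same five commands that appear in the example above. Verifying the data volumes first (step 1) is still required and is not covered by this sketch.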


The fix is available in 5.0MP3RP2 HF5. Please contact Veritas Support if you need the hotfix. The Etrack incident will also be fixed in a future official VxVM patch release.

 
 

 

Issue/Introduction

Rlink remains as "inconsistent cant_sync" after SRL is completely drained (rlink becomes up to date) when replication is started with a checkpoint