RLINKs shown as ENABLED ACTIVE in Cluster Volume Replicator (CVR) environment and fail to replicate where the RVGLogowner is not running on the CVM Master


Article ID: 100048037


Description

Error Message


The following vradmin repstatus output implies there is a network connection issue between the Primary and Secondary.
 

# vradmin -g <dg-name> -l repstatus <RVG-name>

Replicated Data Set: appdg_rvg
Primary:
 Host name: to-barney-replink
 RVG name: appdg_rvg
 DG name: appdg

 RVG state: enabled for I/O
 Data volumes: 1
 VSets: 0
 SRL name: vxvol_srl_fred
 SRL size: 29.97 G

 Total secondaries: 1

Secondary:

 Host name: to-fred-replink
 RVG name: appdg_rvg
 DG name: appdg
 Data status: consistent, behind
 Replication status: paused due to network disconnection
 Current mode: asynchronous
 Logging to: SRL ( 194816 Kbytes behind, 0 % full )
 Timestamp Information: behind by 14h 42m 24s


The output for another RVG may suggest that vradmind is not running:

# vradmin -g <dg-name> -l repstatus <RVG-name>

Replicated Data Set: datadg_rvg
Primary:
 Host name: to-barney-replink
 RVG name: datadg_rvg
 DG name: datagd
 RVG state: enabled for I/O
 Data volumes: 1
 VSets: 0
 SRL name: vxvol_arc_fred
 SRL size: 34.00 G
 Total secondaries: 1
 Config Errors:
to-barney-replink: Pri or Sec IP not available or vradmind not running
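
When several RVGs need to be checked, the two symptoms shown above can be picked out of the vradmin repstatus output mechanically. The following is a minimal sketch rather than a supported tool: the helper names repl_status and has_config_errors are hypothetical, and the parsing assumes the field layout shown in the sample output above.

```shell
# Hypothetical helpers (not part of VVR). Both read repstatus output on stdin
# and assume the field layout shown in the sample output above.

# Print the value of each "Replication status:" line.
repl_status() {
    sed -n 's/^[[:space:]]*Replication status:[[:space:]]*//p'
}

# Succeed (exit 0) if the output contains a "Config Errors:" section.
has_config_errors() {
    grep -q 'Config Errors:'
}

# Example on a live Primary node (placeholders left unfilled on purpose):
#   vradmin -g <dg-name> -l repstatus <RVG-name> | repl_status
```

Because the helpers only read text, they can be tried against saved repstatus output before being used on a production node.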

 

Cause


The CVM master node needs to assume the logowner role for VCS managed VVR resources.

The RVGLogowner service group is a failover group, which manages the VVR logowner role.

If the RVGLogowner resource is not online on the CVM Master node, the CVM slave node can incorrectly assume the logowner role, which leads to unexpected replication issues.

This results in the RLINK being unable to form a connection between the Primary and Secondary RVGs, preventing replication from happening as designed.

 

Resolution


To ensure replication can occur correctly between the Primary and Secondary RVGs, the RVGLogownerGrp must be onlined on the CVM Master at the Primary RVG site.
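
One way to confirm this requirement is to compare the CVM master node with the node where the RVGLogowner service group is online. The sketch below is illustrative only. The helper names are hypothetical. It assumes that vxdctl -c mode prints a line of the form "master: <node>" and that hagrp -state <group> prints rows of the form "<group> State <system> |ONLINE|". It prints a suggested command for review and does not run it.

```shell
# Hypothetical helpers (not VCS or VxVM commands). They only parse text, so
# the assumed output layouts are easy to adjust.

# Read "vxdctl -c mode" output on stdin and print the node named on the
# "master:" line.
cvm_master() {
    awk '$1 == "master:" { print $2; exit }'
}

# Read "hagrp -state <group>" output on stdin and print each system where
# the named group is ONLINE.
group_online_on() {
    awk -v g="$1" '$1 == g && $2 == "State" && index($4, "|ONLINE|") == 1 { print $3 }'
}

# Print (do not run) a command that would place the group on the master.
# Prints nothing when the group is already online on the master.
suggest_switch() {
    group=$1 master=$2 current=$3
    if [ -z "$current" ]; then
        echo "hagrp -online $group -sys $master"
    elif [ "$master" != "$current" ]; then
        echo "hagrp -switch $group -to $master"
    fi
}

# Example on a live Primary cluster node:
#   m=$(vxdctl -c mode | cvm_master)
#   c=$(hagrp -state RVGLogownerGrp | group_online_on RVGLogownerGrp)
#   suggest_switch RVGLogownerGrp "$m" "$c"
```

Printing the command instead of executing it keeps the decision to move the logowner with the administrator.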

The RVGLogowner resource represents the logowner for RVG in the cluster. The RVGLogowner resource is dependent on the IP resource that it uses for replication.





When configuring parallel applications, use the RVGShared and RVGLogowner agents.

If a disk group is configured as a CVMVolDG resource, then all the RVGs must be configured as RVGShared resources.

NOTE: Do not configure the RVGShared resource in the cvm service group.

Configure the RVGShared resource in a separate group which contains the RVGShared resource and the CVMVolDg resource.

The service group RVGShared manages the RVG used by the application.
 


Sample main.cf
 

group RVGLogownerGrp (
        SystemList = { fred = 0, barney = 1 }
        AutoStartList = { fred, barney }
        )

        IP vvr_ip (
                Device = bond2
                Address = "10.216.144.199"
                )

        NIC vvr_nic (
                Device = bond2
                )

        RVGLogowner vvr_rvglogowner (
                RVG = app_rvg
                DiskGroup = appdg
                )

        requires group RVGSharedGrp online local firm
        vvr_ip requires vvr_nic
        vvr_rvglogowner requires vvr_ip


        // resource dependency tree
        //
        //      group RVGLogownerGrp
        //      {
        //      RVGLogowner vvr_rvglogowner
        //          {
        //          IP vvr_ip
        //              {
        //              NIC vvr_nic
        //              }
        //          }
        //      }


group RVGSharedGrp (
        SystemList = { fred = 0, barney = 1 }
        Parallel = 1
        AutoStartList = { fred, barney }
        )

        CVMVolDg shareddg (
                CVMDiskGroup = appdg
                CVMActivation = sw
                )

        RVGShared vvr_sharedrvg (
                RVG = app_rvg
                DiskGroup = appdg
                )

        requires group cvm online local firm
        vvr_sharedrvg requires shareddg


        // resource dependency tree
        //
        //      group RVGSharedGrp
        //      {
        //      RVGShared vvr_sharedrvg
        //          {
        //          CVMVolDg shareddg
        //          }
        //      }


group cvm (
        SystemList = { fred = 0, barney = 1 }
        AutoFailOver = 0
        Parallel = 1
        AutoStartList = { fred, barney }
        )

        CVMCluster cvm_clus (
                CVMClustName = vcs_clust2
                CVMNodeId = { fred = 0, barney = 1 }
                CVMTransport = gab
                CVMTimeout = 200
                )

        CVMVxconfigd cvm_vxconfigd (
                Critical = 0
                CVMVxconfigdArgs = { syslog }
                )

        cvm_clus requires cvm_vxconfigd


        // resource dependency tree
        //
        //      group cvm
        //      {
        //      CVMCluster cvm_clus
        //          {
        //          CVMVxconfigd cvm_vxconfigd
        //          }
        //      }


VCS Preonline Trigger


To ensure the RVGLogowner is onlined on the CVM Master node, a VCS Preonline trigger can be implemented.

These procedures ensure that the CVM master node always assumes the logowner role. Not performing these procedures can result in unexpected issues that are due to a CVM slave node that assumes the logowner role.

For a service group that has an RVGLogowner resource, change the value of its TriggersEnabled attribute to PREONLINE to enable it.
 

1. To enable the preonline_vvr trigger, do one of the following:

  • Check the /opt/VRTSvcs/bin/triggers/preonline directory contents for any existing preonline trigger scripts before copying.

    If a preonline trigger script is not already present, copy the preonline_vvr trigger script from the sample triggers directory into the triggers directory:

    # cp /opt/VRTSvcs/bin/sample_triggers/VRTSvcs/preonline_vvr /opt/VRTSvcs/bin/triggers/preonline

    Change the file permissions to make it executable.

  • If a preonline trigger script is already present, create a directory such as /preonline and move the existing preonline trigger to that directory as T0preonline.

    Copy the preonline_vvr trigger to the same directory as T1preonline.

  • If you already use multiple triggers, copy the preonline_vvr trigger as TNpreonline, where TN is the next higher TNumber.

  Remember to copy the preonline trigger into place on each of the nodes in the clusters.
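
The copy step in the first option can be wrapped so that an existing trigger is never overwritten by accident. This is a minimal sketch under stated assumptions: install_trigger is a hypothetical helper name, and mode 755 is an assumption, since the procedure above only says that the file must be executable.

```shell
# Hypothetical helper: copy SRC to DEST unless DEST already exists, then
# mark the copy executable. Returns 1 and copies nothing if DEST exists.
install_trigger() {
    src=$1 dest=$2
    if [ -e "$dest" ]; then
        echo "exists: $dest (keep it and use the T0/T1 naming instead)" >&2
        return 1
    fi
    # Mode 755 is an assumption; adjust to local policy.
    cp "$src" "$dest" && chmod 755 "$dest"
}

# Example, to be run on each node:
#   install_trigger /opt/VRTSvcs/bin/sample_triggers/VRTSvcs/preonline_vvr \
#                   /opt/VRTSvcs/bin/triggers/preonline
```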


2. To enable the TriggersEnabled attribute from the command line on a service group that has an RVGLogowner resource:

  • From any node in the cluster which hosts the Primary RVG, run the following commands:

          # haconf -makerw

          # hagrp -modify RVGLogownerGrp TriggersEnabled PREONLINE

          # haconf -dump -makero


Where RVGLogownerGrp is the service group that contains the RVGLogowner resource.
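
After the change, the attribute value can be read back with hagrp -value and checked. The helper below is a hypothetical sketch, and it assumes the attribute value is a whitespace-separated list of trigger names.

```shell
# Hypothetical helper: succeed if the given TriggersEnabled value lists
# PREONLINE. Assumes a whitespace-separated list of trigger names.
has_preonline() {
    for t in $1; do
        [ "$t" = "PREONLINE" ] && return 0
    done
    return 1
}

# Example on a cluster node:
#   has_preonline "$(hagrp -value RVGLogownerGrp TriggersEnabled)" && echo enabled
```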

 

Verify the syntax of the file /etc/VRTSvcs/conf/config/main.cf:

# hacf -verify /etc/VRTSvcs/conf/config



Repeat the above steps on all the nodes forming the Secondary RVG cluster.



Restarting VVR daemons
 

Three VVR daemons are started during system startup: vradmind, vxrsyncd and vxnetd.

vradmind – The VVR configuration daemon (think of it as vxconfigd for VVR).
                    A user-land process that manages the replication configuration and replication control.
                    # ps -ef | grep vradmind

vxnetd       – Provides communication between the RLINKs of the VVR Primary and Secondary.
                    Responsible for sending heartbeats and data.
                    Runs as a kernel thread.
                    # netstat -na | grep 4145

vxrsyncd   – Used during difference-based and full synchronization operations.
                    A user-land process.
                    # ps -ef | grep vxrsyncd
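
The two user-land daemons can be checked in one pass instead of with separate grep commands. This is a sketch only: missing_daemons is a hypothetical helper, and it assumes the process list supplies one command name or path per line, as ps -e -o comm= does on many platforms. vxnetd is a kernel thread, so it is not covered here and is still checked through the port with netstat.

```shell
# Hypothetical helper: read one process name (or path) per line on stdin and
# print each of the user-land VVR daemons that is NOT in the list.
missing_daemons() {
    # Reduce any full path to its last component before matching.
    procs=$(awk -F/ '{ print $NF }')
    for d in vradmind vxrsyncd; do
        printf '%s\n' "$procs" | grep -qx "$d" || echo "$d"
    done
}

# Example on a cluster node:
#   ps -e -o comm= | missing_daemons
```

Empty output means both daemons were found in the process list.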

 

These daemons need to be restarted on all clustered nodes following the above changes.
 

# /usr/sbin/vxstart_vvr stop

# /usr/sbin/vxstart_vvr start


 

Replication should now start working. If it does not, try the following two commands:

# vradmin -g <dg-name> pauserep <RVG-name>

# vradmin -g <dg-name> resumerep <RVG-name>


If replication still fails to function as expected, contact Veritas Technical Support for further guidance.

 

 

Issue/Introduction


Veritas Volume Replicator (VVR) is unable to replicate data between the Primary (source) and Secondary (target) Replicated Volume Groups (RVGs).

A Replication Link (RLINK) is associated with an RVG and establishes a replication connection between the Primary and Secondary RVGs.

In this instance, the RLINK is unable to transition into the CONNECT ACTIVE state required for replication to occur. Instead, the RLINK remains in the ENABLED ACTIVE state, so data is not being replicated between the Primary and Secondary RVGs. The environment consists of VxVM shared disk groups forming a Cluster Volume Replicator (CVR) environment.
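
RLINKs that are stuck outside the CONNECT ACTIVE state can be listed from vxprint output. The sketch below is illustrative and rests on an assumption about the record layout: it expects RLINK rows in the default vxprint format, "rl <name> <rvg> <KSTATE> <LENGTH> <PLOFFS> <STATE> ...", and should be adjusted if the local output differs. The helper name rlinks_not_connected is hypothetical.

```shell
# Hypothetical helper: read vxprint output on stdin and print
# "<rlink> <KSTATE> <STATE>" for every RLINK that is not CONNECT ACTIVE.
# Assumes rows of the form: rl NAME ASSOC KSTATE LENGTH PLOFFS STATE ...
rlinks_not_connected() {
    awk '$1 == "rl" && !($4 == "CONNECT" && $7 == "ACTIVE") { print $2, $4, $7 }'
}

# Example on the logowner node (placeholder left unfilled on purpose):
#   vxprint -g <dg-name> -P | rlinks_not_connected
```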

