The vxstat option "-f r" can be used to monitor snapshot resynchronization performance. When the reported "RESYNC WRITES Avg(ms)" value is high, for example over 3000 ms, resynchronization performance drops significantly.
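As a minimal sketch of how that threshold could be applied in a monitoring script: the helper below classifies a latency value that has already been read from the "RESYNC WRITES Avg(ms)" column. The function name and the variable are hypothetical, the 3000 ms limit is only the example figure quoted above, and extracting the value from the vxstat output is left out because the column layout is not described here.

```shell
#!/bin/sh
# Hypothetical helper, not part of VxVM.
# Example limit taken from the text above; adjust for your environment.
RESYNC_LATENCY_LIMIT_MS=3000

resync_latency_status() {
    # $1: "RESYNC WRITES Avg(ms)" value, with or without a decimal part
    avg_ms=${1%%.*}          # drop the fractional part for integer comparison
    avg_ms=${avg_ms:-0}      # treat values such as ".5" as 0
    if [ "$avg_ms" -gt "$RESYNC_LATENCY_LIMIT_MS" ]; then
        echo "high"
    else
        echo "ok"
    fi
}

resync_latency_status 3520.4   # prints "high"
resync_latency_status 85.2     # prints "ok"
```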
High resync latency is expected behaviour when there is heavy I/O contention on the system. The following is a general description of the snapshot synchronization process in DCO version 20.
Synchronization of a Data Change Object (DCO) region involves the following operations:
a. Read the state of the region in the DCO page of the snapshot volume (may need I/O, depending on the VxVM Kernel Paging Module (volpagemod) memory size)
b. Read the state of the region in the DCO page of the primary volume (may need I/O, depending on the VxVM Kernel Paging Module (volpagemod) memory size)
c. Read the region data from the primary volume (needs I/O)
d. Write the region data to the snapshot (needs I/O). The sync task reads the data from the primary volume and then attempts to write it to the snapshot. At this stage it checks the map in the DCO to verify whether a pushed write (an application write operation) has already been done on the snapshot; if not, it writes the data atomically.
e. Update the state of the region in the DCO of the snapshot volume (needs I/O)
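The check in step d can be illustrated with a toy model. This is only a sketch under stated assumptions, not VxVM's implementation: the DCO map is represented as a string with one character per region (0 = not yet synchronized, 1 = already written by a pushed write), and the function name sync_regions is hypothetical.

```shell
#!/bin/sh
# Toy model only: walks a per-region map and reports what a sync task
# would do for each region, following steps a to e above.
sync_regions() {
    map=$1                          # e.g. "0100", one character per region
    region=0
    while [ -n "$map" ]; do
        rest=${map#?}               # map without its first character
        state=${map%"$rest"}        # first character = state of this region
        if [ "$state" = "1" ]; then
            # a pushed (application) write already placed current data
            echo "region $region: skip (pushed write already on snapshot)"
        else
            echo "region $region: read primary, write snapshot, update DCO"
        fi
        map=$rest
        region=$((region + 1))
    done
}

sync_regions 0100
```

In this model only region 1 is skipped; the other three regions each cost a read from the primary, a write to the snapshot and a DCO update.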
Writes to a volume generally contend for some VxVM kernel shared locks, and writes to the same or overlapping regions contend for resources.
If application write operations are concentrated on the area where snapshot synchronization is currently being performed, slow snapshot performance (high RESYNC WRITES Avg(ms)) can be observed.
Note that because writes are generally distributed randomly, this contention should not normally occur very frequently.
Tuning Proposals
- Increase the I/O size during snapshot synchronization
The Veritas VxVM engineering team also analyzed the effect of the synchronization I/O size on synchronization performance. The test results show that increasing the I/O size substantially improved performance. By default the synchronization I/O size is 1MB. As a first step, increase the synchronization I/O size to 4MB and monitor the synchronization performance. The I/O size can then be increased in steps of 4MB, up to 16MB, while monitoring the synchronization performance, in order to find the optimal I/O size for your environment.
The synchronization I/O size can be specified by using the vxsnap command option "-o iosize=value". For example:
# vxsnap -g diskgroup_name -o iosize=4m refresh snapshot_volume_name sync=on
or
# vxsnap -g diskgroup_name refresh snapshot_volume_name sync=off
# vxsnap -g diskgroup_name -o iosize=4m syncstart snapshot_volume_name
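The stepwise increase described above can be sketched as a dry run that only prints the candidate commands for review. The function name print_iosize_trials is hypothetical; the vxsnap syntax is the one shown in the examples above. Each command would be run separately, with the synchronization performance checked between runs.

```shell
#!/bin/sh
# Dry run only: prints one vxsnap command per candidate I/O size
# (4MB to 16MB in steps of 4MB). Nothing is executed against VxVM.
print_iosize_trials() {
    dg=$1          # disk group name
    snapvol=$2     # snapshot volume name
    size=4
    while [ "$size" -le 16 ]; do
        echo "vxsnap -g $dg -o iosize=${size}m refresh $snapvol sync=on"
        size=$((size + 4))
    done
}

print_iosize_trials diskgroup_name snapshot_volume_name
```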
- Adopt appropriate snapshot type according to service scenarios
If the snapshot will be used only after full synchronization, a better option is to create it using break-off snapshot operations. A break-off snapshot adds snapshot mirror(s) to the volume and does not have the contention issues that an instant snapshot has.
For break-off type snapshots, upgrading to Veritas Storage Foundation 6.0 is suggested in order to take advantage of DCO version 30. Sample output for DCO version 30:
# vxprint -g diskgroup_name -m dco_name
dco dco_name
tutil0="
tutil1="
tutil2="
parent_vol=volisnap
log_vol=volisnap_dcl
comment="DCO for volisnap
rid=0.1095
putil0="
putil1="
putil2="
p_flag_move=off
badlog=off
parent_vol_rid=0.1055
log_vol_rid=0.1087
sp_num=1
version=30 <<<< DCO version 30
dcoregionsz=128
drlregionsz=128
drlmapsz=2048
drl=on
sequentialdrl=off
drllogging=on
snap=volfmr3x_snp
With the new DRL design and asynchronous writes introduced in DCO version 30, greater throughput is expected, especially for random writes on large volumes. According to internal performance tests, some I/O scenarios improve by over 30% compared to the previous version.
Note that in DCO version 30 the size of the VxVM Kernel Paging Module (volpagemod) memory is not significant, due to the new design of the DCO.
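To confirm which DCO version a volume is using, the version= field can be pulled out of the vxprint -m output shown above. The following is a minimal sketch; the function name dco_version is hypothetical, and it assumes the first version= line in the record is the DCO version, as in the sample output.

```shell
#!/bin/sh
# Hypothetical helper: reads "vxprint -m" output on stdin and prints the
# numeric value of the first version= field it finds.
dco_version() {
    sed -n 's/^[[:space:]]*version=\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Intended use on a VxVM host:
#   vxprint -g diskgroup_name -m dco_name | dco_version
# Demonstration with a fragment of the sample output:
printf 'sp_num=1\nversion=30\ndcoregionsz=128\n' | dco_version
```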
Additional Notes
When a snapshot performance issue occurs with DCO version 20, the first step is to check that the VxVM Kernel Paging Module (volpagemod) memory size is large enough. If it is not, increase it so that the VxVM kernel module has enough memory to work with and can avoid accessing the on-disk DCO data. The volpagemod memory size is controlled by the VxVM tunable parameter volpagemod_max_memsz.
Please refer to the SymWISE article 000029655 on how to calculate and tune the required memory size.
Applies To
Veritas Volume Manager using DCO version 20 on all platforms