vxdclid causes vxconfigd memory leak

Article ID: 100005255

Description

Error Message

The memory of vxconfigd grows from:

PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
10065 root      18   0  641m  15m 2724 R 82.4  0.4   0:10.22 vxconfigd
10065 root      18   0  643m  17m 2724 R 83.3  0.4   0:12.73 vxconfigd
10065 root      17   0 73016  15m 2724 S 88.6  0.4   0:15.40 vxconfigd
...............

to:
10065 root      25   0  780m 153m 2800 R 86.1  3.9  78:11.60 vxconfigd
10065 root      16   0  790m 153m 2800 S 87.3  3.9  78:14.23 vxconfigd
10065 root      18   0  780m 153m 2800 R 88.9  3.9  78:16.91 vxconfigd
..............

Cause

vxdclid processes events slowly, while the instrumented code for idle LUN probing generates a very large number of events, so events pile up in the vxdclid client structure. If event generation stops, the queued events are slowly cleared and the memory is freed. No memory is actually leaked; the slow event processing is what causes vxconfigd memory to grow. In the customer's case, events arrived continuously, which is why memory usage never decreased.
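
The backlog behaviour described above can be illustrated with a small simulation. This is only a sketch: the rates, the durations and the function name simulate_backlog are made-up values for illustration, not measurements or code from vxconfigd or vxdclid.

```python
from collections import deque


def simulate_backlog(produce_rate, consume_rate, active_seconds, total_seconds):
    """Return the queue depth after each simulated second.

    Events arrive at produce_rate per second for the first active_seconds,
    then stop. The consumer removes at most consume_rate events per second
    for the whole run. All numbers are hypothetical.
    """
    queue = deque()
    depth = []
    for second in range(total_seconds):
        if second < active_seconds:
            queue.extend(range(produce_rate))  # new events pile up
        for _ in range(min(consume_rate, len(queue))):
            queue.popleft()  # the slow consumer drains what it can
        depth.append(len(queue))
    return depth


# Producer is 10x faster than the consumer for 5 seconds, then goes quiet.
depths = simulate_backlog(produce_rate=100, consume_rate=10,
                          active_seconds=5, total_seconds=50)
print(depths[4], depths[-1])  # prints "450 0"
```

The queue peaks while events keep arriving and returns to zero once they stop, which matches the observation that the memory is held by queued events and not leaked. If the producer never stops, as at the customer site, the depth only grows.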

The dmpevents log shows that the paths are under EMC PowerPath control. The node is therefore not marked idle, but a node_idle event is still generated every second for every idle path under a TPD metanode. We should either have dmp_set_node_idle() return a status and raise the event only on success, or add a check at the beginning of dmp_check_node_idle() that returns immediately if the path belongs to a TPD metanode. Either change should fix the customer's problem.
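
The two proposed changes can be modelled as follows. This is a control-flow sketch in Python only; the real functions are part of DMP and are not shown in this article, so the Node class, the under_tpd flag, the events list and the skip_tpd_early switch are all hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    under_tpd: bool   # hypothetical flag: path sits under a TPD metanode
    idle: bool = False


def set_node_idle(node):
    """Option 1: report whether the node was actually marked idle."""
    if node.under_tpd:
        return False  # TPD-controlled paths are never marked idle
    node.idle = True
    return True


def check_node_idle(node, events, skip_tpd_early=False):
    """Run one idle-probe pass for a node and record any event raised."""
    if skip_tpd_early and node.under_tpd:
        return  # Option 2: return before doing any work for TPD paths
    if set_node_idle(node):
        events.append(("node_idle", node.name))  # raise only on success


events = []
check_node_idle(Node("emcpower0", under_tpd=True), events)
check_node_idle(Node("c1t0d0", under_tpd=False), events)
print(events)  # prints "[('node_idle', 'c1t0d0')]"
```

In this model either option alone is enough to stop node_idle events for paths under a TPD metanode, while other paths still raise the event as before.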

Resolution

This is a known cross-platform issue and will be fixed in SF5.1RP2.


Applies To

SF5.1 Solaris 10

Steps to reproduce:
1. Set up vxdclid by running /opt/VRTSsfmh/adm/dclisetup.sh.
2. Connect an EMC disk array under PowerPath control (any model will do).
3. Create a disk group.
4. Wait 7 to 8 hours; vxconfigd memory usage will have grown to an unreasonable amount.
   (Use the tool appropriate to each platform to check vxconfigd memory usage.)
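
Where memory is sampled with top, as in the output under Error Message, the growth in step 4 can be computed from two captured lines. The helper below is a minimal sketch: the function names are hypothetical, the column position assumes the header shown in this article (RES is the sixth field), and a value without a suffix is assumed to be in KiB.

```python
def res_to_mib(field):
    """Convert a top memory field such as '15m', '2g' or '73016' to MiB."""
    units = {"k": 1 / 1024, "m": 1, "g": 1024, "t": 1024 * 1024}
    suffix = field[-1].lower()
    if suffix in units:
        return float(field[:-1]) * units[suffix]
    return float(field) / 1024  # no suffix: assumed to be KiB


def res_growth_mib(top_lines):
    """Return RES growth between the first and last vxconfigd sample."""
    samples = [res_to_mib(line.split()[5])  # field 5 is RES in this layout
               for line in top_lines
               if line.split()[-1:] == ["vxconfigd"]]
    return samples[-1] - samples[0]


# First and last sample lines taken from the top output in this article.
samples = [
    "10065 root      18   0  641m  15m 2724 R 82.4  0.4   0:10.22 vxconfigd",
    "10065 root      18   0  780m 153m 2800 R 88.9  3.9  78:16.91 vxconfigd",
]
print(res_growth_mib(samples))  # prints "138.0"
```

On platforms where top is not available or prints a different layout, the field index and unit handling would need to be adjusted.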

 

Issue/Introduction

When the "vxdclid" daemon is running and an EMC array is connected under PowerPath control, vxconfigd appears to leak memory.
The problem has been reproduced in the lab, where vxconfigd memory grew from 28MB to 105MB overnight.
At the customer's site it can grow to as much as 3GB within several hours, because more EMC disks are attached there than in the lab.