How to validate and confirm the memory leak in the vxconfigd process on Solaris has been addressed
This procedure uses the "libumem" feature of the Solaris Operating System. The feature was first made available in Solaris 9 Update 3, so this approach cannot be used on a server running an earlier Solaris release.
libumem is a library, developed by Sun, that tracks the memory allocations of userland processes and can therefore help to identify memory leaks.
The process requires that the VxVM daemon vxconfigd be stopped and restarted in order for the libumem feature to be used.
Ensure that any applications that make use of vxconfigd will not be disrupted while the procedure below is carried out.
For example, it is recommended to freeze all related "Service Groups" if Veritas Cluster Server (VCS) is in use.
Check that vxconfigd is not carrying out any tasks (with vxtask list) prior to implementing this procedure.
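The idle check above can be scripted. The following is a minimal sketch, not part of the original procedure. It assumes that "vxtask list" prints a header row beginning with TASKID followed by one row per running task; the exact layout may differ between VxVM releases. The function names count_vxtasks and vxconfigd_is_idle are hypothetical.

```shell
# count_vxtasks: read "vxtask list" output on stdin and print the number
# of task rows. Assumption: a header row, if present, starts with TASKID;
# blank lines are ignored.
count_vxtasks() {
    awk 'NF > 0 && $1 != "TASKID" { n++ } END { print n + 0 }'
}

# vxconfigd_is_idle: succeed (exit status 0) only when no task rows are found.
vxconfigd_is_idle() {
    [ "`count_vxtasks`" -eq 0 ]
}

# Possible use on a live system (not executed here):
#   vxtask list | vxconfigd_is_idle && echo "no VxVM tasks running"
```

If the check fails, wait for the listed tasks to complete before stopping vxconfigd.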
1.] Stop the VxVM daemon "vxconfigd".
# vxdctl stop
# vxdctl mode
mode: not-running
2.] Make a copy of the current vxconfigd binary, and rename it as shown below.
# cd /sbin
# ls -al vxconfigd*
-r-xr-xr-x 1 root sys 5594248 Apr 5 20:05 vxconfigd
-r-xr-xr-x 1 root sys 5808760 Jan 14 06:10 vxconfigd.orig
-r-xr-xr-x 1 root sys 5594248 Apr 5 20:05 vxconfigd.SunOS_5.10
-r-xr-xr-x 1 root sys 7246072 Apr 5 20:28 vxconfigd.SunOS_5.9
# cp vxconfigd vxconfigd.version # please replace "version" with the current VxVM version number
# mv vxconfigd vxconfigd.orig
3.] Create the libumem wrapper script.
# vi vxconfigd
- Add the following lines to the file:
#!/bin/sh
UMEM_DEBUG=default UMEM_LOGGING=transaction LD_PRELOAD=libumem.so.1 /sbin/vxconfigd.orig "$@"
- Write and quit the file.
# cat vxconfigd
#!/bin/sh
UMEM_DEBUG=default UMEM_LOGGING=transaction LD_PRELOAD=libumem.so.1 /sbin/vxconfigd.orig "$@"
# chmod 555 vxconfigd
# chown root:sys vxconfigd
4.] Restart the VxVM daemon using the vxconfigd wrapper script, which calls the renamed vxconfigd.orig binary.
# vxdctl stop
# vxdctl mode
mode: not-running
# /sbin/vxconfigd -x syslog >/dev/null 2>&1
# vxdctl mode
mode: enabled
5.] Confirm that the new vxconfigd process is running from the renamed vxconfigd.orig binary.
# ps -ef | grep vxconfigd
root 7601 5489 0 15:03:17 pts/1 0:00 grep vxconfigd
root 7305 1 0 14:58:51 ? 0:04 /sbin/vxconfigd.orig
# pgrep vxconfigd
7305
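In addition to checking the process name, it can be useful to confirm that libumem was actually preloaded into the running daemon. The sketch below is an optional extra check and not part of the original procedure. It assumes the Solaris "pldd" command is used to list the shared objects loaded by the process; the function name has_libumem is hypothetical.

```shell
# has_libumem: read "pldd <pid>" output on stdin and succeed if any loaded
# object path contains libumem.so. Output is discarded; only the exit
# status matters.
has_libumem() {
    grep 'libumem\.so' >/dev/null
}

# Possible use on a live system (not executed here):
#   pldd `pgrep vxconfigd` | has_libumem && echo "libumem is preloaded"
```

If libumem is not listed, the daemon was most likely started from the original binary rather than through the wrapper script.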
6.] Reproduce the problem and confirm that the memory leak has occurred (the larger the leak, the better).
The Solaris command "pmap" can be used to monitor the memory usage of a given process, as shown below:
# pmap -x `pgrep vxconfigd`
Example: capture a core image of the running vxconfigd process (PID 7305 here) with gcore:
# gcore 7305
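A core taken from a process running with UMEM_DEBUG enabled can be examined with the Solaris modular debugger "mdb", whose "::findleaks" dcmd reports allocations that are no longer referenced. The sketch below is an assumption about how the core might be analysed; the original text does not describe this step. The function name findleaks_report is hypothetical, and the core file name assumes the gcore default of core.<pid> in the current directory.

```shell
# findleaks_report: run the libumem "::findleaks" dcmd against a core file
# and print the resulting report. $1 is the path to the core file.
findleaks_report() {
    if [ ! -f "$1" ]; then
        echo "core file not found: $1" >&2
        return 1
    fi
    echo "::findleaks" | mdb "$1"
}

# Possible use on a live system (not executed here):
#   findleaks_report core.7305 > /var/tmp/vxconfigd.findleaks.txt
```

Comparing the reports from cores taken before and after the fix is applied gives a more direct confirmation than process size alone.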
What should not happen
Create a wrapper script to help trigger the potential memory leak.
# more loop.sh
while true
do
    vxdmpadm exclude vxvm dmpnodename=emcpower26s2
    date
    pmap -x `pgrep vxconfigd` | egrep '(heap|total)'
    vxdmpadm include vxvm dmpnodename=emcpower26s2
done
In this instance, the wrapper script "loop.sh" excludes and then includes the same EMC TPD device, "emcpower26s2", in a loop.
The date, time and related pmap output are captured to see whether the memory usage is increasing.
# ./loop.sh
Wednesday, 6 April 2011 15:07:08 BST
003DC000 15280 15280 15272 - rwx-- [ heap ]
012C8000 72 72 72 - rwx-- [ heap ]
012DA000 4736 3432 3432 - rwx-- [ heap ]
0177A000 72 72 72 - rwx-- [ heap ]
0178C000 72 72 72 - rwx-- [ heap ]
0179E000 144 144 144 - rwx-- [ heap ]
017C2000 144 144 144 - rwx-- [ heap ]
017E6000 72 72 72 - rwx-- [ heap ]
total Kb 30432 28880 21168 -
Wednesday, 6 April 2011 15:07:22 BST
003DC000 15280 15280 15272 - rwx-- [ heap ]
012C8000 72 72 72 - rwx-- [ heap ]
012DA000 4736 3432 3432 - rwx-- [ heap ]
0177A000 72 72 72 - rwx-- [ heap ]
0178C000 72 72 72 - rwx-- [ heap ]
0179E000 144 144 144 - rwx-- [ heap ]
017C2000 144 144 144 - rwx-- [ heap ]
017E6000 2880 1752 1752 - rwx-- [ heap ]
01AB6000 144 144 144 - rwx-- [ heap ]
01ADA000 144 144 144 - rwx-- [ heap ]
01AFE000 144 136 136 - rwx-- [ heap ]
total Kb 33656 30968 23256 -
Wednesday, 6 April 2011 15:07:36 BST
pmap: cannot examine 7305: address space is changing
Wednesday, 6 April 2011 15:07:49 BST
003DC000 15280 15280 15272 - rwx-- [ heap ]
012C8000 72 72 72 - rwx-- [ heap ]
012DA000 4736 3432 3432 - rwx-- [ heap ]
0177A000 72 72 72 - rwx-- [ heap ]
0178C000 72 72 72 - rwx-- [ heap ]
0179E000 144 144 144 - rwx-- [ heap ]
017C2000 144 144 144 - rwx-- [ heap ]
017E6000 2880 1752 1752 - rwx-- [ heap ]
01AB6000 144 144 144 - rwx-- [ heap ]
01ADA000 144 144 144 - rwx-- [ heap ]
01AFE000 3048 1968 1968 - rwx-- [ heap ]
01DF8000 72 72 72 - rwx-- [ heap ]
01E0A000 72 72 72 - rwx-- [ heap ]
01E1C000 288 288 288 - rwx-- [ heap ]
01E64000 72 72 72 - rwx-- [ heap ]
01E76000 2808 1696 1696 - rwx-- [ heap ]
02134000 72 72 72 - rwx-- [ heap ]
02146000 72 72 72 - rwx-- [ heap ]
02158000 216 216 216 - rwx-- [ heap ]
0218E000 144 144 144 - rwx-- [ heap ]
total Kb 40344 35472 27760 -
Wednesday, 6 April 2011 15:08:03 BST
003DC000 15280 15280 15272 - rwx-- [ heap ]
012C8000 72 72 72 - rwx-- [ heap ]
012DA000 4736 3432 3432 - rwx-- [ heap ]
0177A000 72 72 72 - rwx-- [ heap ]
0178C000 72 72 72 - rwx-- [ heap ]
0179E000 144 144 144 - rwx-- [ heap ]
017C2000 144 144 144 - rwx-- [ heap ]
017E6000 2880 1752 1752 - rwx-- [ heap ]
01AB6000 144 144 144 - rwx-- [ heap ]
01ADA000 144 144 144 - rwx-- [ heap ]
01AFE000 3048 1968 1968 - rwx-- [ heap ]
01DF8000 72 72 72 - rwx-- [ heap ]
01E0A000 72 72 72 - rwx-- [ heap ]
01E1C000 288 288 288 - rwx-- [ heap ]
01E64000 72 72 72 - rwx-- [ heap ]
01E76000 2808 1696 1696 - rwx-- [ heap ]
02134000 72 72 72 - rwx-- [ heap ]
02146000 72 72 72 - rwx-- [ heap ]
02158000 216 216 216 - rwx-- [ heap ]
0218E000 3456 2360 2360 - rwx-- [ heap ]
total Kb 43688 37720 30008 -
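The growth shown above can be summarised by extracting the "total Kb" rows from the captured loop.sh output. The following is a minimal sketch and not part of the original procedure. The function name heap_growth is hypothetical, and the capture file name in the usage comment is only an example.

```shell
# heap_growth: read loop.sh output on stdin, pick out the pmap "total Kb"
# rows and print the first total, the last total and the difference
# (all in KB, taken from the first numeric column of each row).
heap_growth() {
    awk '$1 == "total" && $2 == "Kb" {
             if (n == 0) first = $3
             last = $3
             n++
         }
         END {
             if (n > 0) print first, last, last - first
         }'
}

# Possible use (not executed here), after saving the loop output to a file:
#   heap_growth < /var/tmp/loop.out
```

For the sample output above, the function prints "30432 43688 13256", that is, roughly 13 MB of growth in under a minute. After the fix is applied, the totals should remain broadly stable across iterations.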
S.D.R.F ( Symptom / Description / Resolution / Feature ) Content:
Incident no: 2346469 Tracking ID: 2346470
Symptom:
Executing "vxdmpadm exclude vxvm dmpnodename=
"vxdmpadm exclude vxvm dmpnodename=
Similarly, "vxdmpadm include vxvm dmpnodename=
1) While placing a device under VxVM's control, vxconfigd allocates a chunk of memory to store the VxVM metadata specific to that device.
Because of a defect, this chunk of memory is not freed when the device is removed from VxVM's control using "vxdmpadm exclude".
2) While removing a device from VxVM's control, vxconfigd allocates a chunk of memory as a scratchpad to store details of the device being excluded.
This chunk of memory is used as temporary storage during the execution of "vxdmpadm exclude".
Because of a defect, this scratchpad memory is not freed at the end of the execution of "vxdmpadm exclude".
Resolution:
The chunk of memory allocated during the "vxdmpadm include" operation to store the VxVM metadata specific to the device is now freed during the "vxdmpadm exclude" operation.
The scratchpad memory allocated during "vxdmpadm exclude" to store details of the device being excluded from VxVM's control is now freed at the end of the "vxdmpadm exclude" operation.
The product defect is resolved in 5.1 SP1 RP2 (Solaris SPARC).
The patch is available from the Veritas Services and Operations Readiness Tools (SORT) website.
https://sort.Veritas.com/patch/matrix
Applies To
For the fix to take effect, 5.1 SP1 RP2 (Solaris SPARC) must be applied.