vxconfigd daemon appears to be hung while waiting for powermt command to complete

Article ID: 100007265

Updated On:

Description

Error Message

To determine whether the vxconfigd daemon is waiting for a child process, check its process tree.
On Solaris systems, use the following procedure.

1. Get the process ID of vxconfigd
# ps -ef|grep vxconfigd
    root    53     1   0   Oct 20 ?         120:39 vxconfigd -x syslog -m boot

2. Check the child processes of the vxconfigd process

# ptree 53
53 vxconfigd -k -m enable -x syslog
   19079 sh -c /etc/powermt display unmanaged > /dev/null 2>&1; echo $?
     19080 /etc/powermt display unmanaged
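
The check in step 2 can also be scripted. The following is a minimal sketch, not part of the original procedure: powermt_pids is a hypothetical helper name, and the example assumes bash, awk, pgrep and ptree are available on the system. It reads a ptree listing on standard input and prints the PID of each process whose command is powermt.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read ptree output on stdin and print the PID of
# every process whose command name ends in "powermt".
# In ptree output, field 1 is the PID and field 2 is the command, so the
# intermediate "sh -c ..." wrapper line is not matched.
powermt_pids() {
    awk '$2 ~ /powermt$/ { print $1 }'
}

# Example use on a live system (assumes a single vxconfigd process):
#   ptree "$(pgrep -x vxconfigd)" | powermt_pids
```

If the helper prints a PID, vxconfigd still has a powermt child running, and that process is the one to investigate.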


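If a system crash dump was taken because vxconfigd was assumed to be hung, the same check can be made from the dump. In the example below, the vxconfigd thread is asleep in a read() on a FIFO, and the process tree shows the powermt child process it is waiting for.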

SolarisCAT(vmcore.0/10U)> thread 0x30015cf2160
==== user (LWP_SYS) thread: 0x30015cf2160  PID: 19064 ====
cmd: vxconfigd -k -m enable -x syslog
t_wchan: 0x309b36287fc  sobj: condition var (from fifofs:fifo_read+0x90) 
t_procp: 0x309b6084c60
 p_as: 0x3001e3339e0  size: 76169216  RSS: 10813440
 hat: 0x3001b160080
   cnum: CPU0:16/3999 CPU1:5/7948 CPU2:3/2526 CPU4:2/4976 CPU5:2/6092 CPU6:2/5745 CPU7:3/1956 CPU8:22/7813 CPU9:5/6217 CPU10:3/1981 CPU11:2/4798 CPU12:2/3834 CPU13:2/2998 CPU14:2/3603 CPU15:2/7443 
   cpusran: 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
 zone: global
t_stk: 0x2a102257ae0  sp: 0x2a102256f51  t_stkbase: 0x2a102252000
t_pri: 59(TS)  t_tid: 1  pctcpu: 0.000000
t_lwp: 0x309b21fef68  machpcb: 0x2a102257ae0
 mstate: LMS_SLEEP  ms_prev: LMS_SYSTEM
 ms_state_start: 16 minutes 48.4260390 seconds earlier
 ms_start: 17 minutes 6.7036773 seconds earlier
psrset: 0  last CPU: 4   
idle: 100842 ticks (16 minutes 48.42 seconds)  <==== Idle for most of the time
start: Thu Nov 17 17:29:28 2011
age: 1027 seconds (17 minutes 7 seconds)      
syscall: #3 read(, 0xffbfda30) (sysent: genunix:read32+0x0)
tstate: TS_SLEEP - awaiting an event
tflg:   T_WAKEABLE - thread is blocked, signals enabled
       T_DFLTSTK - stack is default size
tpflg:  TP_TWAIT - wait to be freed by lwp_wait
       TP_MSACCT - collect micro-state accounting information
tsched: TS_LOAD - thread is in memory
pflag:  SNOWAIT - children never become zombies
       SMSACCT - process is keeping micro-state accounting
       SMSFORK - child inherits micro-state accounting

pc:      genunix:cv_wait_sig_swap_core+0x130:   call    unix:swtch

genunix:cv_wait_sig_swap_core+0x130(, , 0x0)
genunix:cv_wait_sig_swap(0x309b36287fc, 0x309b36287a0, 0x1, 0x1, 0x3) - frame recycled
fifofs:fifo_read+0x90(0x30deacd4680, 0x2a102257a10, 0x0, 0x300f3c55b58, 0x0)   <====
genunix:fop_read+0x20(0x30deacd4680, 0x2a102257a10, 0x0, , 0x0)
genunix:read+0x274(0x5)
unix:syscall_trap32+0xcc()
 

SolarisCAT(vmcore.0/10U)> proc tree 19064
1     /sbin/init -r
 19064 vxconfigd -k -m enable -x syslog
   19079 sh -c /etc/powermt display unmanaged > /dev/null 2>&1; echo $?
     19080 /etc/powermt display unmanaged     

 

Cause

When vxconfigd detects that the system is using PowerPath, it runs the "powermt display unmanaged" command to obtain the list of
LUNs that PowerPath does not manage, so that DMP can manage them. On a system with a large number of LUNs, or with I/O errors,
the powermt command can take a long time to complete. This delays vxconfigd startup or restart, which makes the daemon appear to be hung.
 

Resolution

The solution is to find and fix the cause of the delay in the powermt command.
Run the powermt command manually, outside vxconfigd, to check how long it takes to complete.
If it takes longer than expected, or if I/O errors are reported while it runs, check the storage or PowerPath.
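
One way to time the manual run is sketched below. This is an illustration under assumptions rather than a documented procedure: elapsed_seconds is a hypothetical helper name, the script assumes bash (it relies on the built-in SECONDS counter), and the /etc/powermt path is taken from the process listing shown earlier in this article.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run the given command with its output discarded,
# then print "<elapsed whole seconds> <exit status>".
# SECONDS is a bash built-in that counts seconds since the shell started.
elapsed_seconds() {
    local start=$SECONDS
    "$@" > /dev/null 2>&1
    local rc=$?
    echo "$((SECONDS - start)) $rc"
}

# Example use, as root, outside vxconfigd:
#   elapsed_seconds /etc/powermt display unmanaged
```

A large elapsed value or a nonzero exit status suggests that the delay lies in powermt or the underlying storage rather than in vxconfigd. Because the helper discards the command's output, run powermt directly when its messages need to be read.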


Applies To

Storage Foundation systems with EMC PowerPath.

Issue/Introduction

On systems with EMC PowerPath, the vxconfigd daemon runs the "powermt display unmanaged" command at system startup or restart to obtain
the list of devices not managed by PowerPath. If the powermt command hangs or is delayed, the vxconfigd daemon and other VxVM commands
wait for it to complete.