'hastatus -sum' output shows the CFSMount agent in failed state on InfoScale 8.0/rhel8 cluster node

Article ID: 100056191

Updated On:

Description

Error Message

/var/VRTSvcs/log/engine_A.log shows:

2023/06/07 03:23:06 VCS NOTICE V-16-1-10438 Group cvm has been probed on system serverB
2023/06/07 03:23:06 VCS NOTICE V-16-1-10442 Initiating auto-start online of group cvm on system serverB
2023/06/07 03:23:06 VCS NOTICE V-16-1-10301 Initiating Online of Resource cvm_clus (Owner: Unspecified, Group: cvm) on System serverB
2023/06/07 03:23:26 VCS INFO V-16-1-10298 Resource cvm_clus (Owner: Unspecified, Group: cvm) is online on serverB (VCS initiated)
2023/06/07 03:23:26 VCS NOTICE V-16-1-10301 Initiating Online of Resource vxfsckd (Owner: Unspecified, Group: cvm) on System serverB
2023/06/07 03:25:20 VCS WARNING V-16-1-10023 Agent CFSMount not sending alive messages since Wed Jun  7 03:23:03 2023
2023/06/07 03:25:20 VCS WARNING V-16-1-53025 Agent CFSMount has faulted; ipm connection was lost; restarting the agent
2023/06/07 03:25:20 VCS NOTICE V-16-1-10016 Agent /opt/VRTSvcs/bin/CFSMount/CFSMountAgent for resource type CFSMount successfully started at Wed Jun  7 03:25:20 2023
2023/06/07 03:25:26 VCS ERROR V-16-2-13066 (serverB) Agent is calling clean for resource(vxfsckd) because the resource is not up even after online completed.
2023/06/07 03:25:26 VCS INFO V-16-2-13068 (serverB) Resource(vxfsckd) - clean completed successfully.
2023/06/07 03:25:26 VCS INFO V-16-2-13071 (serverB) Resource(vxfsckd): reached OnlineRetryLimit(0).
2023/06/07 03:25:26 VCS ERROR V-16-1-54031 Resource vxfsckd (Owner: Unspecified, Group: cvm) is FAULTED on sys serverB
2023/06/07 03:25:26 VCS NOTICE V-16-1-10300 Initiating Offline of Resource cvm_clus (Owner: Unspecified, Group: cvm) on System serverB
2023/06/07 03:25:27 VCS INFO V-16-1-10305 Resource cvm_clus (Owner: Unspecified, Group: cvm) is offline on serverB (VCS initiated)
2023/06/07 03:25:27 VCS INFO V-16-1-50158 VCS shall not initiate failover for service group cvm as AutoFailOver=0
2023/06/07 03:25:27 VCS ERROR V-16-1-10205 Group cvm is faulted on system serverB
2023/06/07 03:25:27 VCS NOTICE V-16-1-10446 Group cvm is offline on system serverB
2023/06/07 03:27:32 VCS WARNING V-16-1-10023 Agent CFSMount not sending alive messages since Wed Jun  7 03:25:20 2023
2023/06/07 03:27:32 VCS WARNING V-16-1-53025 Agent CFSMount has faulted; ipm connection was lost; restarting the agent
2023/06/07 03:27:32 VCS NOTICE V-16-1-10016 Agent /opt/VRTSvcs/bin/CFSMount/CFSMountAgent for resource type CFSMount successfully started at Wed Jun  7 03:27:32 2023
2023/06/07 03:29:44 VCS WARNING V-16-1-10023 Agent CFSMount not sending alive messages since Wed Jun  7 03:27:32 2023
2023/06/07 03:29:44 VCS WARNING V-16-1-53025 Agent CFSMount has faulted; ipm connection was lost; restarting the agent
2023/06/07 03:29:44 VCS NOTICE V-16-1-10016 Agent /opt/VRTSvcs/bin/CFSMount/CFSMountAgent for resource type CFSMount successfully started at Wed Jun  7 03:29:44 2023
...
2023/06/07 03:36:20 VCS ERROR V-16-1-10009 Agent CFSMount has faulted 6 times in less than 950 seconds -- Will not attempt to restart. Correct the problem and use haagent -start to start the agent
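To see how often the agent was restarted before VCS gave up, the restart messages in the engine log can be counted. The following is a minimal sketch, assuming the message text matches the excerpt above; the function name agent_restart_count is hypothetical and not part of InfoScale.

```shell
# Count how many times VCS restarted a given agent, based on the
# V-16-1-53025 message text shown in the engine log excerpt.
# Usage: agent_restart_count <agent-type> <engine-log-file>
agent_restart_count() {
    agent=$1
    logfile=$2
    # -c prints the number of matching lines, -F treats the pattern as a
    # fixed string. grep exits non-zero when the count is 0, so "|| true"
    # keeps the function's own exit status clean.
    grep -cF "V-16-1-53025 Agent $agent has faulted" "$logfile" || true
}
```

For example, agent_restart_count CFSMount /var/VRTSvcs/log/engine_A.log prints the number of restarts recorded in the current engine log.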

/var/VRTSvcs/log/CFSMount_A.log shows:

2023/06/07 03:25:20 VCS ERROR V-16-0-1018 CFSMount:fail to open vx_portal dev
2023/06/07 03:27:32 VCS INFO V-16-1-10196 Cluster logger started

Cause

The CFSMount agent faults because not all of the required InfoScale kernel modules are loaded. In this case CFSMount_A.log reports that the agent failed to open the vx_portal device, and the 'lsmod' output confirms that the vxportal module is missing from the list of loaded modules:

# lsmod | grep -i vx
vxfen                 389979  2
vxgms                 288499  0
vxglm                 336418  0
vxspec                 13261  14
vxio                 5627578  2 vxspec
vxdmp                 539315  27
veki                   19214  11 amf,gab,llt,vxio,dmpaa,vxdmp,vxfen,vxglm,vxgms,vxspec
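The same check can be scripted so that missing modules are listed explicitly instead of being spotted by eye. This is a sketch only: the function name missing_modules is hypothetical, and the list of modules to check is supplied by the caller, because the article names only vxportal and does not give the full set of modules a given installation requires.

```shell
# Print every module named on the command line that does not appear in
# the lsmod-style listing read from stdin.
# Usage: lsmod | missing_modules vxportal
missing_modules() {
    # The first column of lsmod output is the module name.
    loaded=$(awk '{print $1}')
    for mod in "$@"; do
        # -x matches whole lines only, so "vxfs" would not match "vxfsckd";
        # -F treats the module name as a fixed string.
        if ! printf '%s\n' "$loaded" | grep -qxF "$mod"; then
            printf '%s\n' "$mod"
        fi
    done
}
```

Run against the listing above, lsmod | missing_modules vxportal would print vxportal; no output means every named module is loaded.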

 

 

Resolution

Run the installer with the -start option on the affected node to start the InfoScale processes and make sure that all required modules are loaded:

# /opt/VRTS/install/installer -start
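Once the modules are loaded, the engine log states that the agent has to be started again with haagent -start. The sketch below strings the steps together under some assumptions: the helper names run and recover_cfsmount are hypothetical, the hagrp -clear step for the faulted cvm group is an assumed follow-up that the article does not spell out, and the installer may prompt for input when it runs. With DRY_RUN=1 the commands are only printed, which allows the sequence to be reviewed first.

```shell
# Print each command; execute it only when DRY_RUN is not set to 1.
run() {
    printf '+ %s\n' "$*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# Usage: recover_cfsmount <node>
recover_cfsmount() {
    node=$1
    # Step from the article: start InfoScale processes and load modules.
    run /opt/VRTS/install/installer -start &&
    # Step named in the engine log: restart the agent that VCS gave up on.
    run haagent -start CFSMount -sys "$node" &&
    # Assumed follow-up: clear the fault left on the cvm group.
    run hagrp -clear cvm -sys "$node"
}
```

For example, DRY_RUN=1 recover_cfsmount serverB lists the three commands without touching the cluster.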

 

Issue/Introduction

'hastatus -sum' output shows the CFSMount agent in failed state on InfoScale 8.0/rhel8 cluster node:

# hastatus -sum

-- SYSTEM STATE
-- System               State                Frozen

A  serverA              RUNNING              0
A  serverB              RUNNING              0

-- GROUP STATE
-- Group           System               Probed     AutoDisabled    State

B  cvm             serverA              Y          N               ONLINE
B  cvm             serverB              Y          N               OFFLINE|FAULTED
B  testsg          serverA              Y          N               OFFLINE
B  testsg          serverB              N          N               OFFLINE

-- RESOURCES FAILED
-- Group           Type                 Resource             System

D  cvm             CFSfsckd             vxfsckd              serverB

-- RESOURCES NOT PROBED
-- Group           Type                 Resource             System

E  testsg          CFSMount             cfsmount1            serverB

-- AGENTS FAILED
-- Type                 System

J  CFSMount             serverB
