The errors below are observed in the LVMVolumeGroup_A log file.
2024/04/18 05:15:03 VCS ERROR V-16-10031-14006 LVMVolumeGroup:lvm_poc:clean:Deactivation of logical volumes failed.
2024/04/18 05:15:03 VCS ERROR V-16-10031-14007 LVMVolumeGroup:lvm_poc:clean:Deactivation of volume group failed.
2024/04/18 05:15:03 VCS INFO V-16-2-13716 Thread(139721341761280) Resource(lvm_poc): Output of the completed operation (clean)
==============================================
File descriptor 3 (pipe:[447486]) leaked on vgdisplay invocation. Parent PID 89613: /bin/sh
Volume group "vg00" not found
Cannot process volume group vg00
==============================================
2024/04/18 05:15:03 VCS DBG_FFDC Generating FFDC for resource (lvm_poc) as clean entry point failed
==============================================
2024/04/18 05:15:03 VCS DBG_AGDEBUG Calling clean for resource lvm_poc
VCSAgType.C:call_clean[1054]
2024/04/18 05:15:03 VCS DBG_AGDEBUG Calling exec_script without passing container information
VCSAgType.C:_exec_script[3391]
2024/04/18 05:15:03 VCS DBG_AGDEBUG Value of VCSAgResEPStruct is {ResName=lvm_poc, EpName=clean, EpEnum=2, ConfLevel=0, MonitorInfo=0}
VCSAgProcess.C:exec_script[1892]
2024/04/18 05:15:03 VCS DBG_AGDEBUG child pid is - 89557
VCSAgProcess.C:exec_script[2121]
2024/04/18 05:15:03 VCS DBG_5 LVMVolumeGroup:lvm_poc:clean:Stopping all volumes...
2024/04/18 05:15:03 VCS DBG_3 LVMVolumeGroup:lvm_poc:clean:Output of vgchange: File descriptor 3 (pipe:[447486]) leaked on vgchange invocation. Parent PID 89610: /bin/sh
Volume group "vg00" not found
Cannot process volume group vg00
2024/04/18 05:15:03 VCS DBG_5 LVMVolumeGroup:lvm_poc:clean:Verifying the volume group is imported or not...
2024/04/18 05:15:03 VCS DBG_5 LVMVolumeGroup:lvm_poc:clean:Deactivating volume group...
2024/04/18 05:15:03 VCS DBG_3 LVMVolumeGroup:lvm_poc:clean:Output of vgexport: File descriptor 3 (pipe:[447486]) leaked on vgexport invocation. Parent PID 89616: /bin/sh
Volume group "vg00" not found
Cannot process volume group vg00
2024/04/18 05:15:03 VCS DBG_AGDEBUG script (/opt/VRTSvcs/bin/LVMVolumeGroup/clean) exited with status (1)
VCSAgProcess.C:exec_script[2205]
2024/04/18 05:15:03 VCS DBG_AGINFO Resource(lvm_poc) - clean entry point exited with a return value 1.
VCSAgType.C:call_clean[1082]
2024/04/18 05:15:03 VCS DBG_AGINFO Resource(lvm_poc) - Clean failed.
VCSAgType.C:call_clean[1092]
2024/04/18 05:15:03 VCS DBG_AGDEBUG Canceling timer for (lvm_poc) op(1603)
VCSAgTimer.C:_cancel[953]
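To check whether a node shows the same failure, the two message IDs from the excerpt above can be searched for in the agent log. The following is a minimal sketch; the function name is illustrative, and the default log path is an assumption based on the usual VCS log directory, so pass the actual path if yours differs.

```shell
# Sketch: list clean-entry-point deactivation failures in an agent log.
# The default path below is an assumption; pass another path as $1 if needed.
find_clean_failures() {
    log_file="${1:-/var/VRTSvcs/log/LVMVolumeGroup_A.log}"
    # V-16-10031-14006: deactivation of logical volumes failed
    # V-16-10031-14007: deactivation of volume group failed
    grep -E 'V-16-10031-1400[67]' "$log_file"
}
```

Calling find_clean_failures with no argument reads the assumed default path; calling it with a file name reads that file instead. No output means neither message ID was found.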
If all paths to the disks are disabled, the LVM agent cannot change the state of the volume group until at least one path is restored, because the agent relies on LVM2 commands. As a result, the online and offline entry points of the LVMVolumeGroup agent time out, and the clean entry point (EP) stops responding indefinitely. The service group therefore cannot fail over to another node.
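One rough way to see whether LVM2 commands are hanging on a node is to run one under a time limit. The sketch below assumes timeout(1) from GNU coreutils is installed; the function name is illustrative. A process blocked in uninterruptible I/O may not react to the termination signal, so treat the result as an indication only.

```shell
# Sketch: run a command with a time limit and report how it ended.
# Assumes GNU coreutils timeout(1); its exit status 124 means the limit was hit.
probe_with_timeout() {
    limit="$1"
    shift
    rc=0
    timeout "$limit" "$@" >/dev/null 2>&1 || rc=$?
    if [ "$rc" -eq 0 ]; then
        echo "ok"
    elif [ "$rc" -eq 124 ]; then
        echo "timed out"
    else
        echo "failed (exit $rc)"
    fi
}
```

For example, probe_with_timeout 10 vgdisplay vg00 would report whether vgdisplay for the volume group from the log above returned within 10 seconds.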
A supported hotfix for this issue is available on top of InfoScale 8.0.2 Update 2. Contact InfoScale Technical Support to obtain it. This hotfix has not yet gone through extensive QA testing. Consequently, if you are not adversely affected by this problem and have a satisfactory temporary workaround in place, we recommend that you wait for the public release of this hotfix.
InfoScale currently plans to address this issue by way of a patch or hotfix to the current and older versions of the software. Refer back to this document periodically, as any changes to the status of the issue will be reflected here. A link to the patch or hotfix download will be added to this document when it becomes available. Note that InfoScale reserves the right to remove any fix from the targeted release if it does not pass quality assurance tests. InfoScale's plans are subject to change, and any action taken by you based on the above information, or your reliance upon it, is made at your own risk.
The hotfix changes the agent behavior so that it can manage the LVM volume group (VG) in these situations.
It implements the PanicSystemOnVGLoss attribute, which is similar to the PanicSystemOnDGLoss attribute. The attribute accepts the following values:
0 - Default value and behavior. Does not halt the system and does not fail over.
1 - Halt the system if deactivation of the volume group fails.
2 - Do not halt the system. Allow failover.
NOTE: Option 2 might cause data corruption if the paths are attached and the problem node recognizes the VG as imported while the VG is already imported on the other node. Choose the value carefully to avoid corruption.
Steps to enable the PanicSystemOnVGLoss attribute:
1) Stop VCS.
2) Apply the VCS hotfix (VRTSvcs 8.0.2.1501).
3) Apply the VCSAG hotfix (VRTSvcsag 8.0.2.1501).
4) Start VCS.
5) Copy /opt/VRTSvcs/bin/sample_triggers/VRTSvcs/user_trigger_update_types to /opt/VRTSvcs/bin/triggers
# cp /opt/VRTSvcs/bin/sample_triggers/VRTSvcs/user_trigger_update_types /opt/VRTSvcs/bin/triggers
6) To update types.cf, run the command below:
# hatrigger -user_trigger_update_types 0
7) Apply the desired value for PanicSystemOnVGLoss (default value is 0).
# haconf -makerw
# hares -modify <resource_name> PanicSystemOnVGLoss <0/1/2>
# haconf -dump -makero
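The commands in step 7 can be wrapped in a small helper that rejects values other than 0, 1, or 2 before touching the configuration. This is a sketch only: the function name and the DRY_RUN variable are illustrative and not part of the product, and it assumes the usual hares -modify argument order of resource name, attribute, value. By default it only prints the commands.

```shell
# Sketch: validate the value, then print (or run) the commands from step 7.
# set_panic_on_vg_loss and DRY_RUN are hypothetical names for illustration.
set_panic_on_vg_loss() {
    resource="$1"
    value="$2"
    if [ -z "$resource" ]; then
        echo "resource name is required" >&2
        return 1
    fi
    case "$value" in
        0|1|2) ;;
        *) echo "PanicSystemOnVGLoss must be 0, 1, or 2" >&2; return 1 ;;
    esac
    # With DRY_RUN=1 (the default here) the commands are printed, not executed.
    run=""
    if [ "${DRY_RUN:-1}" = "1" ]; then
        run="echo"
    fi
    $run haconf -makerw &&
    $run hares -modify "$resource" PanicSystemOnVGLoss "$value" &&
    $run haconf -dump -makero
}
```

For example, set_panic_on_vg_loss lvm_poc 1 prints the three commands for the resource named in the log above. Setting DRY_RUN=0 runs them instead, after which hares -value lvm_poc PanicSystemOnVGLoss can be used to confirm the setting.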
The LVMVolumeGroup resource fails to go offline/clean after all disk paths fail.
JIRA: STESC-8778