LDOM: How to handle the loss of Primary (Control) and Service I/O Domains when using Veritas DMP

Article ID: 100033972

Description

LDOMs (Logical Domains), also known as Oracle VM Server for SPARC

Oracle VM Server for SPARC is Oracle's server virtualization and partitioning technology for SPARC systems. It provides the ability to split a single physical system into multiple independent virtual systems, known as logical domains.
 
Figure 1.0

This enables a single server to run and deploy different operating system instances simultaneously.

 

Solaris Enhancement

Oracle has released an enhancement in Solaris 11.3 SRU 18.0.6 that enables Veritas Dynamic Multi-Pathing (DMP) to better handle the loss of the Primary (Control) and Service I/O domains.
 

Control (Primary) / Service domain is not accessible


When the Control (Primary) or Service domain is not accessible, commands such as “echo | format” and “vxdisk scandisks” may hang until the impacted I/O domain returns.
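
Because these commands can block for as long as the I/O domain is down, it can help to run them under a time limit when probing a guest. The following is a minimal sketch, not a Veritas or Oracle tool: run_with_limit is a hypothetical helper, and the exit code 124 for "no response" is only a convention chosen here. It stops waiting after the limit but does not try to kill a command that is blocked on I/O.

```shell
#!/bin/sh
# run_with_limit SECONDS CMD [ARGS...]   (hypothetical helper for illustration)
# Runs CMD in the background and polls for its exit status.
# Returns CMD's exit status, or 124 if no status appears within SECONDS.
run_with_limit() {
    limit=$1
    shift
    tmpdir=$(mktemp -d) || return 1

    # The background subshell records the exit status once CMD finishes.
    ( "$@"; echo "$?" > "$tmpdir/status" ) &

    elapsed=0
    while [ ! -s "$tmpdir/status" ]; do
        if [ "$elapsed" -ge "$limit" ]; then
            # CMD is still running (possibly hung); tmpdir is left in place
            # because CMD may still write its status later.
            echo "no response after ${limit}s: $*" >&2
            return 124
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done

    rc=$(cat "$tmpdir/status")
    rm -rf "$tmpdir"
    return "$rc"
}
```

For example, run_with_limit 60 vxdisk scandisks waits up to 60 seconds (an arbitrary value), and a pipeline can be wrapped as run_with_limit 60 sh -c 'echo | format'.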

 

Sample Configuration using Veritas DMP and InfoScale

The Control (primary) and Service (altio) I/O domains both present device paths to the logical domain.
The devices are visible and accessible from the logical domain (scooby).

Figure 2.0



 

With LDOMs, each logical domain can run a different operating system (OS) release and update. Each LDOM can have additional CPU and memory allocated to it to meet the specific needs of its application. Each LDOM environment can also be patched independently of the others.

 

Figure 3.0


LDOM configuration using DMP

 

In the above example, both I/O domains (Control and Service) export the SAN disk to the GUEST via a DMP path, so multiple paths are exported to the GUEST. The GUEST sees only a single OS device handle per exported DMPNODE from each I/O domain. In other words, the DMPNODE from the Control domain creates one OS device handle in the GUEST, and the DMPNODE from the Service domain creates another OS device handle in the GUEST for the same device.

If an I/O domain is lost and the failed I/Os are never returned to the GUEST domain, they cannot be rerouted through the alternate, operational I/O domain.
 

VDC Timeout

In Oracle VM Server configurations, the Virtual Disk Client (VDC) driver timeout is set to zero by default, which means an infinite timeout.

Impact:

As a result, failed I/O may never be returned to the guest domain if either the Control (primary) or Service I/O domain crashes or reboots unexpectedly. Even with a VDC timeout defined, the new timeout-noretry-list parameter, made available with Solaris 11.3 SRU 18.0.6, must also be defined for all virtual disk instances.
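
The VDC timeout for a virtual disk can be set from the control domain with ldm set-vdisk timeout=<seconds> <disk> <domain>. The sketch below only prints those commands for review rather than running them. The disk names vdisk0 and vdisk1 and the 30-second value are placeholders, not recommendations, and print_vdisk_timeout_cmds is a hypothetical helper. The timeout-noretry-list parameter is left out of this sketch.

```shell
#!/bin/sh
# print_vdisk_timeout_cmds DOMAIN SECONDS DISK [DISK...]   (hypothetical helper)
# Prints one "ldm set-vdisk" command per virtual disk; nothing is executed.
print_vdisk_timeout_cmds() {
    domain=$1
    seconds=$2
    shift 2
    for disk in "$@"; do
        printf 'ldm set-vdisk timeout=%s %s %s\n' "$seconds" "$disk" "$domain"
    done
}

# Example with placeholder disk names and timeout for the guest "scooby":
print_vdisk_timeout_cmds scooby 30 vdisk0 vdisk1
```

Printing the commands first keeps the change reviewable before anything is applied on the control domain.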

 

DMP_NATIVE_SUPPORT

To support the handling of Solaris ZFS boot devices, the DMP tunable dmp_native_support must be enabled on Solaris 11 hosts, where the use of MPGROUPs is not permitted by the Veritas product suite. For Solaris 10 ZFS-based boot devices, dmp_native_support should not be enabled unless the boot disk is presented to the Solaris 10 LDOM GUEST via an MPGROUP.

# vxdmpadm gettune dmp_native_support
# vxdmpadm settune dmp_native_support=on
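
When scripting around this tunable, it can be useful to read back its current value before changing it. The sketch below parses vxdmpadm gettune output from standard input. The three-column layout (tunable, current value, default value) is an assumption based on typical output, and tunable_current_value is a hypothetical helper name.

```shell
#!/bin/sh
# tunable_current_value NAME   (hypothetical helper for illustration)
# Reads "vxdmpadm gettune" output on stdin and prints the current value
# of the tunable NAME. Assumed layout per data row:
#   <tunable>  <current value>  <default value>
# Returns 1 if NAME is not found in the input.
tunable_current_value() {
    name=$1
    while read -r tunable current default_value; do
        if [ "$tunable" = "$name" ]; then
            printf '%s\n' "$current"
            return 0
        fi
    done
    return 1
}
```

A typical call would be vxdmpadm gettune dmp_native_support | tunable_current_value dmp_native_support, which prints the current setting, such as on or off.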


NOTE: A series of Veritas Volume Manager (VxVM) patches were released to ensure that ZFS devices are imported using DMP.

Please contact Veritas support to ensure you are running the required Veritas Volume Manager (VxVM) patch level.


When applying the latest patch, perform the following steps:

  1. Ensure dmp_native_support is disabled first
  2. Apply the latest available VxVM patch
  3. Reboot the system and re-enable dmp_native_support

MPGROUPs remain unsupported for Solaris 11 configurations. MPxIO is not supported with Solaris 10 or 11 LDOM configurations.

 

Additional Information (Solaris 11):

The solution for Solaris 11 deployments is documented below.

The key points are that no MPGROUPs should be configured, and that the dmp_native_support tunable must be enabled so that DMP handles the ZFS devices presented to the Solaris 11 GUEST domain.
