How to optimize InfoScale Dynamic Multi-Pathing

Article ID: 100045835

Description

About tuning Dynamic Multi-Pathing

Dynamic Multi-Pathing (DMP) operation, and tuning DMP operation from a performance standpoint, can be thought of in terms of the following areas:

·        Device discovery: Device discovery does the following things:

·        Uniquely identifies LUNs on multiple paths

·        Groups LUNs into an enclosure

·        Identifies the model and type of LUN, such as A/A or ALUA, and applies load balancing policies as predefined in the Array Support Libraries (ASLs)

·        I/O load balancing: DMP attempts to improve performance of I/Os issued on a DMP meta-device by balancing the I/Os on the multiple I/O paths for the meta-device. The way the load is balanced on the set of active paths for a meta-device is determined primarily by the DMP I/O policy in effect. There are many different I/O policies in DMP that an administrator can choose depending on the workload and operating environment; the default I/O policy generally works well for a range of workloads and operating environments.

·        Error detection and handling: To deliver the higher availability and performance made possible by multiple I/O paths, DMP must handle a variety of errors appropriately. Much of DMP’s error handling behavior, such as timeout intervals and the number of retries in different error scenarios, can be modified using tunable parameters.

 

Warning:

In most cases, the default values of these parameters are appropriate and they should be changed with caution.

 

·        Path analysis: The state of an I/O path (healthy or failed) can change due to errors in various components on the path and the components' subsequent recovery. The more up-to-date the information that DMP has on the state of the I/O paths, the better its I/O scheduling decisions can be. Path analysis in DMP keeps the state of paths reasonably up-to-date to enable better scheduling decisions. The path analysis behavior of DMP can be tuned to favor improved state information at the cost of extra CPU cycles and I/O, or it can be tuned to favor lower CPU and I/O overhead at the cost of less up-to-date state information on I/O paths. DMP has a number of tunable parameters that determine aspects of path analysis. The default settings of these parameters provide behavior that is generally appropriate for most environments.

DMP can discover the Fibre Channel topology and monitor for Fibre Channel events. When a Fibre Channel event is received, DMP can, based on the discovered topology, identify the I/O paths that may be impacted by the event. DMP uses SCSI passthru interfaces for fast error handling and to get detailed error information in the case of I/O errors, which enables DMP to handle the errors appropriately. DMP can group I/O paths into Subpath Failover Groups (SFGs) and make collective decisions for a whole group. This feature is particularly useful in avoiding multiple I/O errors in cases where a fault affects a whole group of paths.

This article includes the following topics:

·        About tuning Dynamic Multi-Pathing

·        Dynamic Multi-Pathing device discovery

·        Dynamic Multi-Pathing I/O load balancing

·        Dynamic Multi-Pathing default I/O policy

·        Optimizing disk array cache usage with the balanced policy

·        Dynamic Multi-Pathing I/O policies

·        Dynamic Multi-Pathing I/O throttling

·        Tuning Dynamic Multi-Pathing error handling

·        Dynamic Multi-Pathing SCSI bypass

·        Dynamic Multi-Pathing I/O failure handling

·        Avoiding suspect paths in Dynamic Multi-Pathing

·        Dynamic Multi-Pathing tunable parameters for error handling

·        Dynamic Multi-Pathing path analysis

·        Subpath Failover Group

·        Path analysis on path errors and fabric events

·        Overview of path restoration

·        Default path restoration in Dynamic Multi-Pathing

·        Enabling or disabling path restoration

·        Path restoration policy

·        Tunable parameters for path restoration

·        Summary of Dynamic Multi-Pathing tuning

 

You can display and change the tunable parameters that control DMP operation online using the vxdmpadm command. Use the vxdmpadm gettune command to list the parameters and their values, and the vxdmpadm settune command to change the parameter values.

 

# vxdmpadm gettune all
            Tunable               Current Value  Default Value
------------------------------    -------------  -------------
dmp_cache_open                           on               on
dmp_daemon_count                         10               10
dmp_delayq_interval                      15               15
dmp_enable_restore                       on               on
dmp_fast_recovery                       off               on
dmp_health_time                          60               60
dmp_log_level                             1                1
dmp_low_impact_probe                     on               on
dmp_lun_retry_timeout                     0                0
dmp_path_age                            300              300
dmp_pathswitch_blks_shift                 9                9
dmp_probe_idle_lun                       on               on
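
When reviewing output like the listing above, it can help to pick out the tunables whose current value differs from the default. The following Python sketch parses text in that three-column layout. The function name and the embedded sample are illustrative only, and the parser assumes every data row has exactly three whitespace-separated fields and a name that starts with dmp_.

```python
# Sample rows copied from the listing above (shortened for the example).
SAMPLE = """\
            Tunable               Current Value  Default Value
------------------------------    -------------  -------------
dmp_cache_open                           on               on
dmp_daemon_count                         10               10
dmp_fast_recovery                       off               on
dmp_health_time                          60               60
"""


def changed_tunables(text):
    """Return {name: (current, default)} for rows whose two values differ."""
    changed = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and separator lines; keep only "dmp_*" data rows.
        if len(fields) == 3 and fields[0].startswith("dmp_"):
            name, current, default = fields
            if current != default:
                changed[name] = (current, default)
    return changed


print(changed_tunables(SAMPLE))
```

In the sample, only dmp_fast_recovery is reported, because it is the only row where the current value (off) differs from the default (on).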

These tunable parameters are system-wide, such that the parameters control how the DMP module operates in general for all storage devices. DMP also allows some aspects of its operation to be customized for an enclosure, array, or array-type by setting attributes for the enclosure, array, or array-type using the vxdmpadm setattr command. Where it is possible to do so, these provide fine-grained control over DMP operation.

DMP provides a template mechanism to simplify the process of performing tuning changes. The template mechanism allows administrators to save the configuration of a system to a tuning template file, which can then be applied on other systems where similar tuning is desired.

For more information about tuning templates, see the Veritas Dynamic Multi-Pathing Administrator's Guide.
 

Dynamic Multi-Pathing device discovery

Device discovery in Dynamic Multi-Pathing (DMP) is the process through which DMP identifies the I/O paths corresponding to each LUN. In device discovery, DMP examines disks discovered by the operating system and identifies the disks that represent different paths to the same LUN. For each LUN, DMP creates a new device in the operating system device tree; this device is the DMP meta-device for the LUN and can be used by higher layers of the storage stack to perform I/O on the LUN. For each I/O on a meta-device, DMP issues the I/O on one of the paths for the meta-device.

Device discovery in DMP is aided by array-specific Array Support Libraries (ASLs). ASLs are generally installed as part of Veritas Storage Foundation (SF).

The following tunable parameter optimizes the discovery process:

dmp_cache_open

The device discovery layer might need to send multiple requests to each device as part of the discovery process. When dmp_cache_open is set to on, the first open on a device by the ASL is cached and subsequent accesses can use the cached handle. Setting this parameter to off disables caching during device discovery and can slow the discovery process. The default and recommended value of this parameter is on.

DMP support for native multi-pathing requires that dmp_cache_open be set to on.

 

 

Dynamic Multi-Pathing I/O load balancing

Dynamic Multi-Pathing (DMP) balances I/Os issued on a meta-device across the multiple I/O paths for the meta-device. A DMP meta-device typically corresponds to a LUN. Some paths might be excluded from the load balancing process as specified below:

·        Paths that have been disabled by an administrator and paths discovered by DMP to have failed are not used.

·        Based on the array type, some paths may be excluded. For example, on an active-passive array, only the primary paths are used during normal operation; other paths are used only after a failover.

·        I/O throttling can cause some paths to be excluded from consideration temporarily.

In the case of asymmetric active-active arrays, the default behavior of DMP is to use only the primary paths for load balancing. However, administrators can specify use_all_paths=yes using the vxdmpadm setattr command to modify this behavior. When the use_all_paths attribute is set to yes, I/O requests can be scheduled on the secondary paths in addition to the primary paths. The use_all_paths attribute should be enabled only if recommended by the array vendor.

The way load balancing is performed in DMP is determined by the I/O policy in effect. DMP provides a number of different I/O policies. By default, the minimumq I/O policy is used for all meta-devices, but the I/O policy can be set for an enclosure, array, or array-type using the vxdmpadm setattr command.

 

Dynamic Multi-Pathing default I/O policy

The default I/O policy in Dynamic Multi-Pathing (DMP) is the minimumq policy. With this policy, when a new I/O is received for a meta-device, DMP schedules it on the path for the meta-device that has the minimum number of requests pending on it. Even though it is a simple criterion, the number of pending requests succinctly captures significant information about path performance, and the use of this criterion gives the minimumq policy the following beneficial characteristics:

·        When the multiple paths for a meta-device are all performing comparably, minimumq distributes I/O load evenly across the paths at high load.

·        When there is a difference in performance among the paths, better performing paths automatically get a higher proportion of the I/O load. These paths service requests faster and reduce the number of pending requests faster, and hence become eligible for more new I/Os.

·        When there is a change in the relative performance of the I/O paths, minimumq quickly adapts. If a path that had been performing well degrades in performance, the queue of requests on the path grows and minimumq automatically diverts most new I/O requests to other paths. This ability to adapt to changes in path performance is important because, in most environments, the performance delivered by paths changes over time. Errors in storage network components, which can cause re-routing of requests and create hotspots, changes in applications’ access patterns and load levels, and accesses from multiple servers to the same arrays can all cause slowing down of some paths compared to others.
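
The characteristics listed above can be illustrated with a small Python model. This is a simplified sketch, not DMP's implementation: time advances in discrete ticks, each path completes a fixed number of requests per tick, and ties are broken in favor of the lowest-numbered path. All names and numbers are assumptions made for the example.

```python
def pick_minimumq(pending):
    """Return the index of the path with the fewest pending requests.

    Ties go to the lowest-numbered path (an assumption of this sketch).
    """
    return min(range(len(pending)), key=lambda i: pending[i])


def simulate(service_per_tick, ticks, arrivals_per_tick):
    """Count how many I/Os each path receives over a simulated run."""
    pending = [0] * len(service_per_tick)
    issued = [0] * len(service_per_tick)
    for _ in range(ticks):
        # New requests arrive and are scheduled one at a time.
        for _ in range(arrivals_per_tick):
            path = pick_minimumq(pending)
            pending[path] += 1
            issued[path] += 1
        # Each path then completes up to its per-tick capacity.
        for path, rate in enumerate(service_per_tick):
            pending[path] = max(0, pending[path] - rate)
    return issued


# Path 0 services requests three times as fast as path 1.
issued = simulate(service_per_tick=[3, 1], ticks=100, arrivals_per_tick=4)
print(issued)
```

In this model the faster path ends up with most of the I/Os even though the selection rule never looks at path speed directly; it only looks at queue depth.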

While all DMP I/O policies can handle the failure of one or more paths, not all are able to adapt well when the performance characteristics of active paths change significantly; the ability to do this is a distinct advantage that minimumq has over many other policies. The minimumq policy has been seen to work as well as or better than other policies for a range of workloads and operating conditions. Symantec recommends that you use the minimumq I/O policy in DMP.

 

Optimizing disk array cache usage with the balanced policy

The balanced I/O policy in DMP is designed to balance the I/O load across the available paths while optimizing controller cache usage in the underlying disk arrays. DMP is frequently used with high-end, active-active (A-A) disk arrays with large caches in the disk array controllers. With an A-A array, DMP can send an I/O request on any path, to any of the controllers, unlike with an active-passive array where I/Os are sent to the secondary controller only after a failover. However, if an I/O request for a particular block is sent to one controller of an A-A array, and a later I/O request for the same block is sent to a different controller, the block will likely reside in the caches of both controllers. The balanced I/O policy tries to avoid this cache duplication in the disk array by a careful mapping of blocks to paths. Requests for a particular block are always sent on the same path in this policy, but because different blocks map to different paths, load balancing is still achieved. In case of path failures, the mapping of blocks to paths is re-adjusted. The actual performance benefit from the balanced policy depends on many factors, including the cache management techniques of the disk array and the nature of the workload. In those cases where the disk array caches are under pressure, the balanced I/O policy is a possible option for improving performance through optimized caching.

In the balanced I/O policy, each I/O request is mapped to a path based on the starting address of the I/O request. The mapping is based on a partition size, which can be specified when configuring the policy using the vxdmpadm setattr command. The mapping can be described as follows:

·        The storage space of the LUN (a disk or LUN can be thought of as a linear array of bytes) can be thought of as being divided into partitions, the size of each being the partition size.

·        When the starting address of a request is divided by the partition size, you get the partition to which the starting address belongs.

·        Each partition is mapped to an I/O path in a rotating fashion. For example, if there are 4 I/O paths, partition 0 is mapped to path 0, partition 1 to path 1, partition 2 to path 2 and partition 3 to path 3; partition 4 is mapped to path 0 again.

·        More formally, the I/O path on which an I/O request is to be sent is determined as follows: the starting address of the request is divided by the partition size to give the partition number; the partition number modulo the number of paths gives the path number on which the request is to be sent.
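
The mapping described above can be written as a short Python sketch. The function name is illustrative, and the starting address is assumed to be expressed in blocks, the same unit as the partition size.

```python
def balanced_path(start_block, partition_size_blocks, num_paths):
    """Return the path number for a request under the balanced mapping.

    partition number = starting address // partition size
    path number      = partition number % number of paths
    """
    partition = start_block // partition_size_blocks
    return partition % num_paths


# Four paths and a partition size of 512 blocks.
for start in (0, 512, 1024, 1536, 2048):
    print(start, balanced_path(start, 512, 4))
```

Partitions 0 through 3 map to paths 0 through 3, and partition 4 (starting at block 2048) wraps around to path 0, matching the rotation in the example above.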

When the partition size for the balanced policy is not specified, the default partition size takes effect. The default partition size is governed by the following tunable parameter:

dmp_pathswitch_blks_shift

This tunable parameter specifies the default partition size that applies in those cases where the partition size is not specified while configuring the balanced I/O policy for an enclosure, array, or array-type using the vxdmpadm setattr command. This tunable parameter is only relevant with the balanced I/O policy, not with other I/O policies. The value of this parameter is expressed as the integer exponent of a power of 2. The default value of this parameter is 9, which means the default partition size is 2^9; that is, 512 blocks or 256 KB. Increasing the value of this parameter by 1 doubles the default partition size. If the value of this parameter is set too high, it can result in uneven load distribution on the paths. If the application I/O activity is localized to an address range of the LUN and the partition size is too large, some of the paths may not get any I/O requests. If the value of this parameter is small relative to the I/O request size, many requests may span partitions, which is not desirable.

 

When the partition size is specified explicitly in the vxdmpadm setattr command, the value is rounded down to the nearest power of 2 and interpreted as the number of blocks in the partition. For example, if the partition size is specified as 1200, the partition size used will be 1024 blocks or 512 KB. If the partition size is specified as 0, the default partition size, which is based on the value of dmp_pathswitch_blks_shift, is used.
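
The rounding rule and the default can be checked with a few lines of Python. This sketch assumes 512-byte blocks, which is what the figures in the text imply (512 blocks equal 256 KB); the function name is illustrative.

```python
BLOCK_BYTES = 512  # implied by "512 blocks or 256 KB" in the text


def effective_partition_blocks(requested_blocks, pathswitch_blks_shift=9):
    """Return the partition size, in blocks, that would take effect.

    0 selects the default of 2**pathswitch_blks_shift blocks; any other
    value is rounded down to the nearest power of 2.
    """
    if requested_blocks < 0:
        raise ValueError("partition size cannot be negative")
    if requested_blocks == 0:
        return 1 << pathswitch_blks_shift
    # bit_length() - 1 is the exponent of the largest power of 2 <= value.
    return 1 << (requested_blocks.bit_length() - 1)


for requested in (0, 1200, 2048):
    blocks = effective_partition_blocks(requested)
    print(requested, blocks, blocks * BLOCK_BYTES // 1024, "KB")
```

A request of 1200 gives 1024 blocks (512 KB), and a request of 0 gives the default of 512 blocks (256 KB) when the shift value is 9.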

Since the balanced I/O policy tries to spread the I/O load equally among available I/O paths, it works well when the paths have roughly the same performance; it does not adapt well in those cases where some paths are performing poorly compared to others due to storage network problems or other issues. This is a drawback of the balanced policy compared to the default minimumq policy. The performance benefit from changing the default I/O policy to the balanced I/O policy varies depending on the workload and the cache management techniques of the disk array for which the change is made. In many cases, the performance benefit from the caching optimization might be small, giving minimumq an edge over the balanced policy. For these reasons, the minimumq policy is the default even for A-A arrays.

 

 

Dynamic Multi-Pathing I/O policies

The following I/O policies are available in Dynamic Multi-Pathing (DMP):

·        adaptive: In this policy, DMP dynamically calculates recent path performance and assigns priorities to paths based on their performance. Specifically, DMP calculates throughput, or bytes delivered per second. Paths that deliver better performance are assigned a higher priority. I/Os are then routed so that higher priority paths get a greater share of the I/O load. As with the minimumq policy, this policy is able to adapt dynamically to changes in the storage environment that affect path performance. The bookkeeping overhead for this policy, which is the overhead involved in calculating path priorities and scheduling I/Os in proportion to priority, tends to be higher than the overhead for minimumq.

·        balanced: This policy attempts to optimize disk array cache usage while balancing the I/O load on the available paths. The tunable parameter dmp_pathswitch_blks_shift is used with this policy. See "Optimizing disk array cache usage with the balanced policy".

·        minimumq: This is the default and recommended I/O policy in DMP. See "Dynamic Multi-Pathing default I/O policy".

·        priority: In this policy, administrators can manually assign priorities to paths and DMP will distribute I/Os among the paths in proportion to their priority. The priority is an integer value; a higher value indicates that the path should be given a larger share of the I/O load. As an example, if there are two paths with priorities 1 and 2, the path with priority 2 will get two-thirds of the I/Os, while the other one will get a third. This policy may be useful in certain circumstances where administrators want to carefully control the I/O flow on paths. Since the policy is based on a static division of I/O load, it is difficult to employ this policy well in complex environments where unpredictable changes in path performance are common.

·        round-robin: In this policy, I/Os are sent down different paths in a randomized fashion so that load is spread evenly across all paths. The overhead of path selection in this policy is lower than in minimumq. Hence, round-robin may be used instead of minimumq when minimizing CPU utilization is a priority.

·        singleactive: In this policy, only one of the available paths is used for I/Os. If the active path fails, one of the other paths is made active. That is, this policy uses multiple paths only for high availability, not for load balancing. Only an outright path failure causes a new path to be chosen as the active path. In cases where the performance delivered by the active path drops, such as because of problems in the storage network, the policy continues to use the same path even though other paths may be capable of delivering better performance.
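
For the priority policy in the list above, the share of I/O each path receives follows directly from the assigned priorities. The Python sketch below computes those shares. It models only the proportional split stated in the text, not how DMP schedules individual requests, and the path names are illustrative.

```python
from fractions import Fraction


def priority_shares(priorities):
    """Return each path's share of the I/O load, proportional to priority."""
    total = sum(priorities.values())
    return {path: Fraction(value, total) for path, value in priorities.items()}


shares = priority_shares({"P1": 1, "P2": 2})
for path, share in shares.items():
    print(path, share)
```

With priorities 1 and 2, the shares come out as one-third and two-thirds, matching the example in the priority policy description.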

 

Dynamic Multi-Pathing I/O throttling

I/O throttling is a mechanism by which Dynamic Multi-Pathing (DMP) temporarily stops issuing I/Os to paths that appear to be underperforming. An enclosure, array or array type can be configured so that I/Os to a path are throttled when the time for which a request has been outstanding on the path exceeds a specified I/O timeout. The timeout is specified when throttling is enabled by using the vxdmpadm setattr command. When throttling is enabled, DMP stops issuing I/Os on a path if there is at least one request that has been outstanding on the path for the specified timeout. I/O throttling requires I/O statistics gathering to be enabled. Timeout-based throttling can be useful in cases where one or more paths are responding slowly to I/Os; when configured properly, the throttling mechanism can limit I/Os sent to the underperforming paths.

The following example illustrates how timeout-based throttling works with 2 I/O paths (P1 and P2), with an I/O timeout of 10 seconds and minimumq as the I/O policy:

·        DMP gets a request, R1, which it issues on path P1. After 8 seconds, DMP receives a request, R2, which it issues on path P2, a request, R3, which it issues on path P1, and a request, R4, which it issues on path P2, following the minimum queue (the minimumq I/O policy) logic.

·        After another 3 seconds, which is 11 seconds after R1 was issued, DMP receives requests R5 and R6. Since the throttling timeout has been exceeded for path P1, DMP issues both R5 and R6 on path P2, even though P1 would have been the normal choice for one of the requests based on the minimum queue logic.

·        After another 2 seconds, R1 completes.

·        After another second, DMP receives a request R7. At this point, neither path P1 nor P2 has a request that has been outstanding for 10 seconds, so neither will be throttled. Since P1 has the smaller queue, DMP issues R7 on path P1.
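
The walk-through above can be reproduced with a small Python model. It is a sketch under stated assumptions, not DMP code: a path is treated as throttled while any of its requests has been outstanding for longer than the timeout, ties in queue length go to the first path, and if every path were throttled the model would fall back to considering all paths.

```python
TIMEOUT = 10  # seconds, as in the example above


def is_throttled(outstanding, now, timeout=TIMEOUT):
    """True if any request on the path has been outstanding past the timeout."""
    return any(now - issued_at > timeout for issued_at in outstanding.values())


def choose_path(paths, now):
    """Pick the shortest queue among paths that are not throttled."""
    eligible = [name for name in paths if not is_throttled(paths[name], now)]
    candidates = eligible or list(paths)  # fallback is an assumption
    return min(candidates, key=lambda name: len(paths[name]))


paths = {"P1": {}, "P2": {}}  # path name -> {request: issue time}
placement = {}                # request -> path it was issued on


def issue(request, now):
    path = choose_path(paths, now)
    paths[path][request] = now
    placement[request] = path


def complete(request):
    del paths[placement[request]][request]


issue("R1", 0)
for request in ("R2", "R3", "R4"):
    issue(request, 8)
for request in ("R5", "R6"):
    issue(request, 11)   # R1 has now been outstanding for 11 seconds
complete("R1")           # at 13 seconds
issue("R7", 14)
print(placement)
```

The resulting placement matches the example: R5 and R6 both go to P2 while P1 is throttled, and R7 returns to P1 once R1 has completed.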

The statistics obtained from DMP or from operating system utilities such as iostat can be used to determine whether it is appropriate to enable I/O throttling. If the I/O service times on one or a few paths are seen to be much higher than others on a fairly balanced request distribution, timeout-based throttling may be appropriate for the concerned enclosures. The timeout value can be based on the observed service times on healthy paths. A larger value of the I/O timeout generally allows more requests to get queued before throttling happens, essentially delaying the detection of slow paths. A smaller value of the timeout can limit the number of concurrent requests.

 

Tuning Dynamic Multi-Pathing error handling

The error handling capabilities of Dynamic Multi-Pathing (DMP) are key to its goal of providing the high availability and performance made possible by multiple I/O paths. Aspects of error handling in DMP can be tuned to get the behavior desired in a particular environment. In most cases, the default settings for the tunable parameters governing error handling work well, so they should be changed with caution. The following questions illustrate tuning decisions related to error handling that an administrator can make, along with the relevant tunable parameters:

·        Should DMP try to get detailed error information from the HBA interface in order to enable better error analysis (dmp_fast_recovery)?

·        How many times should DMP retry I/O on a path that is giving a path busy error before marking it as failed (dmp_retry_count)?

·        How long should DMP wait after sending a SCSI command before timing out (dmp_scsi_timeout)?

·        Should DMP detect and avoid intermittently failing paths (dmp_health_time and dmp_path_age)?

·        How long should DMP wait after a failover, such as with an active-passive array, before commencing I/Os (dmp_delayq_interval)?

·        How should DMP behave when all paths to a LUN have failed, which in some cases indicates an array undergoing maintenance (dmp_lun_retry_timeout)?

 

Dynamic Multi-Pathing SCSI bypass

A key to intelligent error analysis in Dynamic Multi-Pathing (DMP) is its SCSI bypass capability. For normal application I/O, DMP effectively sits on top of the SCSI layer. DMP performs path selection and then issues the I/O to the SCSI driver. But, in case an I/O fails, DMP can bypass the SCSI layer to get more detailed error information directly from the HBA interface. The SCSI bypass capability is enabled and disabled using the tunable parameter dmp_fast_recovery. SCSI bypass gives DMP access to error information that the SCSI layer would normally not pass on and allows DMP to make better decisions on how errors should be handled. Although provided as a tunable parameter, in most cases you should not disable this capability.

 

 

Dynamic Multi-Pathing I/O failure handling

When an I/O fails on a path, Dynamic Multi-Pathing (DMP) tries to route the I/O on an alternate path while performing error analysis in parallel. When an I/O fails on repeated retries, DMP will, at some point, decide to fail the I/O instead of retrying it on the available paths. The decision on when to fail an I/O is based on how I/O failure handling has been configured for the enclosure. The following are the two different methods of failure handling that can be configured:

·        Fixed retry method: In this method, DMP fails an I/O that has not succeeded after a fixed number of retries. The fixed-retry method can be configured for an enclosure, array, or array-type using the vxdmpadm setattr command, and the number of retries to use is specified as part of the command.

·        Time-bound method: In this method, when a certain time limit in seconds is reached, DMP stops retrying an I/O and fails it. The default time limit is 300 seconds. The time limit for the time-bound method can be changed for an individual enclosure, array or array-type using the vxdmpadm setattr command. Lowering the value of the time limit causes DMP to fail I/Os sooner; if the value is set too low, DMP might fail I/Os that could have succeeded.
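
The difference between the two methods comes down to which limit is checked before an I/O is retried again. The Python sketch below models that decision. The method labels, function name, and parameters are illustrative; the only value taken from the text is the 300-second default time limit. The retry count has no default here because the text does not give one for the fixed retry method.

```python
def should_fail_io(method, retries_done=0, elapsed_seconds=0.0,
                   retry_count=None, time_limit=300):
    """Return True if the I/O should be failed rather than retried again."""
    if method == "fixedretry":
        if retry_count is None:
            raise ValueError("fixed retry needs an explicit retry_count")
        # Fail once the configured number of retries has been used up.
        return retries_done >= retry_count
    if method == "timebound":
        # Fail once the I/O has been retried for the configured time limit.
        return elapsed_seconds >= time_limit
    raise ValueError(f"unknown method: {method}")


print(should_fail_io("fixedretry", retries_done=3, retry_count=5))
print(should_fail_io("timebound", elapsed_seconds=301))
```

Lowering time_limit in this model makes the second check succeed earlier, which mirrors the caution above about failing I/Os that might still have succeeded.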

 

Avoiding suspect paths in Dynamic Multi-Pathing

Dynamic Multi-Pathing (DMP) tries proactively to identify paths whose health is suspect and avoids these paths until their health is verified. To understand why this is important for good performance, consider the case where DMP issues an I/O on a path and the I/O fails because the path is faulty. The I/O might eventually succeed on another path, but the application will see a higher completion time for the I/O because of the time spent on the faulty path; this includes SCSI and HBA timeouts and DMP error processing time. If DMP could avoid the faulty path in the first place, performance would be better; of course, this is not always possible.

One way in which DMP tries proactively to identify and avoid I/O paths that might be faulty is by monitoring for Fibre Channel events that notify of errors in the Fibre Channel fabric. This behavior is controlled by the tunable parameter dmp_monitor_fabric. DMP also uses the notion of a Subpath Failover Group (SFG) along with the tunable parameter dmp_sfg_threshold to mark a whole group of related paths as suspect when it sees errors on some of the paths in the group.

 

DMP, by default, also tries to avoid scheduling I/Os on a path that is failing intermittently until it sees evidence that the state of the path is stable. When a path that was previously marked faulty is detected as healthy through one of DMP's path analysis features, DMP tracks the health of the path to make sure it is stable. If the state of the path changes to failed once again within a specified time, as specified by the tunable parameter dmp_health_time, the path is considered to be intermittently failing. In this case, DMP does not schedule I/Os on the path until the path is seen to stay healthy for a specified period, as specified by the tunable parameter dmp_path_age. The default value of dmp_health_time is 60 seconds and dmp_path_age is 300 seconds. If one or both of these tunable parameters is set to 0, DMP does not detect intermittently failing paths. The overhead of tracking paths that are failing intermittently is low and it is recommended that this feature be left enabled.
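
The interaction between dmp_health_time and dmp_path_age can be modeled in a few lines of Python. This is a sketch of the rules as stated above, with times in seconds. The function names are illustrative, and treating a failure exactly at the dmp_health_time boundary as "within" the window is an assumption.

```python
def is_intermittently_failing(restored_at, failed_again_at,
                              dmp_health_time=60, dmp_path_age=300):
    """True if a restored path failed again within dmp_health_time seconds."""
    # Setting either tunable to 0 turns detection off, per the text.
    if dmp_health_time == 0 or dmp_path_age == 0:
        return False
    return (failed_again_at - restored_at) <= dmp_health_time


def io_allowed(now, healthy_since, intermittent, dmp_path_age=300):
    """True if I/O may be scheduled on a currently healthy path."""
    if not intermittent:
        return True
    # An intermittently failing path must stay healthy for dmp_path_age.
    return (now - healthy_since) >= dmp_path_age


flapping = is_intermittently_failing(restored_at=100, failed_again_at=130)
print(flapping)
print(io_allowed(now=500, healthy_since=300, intermittent=flapping))
print(io_allowed(now=600, healthy_since=300, intermittent=flapping))
```

With the default values, a path that fails again 30 seconds after being restored is treated as intermittently failing, and it is not used again until it has stayed healthy for 300 seconds.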

 

 

Dynamic Multi-Pathing tunable parameters for error handling

The error handling behavior of Dynamic Multi-Pathing (DMP) can be controlled using the following tunable parameters:

dmp_fast_recovery

This tunable parameter controls whether DMP tries to obtain error information directly from the HBA interface, bypassing the SCSI layer. For AIX, Linux, and Solaris, the default is on and is the recommended value. For HP-UX, the default value is off and is the recommended value.

dmp_retry_count

When a path returns a path busy error on an I/O for dmp_retry_count consecutive attempts, DMP marks the path as failed. The default value of this parameter is 5.

dmp_delayq_interval

This is the time interval for which DMP waits before retrying an I/O in the case where an array fails over to a standby path. Some arrays are not capable of accepting I/O requests immediately after failover. The default value for this parameter is 15 seconds. This parameter should be changed only with proper knowledge of the characteristics of all the arrays attached to a system.

dmp_health_time

This parameter specifies the time in seconds for which a path must stay healthy; a path whose state changes from enabled to disabled within this time is marked intermittently failing. DMP does not enable I/Os on such paths until dmp_path_age seconds elapse. A value of 0 prevents DMP from detecting intermittently failing paths.

dmp_path_age

This parameter works with dmp_health_time as described above. A value of 0 prevents DMP from detecting intermittently failing paths.

dmp_lun_retry_timeout

This parameter determines DMP behavior in the event that all paths to a LUN have failed. This can sometimes be a transient error and can happen with some disk arrays during controller firmware upgrade, for example. This tunable parameter provides a way to handle such situations without failing the I/O outright. dmp_lun_retry_timeout specifies the time for which an I/O should be retried before failing it. The default is 0 which means I/Os are not retried in this situation. Instead of changing the default value of this parameter, it is recommended that enclosure-specific settings be changed for arrays that need this kind of handling.

dmp_scsi_timeout

This parameter specifies the timeout value for any SCSI command issued by DMP. The default value of this parameter is 30 seconds for Solaris and AIX, 20 seconds for Linux, and 60 seconds for HP-UX. This parameter should generally be left at the default value unless some other parameter, such as the Fibre Channel timeout, has been changed.

 

 

Dynamic Multi-Pathing path analysis

Path analysis helps improve the status information Dynamic Multi-Pathing (DMP) has on I/O paths, such as whether an I/O path is healthy or faulty. Good status information on I/O paths helps DMP provide better performance. With good status information on I/O paths, DMP can proactively avoid faulty paths during path selection. If an I/O path is faulty, DMP discovers the fact when it schedules I/O on the path and the I/O fails. However, there is a performance penalty when I/O is issued on a path that is faulty: error detection itself involves timeouts and retries, following which the I/O must be re-issued on another path. In many cases, the status information on I/O paths gathered by the path analysis features helps DMP avoid faulty paths.

When paths that were previously marked as failed are known to be healthy again, DMP can start using these paths in its load balancing logic. This gives DMP more options for scheduling I/Os and in general results in better performance. The path analysis features help DMP identify previously faulty paths that have recovered.

Path status updates happen in two ways in DMP:

·        In response to storage network fabric events and I/O errors.

·        Periodically, as part of a process called path restoration.

The mechanism used for path analysis is path probing, where a SCSI inquiry command is issued on a path to determine whether it is healthy or faulty. If the inquiry command fails, the path can be assumed to be faulty. Issuing path probes and the associated processing has an overhead. This overhead is usually small, and can be managed using the tunable parameters for path analysis.

In some cases, DMP might mark the status of a path as suspect; this usually happens when DMP has encountered an error on a related path. DMP does not issue new I/Os on a suspect path until its status has been verified, unless the path is the last path available for the DMP node.

 

Subpath Failover Group

Dynamic Multi-Pathing (DMP) uses the concept of a Subpath Failover Group (SFG) to make path analysis more efficient. An SFG is a group of I/O paths from the same HBA port to the same array port. Essentially, paths in an SFG are paths to different LUNs that share the same sub-path from the HBA port to the array port, meaning that the paths have the same endpoints in the storage network. Even in storage environments with a large number of LUNs and a large number of I/O paths, there are likely to be relatively few SFGs. Since paths in the SFG all have the same route through the storage network, these paths are likely to fail and recover at the same time as faults and recovery happen in the network fabric. DMP is able to use this fact to optimize path analysis by sometimes taking collective action for all paths in an SFG rather than for each path individually.
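
The grouping described above can be sketched in Python by keying each path on its HBA port and array port. The path, port, and LUN names below are made up for the example, and the tuple layout is an assumption of this sketch rather than anything DMP exposes.

```python
from collections import defaultdict


def group_into_sfgs(paths):
    """Group (path, hba_port, array_port) tuples by their shared sub-path."""
    groups = defaultdict(list)
    for name, hba_port, array_port in paths:
        groups[(hba_port, array_port)].append(name)
    return dict(groups)


# Three LUNs, each reachable through two HBA-port/array-port pairs.
paths = [
    ("lun0_a", "hba0", "ctlrA_p0"), ("lun0_b", "hba1", "ctlrB_p0"),
    ("lun1_a", "hba0", "ctlrA_p0"), ("lun1_b", "hba1", "ctlrB_p0"),
    ("lun2_a", "hba0", "ctlrA_p0"), ("lun2_b", "hba1", "ctlrB_p0"),
]
sfgs = group_into_sfgs(paths)
for endpoints, members in sfgs.items():
    print(endpoints, members)
```

Six paths collapse into two groups here, which illustrates why the number of SFGs stays small even as the number of LUNs grows.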

 

Path analysis on path errors and fabric events

When Dynamic Multi-Pathing (DMP) encounters a path error on an I/O, it can initiate path state changes for other paths in the same Subpath Failover Group (SFG). This feature is governed by the following tunable parameter:

dmp_sfg_threshold

When DMP detects that the number of failed I/O paths in an SFG has reached this threshold, it marks all the paths in the SFG as suspect. DMP does not issue new I/Os on a suspect path until its state has been verified, unless the path is the last path available. This behavior of DMP helps proactively avoid faulty paths, since an error in one or more paths of an SFG likely points to a problem that might affect all paths in the SFG. If the dmp_sfg_threshold parameter is set to 0, path analysis based on SFG is disabled, which means that on a path error, DMP does not take any action for other paths in the same SFG. The default value of this parameter is 1, in which case the failure of any one path in an SFG causes other paths in the SFG to be marked as suspect.
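The threshold rule can be sketched as follows. This is a simplified model, not the DMP implementation: the apply_sfg_threshold function and the state strings are hypothetical, and the sketch assumes that paths already marked failed stay failed while the remaining paths in the SFG become suspect.

```python
def apply_sfg_threshold(states, sfg_threshold=1):
    """Return new path states for one SFG after applying the threshold rule.

    states maps a path name to "healthy", "failed", or "suspect".
    A threshold of 0 disables SFG-based path analysis.
    """
    if sfg_threshold == 0:
        return dict(states)
    failed = sum(1 for s in states.values() if s == "failed")
    if failed < sfg_threshold:
        return dict(states)
    # Assumption: failed paths keep their state; other paths become suspect.
    return {n: ("suspect" if s == "healthy" else s) for n, s in states.items()}


sfg = {"p1": "failed", "p2": "healthy", "p3": "healthy"}
print(apply_sfg_threshold(sfg, sfg_threshold=1))
print(apply_sfg_threshold(sfg, sfg_threshold=0))
```

With the default threshold of 1, a single failed path is enough to make the other paths in the SFG suspect; with a threshold of 0, the other paths are left untouched.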

 

DMP can also use network fabric monitoring to detect events that might affect the state of I/O paths. This feature is controlled by the following tunable parameter:

dmp_monitor_fabric

When this parameter is set to on, DMP monitors for Fibre Channel events and updates the path status based on the events received. Typically, fabric events affect all paths in one or more SFGs rather than just an individual path. Fabric event monitoring works with Fibre Channel event information that DMP builds as part of discovery to identify the paths that might be affected by an event. Fabric monitoring uses the Storage Networking Industry Association (SNIA) HBA API. This vendor-specific and platform-specific HBA API library must be available for this feature to work. The default value of dmp_monitor_fabric on Solaris, HP-UX, and Linux is on, and on these platforms Symantec recommends that you set this parameter to on. The default value of dmp_monitor_fabric on AIX is off, and Symantec recommends that you set this parameter to off on AIX to avoid performance issues.

 

 

Overview of path restoration

The main component of path analysis in Dynamic Multi-Pathing (DMP) is path restoration, in which a kernel thread that is named the restore daemon periodically issues probes on a collection of I/O paths to determine their state. Path restoration is a useful mechanism that augments other ways in which DMP collects status information about I/O paths, namely as part of regular I/Os, some of which might return an error, and by monitoring fabric events.

Tuning path restoration is mainly about achieving the desired balance between better status information on paths and lower overhead due to path restoration: more frequent and comprehensive path probing results in better status information, but adds more overhead.

 

Default path restoration in Dynamic Multi-Pathing

Path restoration is enabled by default in Dynamic Multi-Pathing (DMP); the tunable parameter dmp_enable_restore, which can be used to turn on or turn off path restoration, has the value of on by default. The default behavior for path restoration is aimed at probing only a subset of the I/O paths rather than probing all paths; the paths probed are those that are likely to benefit DMP operation the most. In most cases, the default behavior of path restoration has low overhead, but yields the following benefits:

·        Path probing happens at a relatively large interval of 300 seconds, which is the default value of the tunable parameter dmp_restore_interval. The restore daemon wakes up once in this interval and initiates probing of I/O paths. Since the default value of the interval is large, it helps keep path restoration overhead low.

·        Rather than probe all I/O paths, the default setting results in probing of the following paths in each interval:

        ·        Paths that have been marked as failed due to previous errors

        ·        Paths that have been marked suspect

        ·        Paths for LUNs that have been idle for a while, meaning the LUNs have had no I/Os in a while

As a result, the number of paths probed in each interval by the restore daemon is usually much less than the total number of paths; this keeps the overhead of path probing low. This default behavior of probing failed, suspect, and idle paths results from the default values of two tunable parameters: dmp_restore_policy, which specifies the path restoration policy and has a default value of check_disabled, and dmp_probe_idle_lun, which has a default value of on. In addition, I/O statistics gathering has to be enabled for dmp_probe_idle_lun to have effect. With I/O statistics gathering enabled and dmp_probe_idle_lun=on, the check_disabled policy probes failed and suspect paths, as well as paths for idle LUNs.

·        The rationale for focusing on failed paths, suspect paths, and idle LUN paths is that these are the paths for which DMP’s status information is most likely to be outdated and hence probing these paths is likely to give the most benefit. The status information that DMP has on active paths on which I/Os are being issued is generally likely to be good; there is not much to be gained by additional periodic probing of these paths as part of path restoration.

·        By default, DMP uses an optimization called low impact path probing that relies on the concept of the SFG to further reduce the overhead of path probing. This optimization allows DMP to probe only a few paths per SFG and infer the state of other paths in the SFG based on these few probes. The low impact path probing optimization is controlled by two tunable parameters: dmp_low_impact_probe, which enables or disables this feature and has the default value of on, meaning that the feature is enabled, and dmp_probe_threshold, which decides the number of paths probed in each SFG and has the default value of 5.
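The low impact path probing optimization can be sketched roughly as shown below. This is an illustrative model under stated assumptions, not the DMP implementation: the low_impact_probe function, the probe callback, and the path names are hypothetical. The rule it follows is that only up to dmp_probe_threshold suspect paths in an SFG are probed first; if all of them fail, the rest are marked failed without being probed, and if at least one is healthy, the remaining suspect paths are probed individually.

```python
def low_impact_probe(suspect_paths, probe, probe_threshold=5):
    """Return a dict of path name -> "healthy" or "failed" for one SFG.

    probe is a callable that takes a path name and returns True if healthy.
    """
    sample = suspect_paths[:probe_threshold]
    results = {name: probe(name) for name in sample}
    if not any(results.values()):
        # Every sampled path failed: infer failure for the whole group.
        return {name: "failed" for name in suspect_paths}
    # At least one sampled path is healthy: probe the rest one by one.
    for name in suspect_paths[probe_threshold:]:
        results[name] = probe(name)
    return {n: ("healthy" if ok else "failed") for n, ok in results.items()}


# Hypothetical SFG with 8 suspect paths where the shared sub-path is down.
probes_issued = []


def all_down(name):
    probes_issued.append(name)
    return False


state = low_impact_probe([f"p{i}" for i in range(8)], all_down)
print(len(probes_issued), "probes issued for", len(state), "paths")
```

In the example, a fault on the shared sub-path is confirmed with 5 probes instead of 8; the saving grows with the number of LUNs behind each SFG.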

 

Enabling or disabling path restoration

Path restoration can be enabled or disabled using the tunable parameter dmp_enable_restore. Path restoration is enabled by default, and is an important mechanism by which Dynamic Multi-Pathing (DMP) updates the status information on I/O paths. If path restoration is disabled, the status information that DMP has on the paths is not as accurate as when path restoration is enabled. In some cases, this may result in DMP sending I/Os down faulty paths. The bigger problem with disabling path restoration is that most paths that had failed but have now recovered are not recognized as healthy by DMP; over time this can create a scenario in which DMP has very few scheduling options and therefore operates sub-optimally.

Symantec recommends that you do not disable path restoration. If the performance overhead of path restoration is a concern, you should keep path restoration enabled and tune other parameters to reduce overhead. The default settings for path restoration result in low overhead and are appropriate for most environments, but you can further reduce the overhead of path restoration through tuning.

 

Path restoration policy

When path restoration is enabled, the restore policy determines which paths are probed. The value of the tunable parameter dmp_restore_policy determines which path restoration policy is in effect. You can specify the following parameter values:

check_disabled

This is the default path restoration policy. It is a low overhead policy that mainly probes paths that have been marked as failed or as suspect due to previous errors to see if these paths are now healthy. The overhead of the policy can be further reduced by enabling the low impact path probing optimization via the dmp_low_impact_probe parameter, which uses the notion of an SFG to probe a subset of suspect paths rather than all of them. If idle LUN probing, which is set by the dmp_probe_idle_lun parameter, and statistics gathering are enabled, paths for idle LUNs are probed in addition to failed and suspect paths.

See "Tunable parameters for path restoration"

The status information on I/O paths gathered by this policy complements the status information that DMP gets as a result of regular application I/O. Regular I/O on active paths gives DMP knowledge of the health of those paths, and the check_disabled policy in path restoration gives DMP current status information on previously failed and suspect paths, and optionally, on paths to idle LUNs. If the DMP I/O policy in effect does not distribute the I/O load across the available paths, as is the case with the single-active I/O policy, the check_disabled path restoration policy results in limited status information on important I/O paths. However, the default I/O policy (minimumq) does a good job of distributing I/Os on all available paths, so the check_disabled policy’s focus on probing only failed and suspect paths and idle LUNs works well with the default I/O policy. With active/passive arrays, status information on secondary paths is less current when the check_disabled policy is in effect; this is usually acceptable since secondary paths come into play only after a controller failover.
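The path selection performed by check_disabled can be sketched as follows. This is a simplified model, not DMP code: the paths_to_probe function and the dictionary keys are hypothetical, and low impact probing is deliberately left out to keep the selection rule visible.

```python
def paths_to_probe(paths, probe_idle_lun=True, iostat_enabled=True):
    """Select path names that check_disabled would consider in one interval.

    Each path is a dict with the keys "name", "state", and "lun_idle".
    Idle LUN paths are included only when both idle LUN probing and
    I/O statistics gathering are enabled.
    """
    selected = []
    for p in paths:
        if p["state"] in ("failed", "suspect"):
            selected.append(p["name"])
        elif probe_idle_lun and iostat_enabled and p["lun_idle"]:
            selected.append(p["name"])
    return selected


paths = [
    {"name": "p1", "state": "healthy", "lun_idle": False},
    {"name": "p2", "state": "failed", "lun_idle": False},
    {"name": "p3", "state": "suspect", "lun_idle": False},
    {"name": "p4", "state": "healthy", "lun_idle": True},
]
print(paths_to_probe(paths))
print(paths_to_probe(paths, iostat_enabled=False))
```

The active, healthy path p1 is never selected, which reflects the point made above: regular application I/O already keeps its status current.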

check_all

In this policy, all I/O paths are probed in each interval. This policy gives the best status information of all the path restoration policies, but its overhead can be a concern especially in environments with a large number of I/O paths.

check_periodic

This policy tries to get the benefits of both check_disabled and check_all. It uses check_all only once every N intervals, where N is the value of the tunable parameter dmp_restore_cycles; other intervals use the check_disabled policy. The default value of dmp_restore_cycles is 10. Thus, by default, check_periodic incurs the higher overhead of the check_all policy only in 1 of 10 intervals; in the other 9 intervals, it uses the low overhead check_disabled policy. In the intervals in which the check_disabled policy is in effect, low impact probing and idle LUN probing are used if the relevant tunable parameters enable them. The check_periodic policy offers a way to benefit from the low overhead of check_disabled most of the time, while infrequently collecting the more complete status information that check_all provides. For environments with a large number of paths, the overhead in the interval with the check_all policy can still be a concern.
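The alternation between the two policies can be sketched as below. This is an illustration only; the policy_for_interval function is a hypothetical name, and the sketch assumes that intervals are counted from 1 so that check_all runs in every Nth interval.

```python
def policy_for_interval(interval_number, restore_cycles=10):
    """Return the policy that check_periodic applies in a given interval.

    interval_number counts restore daemon wake-ups starting from 1.
    """
    if interval_number % restore_cycles == 0:
        return "check_all"
    return "check_disabled"


schedule = [policy_for_interval(i) for i in range(1, 21)]
print(schedule.count("check_all"), "check_all intervals out of", len(schedule))
```

With the default dmp_restore_interval of 300 seconds and dmp_restore_cycles of 10, this corresponds to a full probe of all paths once every 3000 seconds.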

check_alternate

This policy checks that there are at least two healthy paths for each DMP node. Typically, a DMP node corresponds to a LUN. In effect, the policy tries to ensure that in the event of a single path failure, there will be another healthy alternate path to provide availability. The policy probes I/O paths for each DMP node until it finds two paths that are healthy. If the policy is not able to find two healthy paths, it generates a notification. If the policy finds two healthy paths, it does not probe more paths for the DMP node. In the ideal case where there are no faulty paths, this policy probes exactly two paths per DMP node; in real life scenarios where some of the paths may be faulty, the number of paths probed by this policy will be somewhat more. In environments with a large number of LUNs and very few I/O path failures, the check_disabled policy can have a lower overhead than the check_alternate policy.
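The probing loop of check_alternate for a single DMP node can be sketched as follows. This is a simplified model, not the DMP implementation: the check_alternate function, the probe callback, and the return shape are hypothetical.

```python
def check_alternate(node_paths, probe):
    """Probe paths of one DMP node until two healthy paths are found.

    probe is a callable that takes a path name and returns True if healthy.
    Returns (probed path names, healthy count, notify flag).
    """
    probed = []
    healthy = 0
    for name in node_paths:
        probed.append(name)
        if probe(name):
            healthy += 1
            if healthy == 2:
                break  # Two healthy paths found: stop probing this node.
    return probed, healthy, healthy < 2


# Hypothetical node with four paths where only path "a" is faulty.
probed, healthy, notify = check_alternate(["a", "b", "c", "d"], lambda n: n != "a")
print(probed, healthy, notify)
```

In the example, three paths are probed because the first one is faulty; with no faulty paths, exactly two would be probed.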

 

 

Tunable parameters for path restoration

The following list describes the tunable parameters related to path restoration:

dmp_enable_restore

This tunable parameter enables or disables path restoration. When path restoration is enabled, a restore daemon is started that periodically probes I/O paths and updates their status based on the results of the probes. Path restoration can be disabled by setting dmp_enable_restore to off. Other mechanisms in DMP for updating path status, such as updates based on fabric events, continue to function independently even when path restoration is disabled. To disable path restoration when the restore daemon is already running, use the vxdmpadm stop restore command. Since path restoration is a useful mechanism by which DMP keeps its path status information updated, Symantec does not recommend disabling path restoration. If the overhead of path restoration is a concern, other parameters for path restoration can be tuned to reduce this overhead. The default value of dmp_enable_restore is on, meaning that path restoration is enabled by default.

dmp_restore_interval

This tunable parameter specifies the interval at which the restore daemon wakes up and initiates path probing. Increasing the value of this parameter decreases path restoration overhead since path probing is initiated less frequently. Decreasing the value of this parameter below the default value is generally not recommended as it can increase path restoration overhead and affect performance adversely. The default value is 300 seconds.

dmp_restore_policy

This tunable parameter specifies the path restoration policy, which largely determines which paths are probed as part of path restoration. Path restoration policies are described in more detail in a separate section above. The default policy is check_disabled, which probes paths that have been marked as failed or as suspect, and based on other settings can probe the I/O paths of idle LUNs.

dmp_restore_cycles

This tunable parameter takes effect with the check_periodic policy. Since the default policy is check_disabled, this parameter does not affect path restoration unless the policy is changed via the dmp_restore_policy tunable parameter. If dmp_restore_cycles=N and the policy in effect is the check_periodic policy, then DMP uses the check_disabled policy for N-1 intervals, and then uses the check_all policy in the Nth interval; this pattern of switching between check_disabled and check_all is repeated. In other words, the check_all policy is used once every dmp_restore_cycles number of intervals and check_disabled policy is used in other intervals. The default value of dmp_restore_cycles is 10.

See "Path restoration policy"

dmp_low_impact_probe

This parameter is used to enable or disable the low impact path probing feature of DMP, which uses the concept of a Subpath Failover Group (SFG) to optimize path probing. Low impact probing is relevant only with the check_disabled policy, and in the check_disabled phases of the check_periodic policy. This feature reduces path probing overhead in the case of suspect paths. When low impact path probing is enabled, instead of probing all suspect paths, DMP probes only a certain number of suspect paths as specified by the tunable parameter dmp_probe_threshold per SFG. If all probed paths turn out to be in failed state, the status of all suspect paths in the SFG is set to failed; if at least one suspect path turns out to be healthy, the remaining suspect paths are probed individually. The default value of dmp_low_impact_probe is on, meaning that the feature is enabled by default.

dmp_probe_threshold

This parameter determines the number of paths to probe per SFG when low impact probing is enabled. The default value of this parameter is 5.

dmp_probe_idle_lun

This parameter is used to turn on or turn off probing of paths to idle LUNs by the restore daemon. Without idle LUN probing, the status information that DMP has for paths to an idle LUN may not be recent because there have been no recent I/Os to the LUN. Hence, if a LUN that was previously idle becomes active, meaning that the LUN receives I/Os, DMP’s scheduling of I/Os may be sub-optimal initially until it builds a clearer picture of path statuses for the LUN. An administrator can choose to maintain improved status information for idle LUNs by keeping idle LUN probing on. Alternatively, an administrator can choose to reduce path restoration overhead by turning off idle LUN probing. In environments where you know that many LUNs will continue to be idle, turning off idle LUN probing might improve performance; however, for the LUNs that become active later, the initial I/Os after the LUNs have become active might incur penalties because the status information on the paths to the LUNs is outdated. For this parameter to take effect, DMP statistics gathering must be enabled. If statistics gathering is not enabled, DMP is not able to identify idle LUNs and this parameter is considered to be off. This parameter is relevant mainly when the path restoration policy is check_disabled. When the policy is check_all, all paths, including paths for idle LUNs, are probed anyway. The default value is on.
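DMP tunable parameters are typically displayed and changed with the vxdmpadm gettune and vxdmpadm settune subcommands. The following minimal sketch only builds such command lines as strings for review; it does not run them. The helper function names and the RESTORE_TUNABLES set are hypothetical, and the exact command behavior should be confirmed against the vxdmpadm manual page for your release.

```python
import shlex

# Parameter names taken from the list above.
RESTORE_TUNABLES = {
    "dmp_enable_restore",
    "dmp_restore_interval",
    "dmp_restore_policy",
    "dmp_restore_cycles",
    "dmp_low_impact_probe",
    "dmp_probe_threshold",
    "dmp_probe_idle_lun",
}


def gettune_command(name):
    """Build the argument list that displays one tunable."""
    if name not in RESTORE_TUNABLES:
        raise ValueError(f"not a path restoration tunable: {name}")
    return ["vxdmpadm", "gettune", name]


def settune_command(name, value):
    """Build the argument list that changes one tunable."""
    if name not in RESTORE_TUNABLES:
        raise ValueError(f"not a path restoration tunable: {name}")
    return ["vxdmpadm", "settune", f"{name}={value}"]


# Example: review the commands for turning off idle LUN probing.
print(shlex.join(gettune_command("dmp_probe_idle_lun")))
print(shlex.join(settune_command("dmp_probe_idle_lun", "off")))
```

Building the command as an argument list rather than a single string keeps it safe to pass to a process launcher later, and the name check catches typing errors before any command reaches the system.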

 

 

Summary of Dynamic Multi-Pathing tuning

Dynamic Multi-Pathing (DMP) has a large number of tunable parameters and configuration options that have an impact on performance. However, in most cases, the default settings are appropriate. Many of the tunable parameters rarely need to be changed. The following list briefly summarizes some of the more common performance tuning changes that an administrator might perform:

·        The I/O policy in DMP determines how load is balanced across multiple paths for better performance. The default I/O policy, minimumq, is the recommended I/O policy. Among the strengths of this policy is its ability to adapt well when there are fluctuations in the performance delivered by paths by automatically redirecting more I/Os to better performing paths. The balanced and round-robin I/O policies have strengths that make them reasonable alternatives to minimumq under certain conditions.

·        When one or more paths to a LUN are performing poorly compared to the others, a load-balancing policy such as minimumq automatically redirects much of the I/O load away from the poorly performing paths. The negative performance impact of slow paths can be further reduced by configuring timeout-based I/O throttling for the relevant enclosures.

·        Path restoration in DMP can be tuned for better state information on I/O paths or for lower operating overhead; the default setting provides a good balance that is suited for most complex environments. In some environments, the overhead of path restoration can be reduced by turning off idle LUN probing, especially if the LUNs are expected to continue to be idle.

·        The tunable parameter dmp_daemon_count determines the number of kernel threads used for DMP administrative activities. On high end servers, increasing the value of this parameter can improve performance.

 
