I/O retry(xx) on Path <device> belonging to Dmpnode <dmpnode>

Article ID: 100008392

Description

Error Message

 Mon Dec 12 16:46:13.517: I/O retry(3) on Path sdl belonging to Dmpnode emc_clariion0_126

Thu Oct 20 00:19:00.733: I/O retry(1796406) on Path sdj belonging to Dmpnode emc_clariion0_121
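To work with these messages programmatically, the retry count, path, and DMP node name can be pulled out of each line. The following is a minimal Python sketch, assuming the message layout is exactly as in the two samples above; the function name `parse_retry_line` is hypothetical and not part of any Veritas tool.

```python
import re

# Pattern based only on the two sample messages above (assumed layout).
RETRY_RE = re.compile(
    r"I/O retry\((?P<count>\d+)\) on Path (?P<path>\S+) "
    r"belonging to Dmpnode (?P<dmpnode>\S+)"
)

def parse_retry_line(line):
    """Return the retry count, path and dmpnode from one log line, or None."""
    match = RETRY_RE.search(line)
    if match is None:
        return None
    return {
        "count": int(match.group("count")),
        "path": match.group("path"),
        "dmpnode": match.group("dmpnode"),
    }

sample = ("Mon Dec 12 16:46:13.517: I/O retry(3) on Path sdl "
          "belonging to Dmpnode emc_clariion0_126")
print(parse_retry_line(sample))
```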

Resolution

There are two error-retry recovery options: timebound and fixed retry. The explanation below applies only to timebound; fixed retry does not do any recalculation.

The message shows the retry count calculated with the following formula:

Number of retries = (io_timeout / DMP total_time)

 

io_timeout is defined by the recovery option set on each enclosure. You can check the current value with the following command:

# vxdmpadm getattr enclosure emc_clariion0 recoveryoption
ENCLR-NAME      RECOVERY-OPTION      DEFAULT[VAL]  CURRENT[VAL]
===============================================================
emc_clariion0  Throttle             Nothrottle[0]  Nothrottle[0]
emc_clariion0  Error-Retry          Timebound[300] Timebound[300]
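If the io_timeout value is needed in a script, it can be read from the CURRENT[VAL] column of the Error-Retry row. The sketch below is in Python and parses only text captured from the command; it assumes the column layout matches the sample output above, and the function name `error_retry_setting` is hypothetical.

```python
import re

# Matches the Error-Retry row: name, option, DEFAULT[VAL], CURRENT[VAL].
# The layout is assumed from the sample output above.
ROW_RE = re.compile(
    r"^(?P<encl>\S+)\s+Error-Retry\s+"
    r"(?P<def_mode>\w+)\[(?P<def_val>\d+)\]\s+"
    r"(?P<cur_mode>\w+)\[(?P<cur_val>\d+)\]\s*$"
)

def error_retry_setting(output):
    """Return (mode, value) from the CURRENT[VAL] column, or None."""
    for line in output.splitlines():
        match = ROW_RE.match(line.strip())
        if match:
            return match.group("cur_mode"), int(match.group("cur_val"))
    return None

sample_output = """\
ENCLR-NAME      RECOVERY-OPTION      DEFAULT[VAL]  CURRENT[VAL]
===============================================================
emc_clariion0  Throttle             Nothrottle[0]  Nothrottle[0]
emc_clariion0  Error-Retry          Timebound[300] Timebound[300]
"""
print(error_retry_setting(sample_output))
```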

 

DMP total_time is the time DMP actually spent sending the I/O to the device before the I/O failed. This time cannot be calculated from the messages, because the log shows only the failed I/O and not its start and finish times.

For example, retry(3) indicates that the calculated retry value was 4, because the message is printed after the retry count has been reduced by 1. Using the formula above with io_timeout = 300, DMP total_time was between 60 and 75 seconds:

300 / 75 = 4  ~  300 / 60 = 5  (the result is rounded)

retry(1796406) corresponds to a DMP total_time of about 0.000167 seconds (= 300 / 1796407).
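The arithmetic above can be reproduced with a short Python sketch. It assumes, as the 60~75 second range implies, that the division result is truncated to an integer; the article only says the result is rounded, so the truncation and the function name `total_time_bounds` are assumptions made for illustration.

```python
def total_time_bounds(printed_retry, io_timeout=300):
    """Estimate the DMP total_time range (seconds) behind a logged retry(N).

    Assumes the logged value is the calculated count minus 1, and that
    the calculated count is io_timeout / total_time truncated to an integer.
    """
    calculated = printed_retry + 1           # value before the decrement
    lower = io_timeout / (calculated + 1)    # a shorter time gives a higher count
    upper = io_timeout / calculated          # longest time that still gives this count
    return lower, upper

print(total_time_bounds(3))          # range for retry(3)
print(total_time_bounds(1796406))    # range for retry(1796406)
```

For retry(3) this gives the 60 to 75 second range; for retry(1796406) both bounds are about 0.000167 seconds.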

 

So, if the DMP I/O takes longer than io_timeout, the retry count is zero and no further retry is attempted. If the DMP I/O takes a very short time, the retry count is a huge value and DMP may retry many times. Remember, however, that the retry count is always recalculated from the last DMP I/O time, so DMP does not necessarily repeat the I/O that many times. Timebound retry limits the number of retries according to the device or SCSI I/O performance on the system. Fixed retry is straightforward: it simply reduces the retry count, and there is no recalculation.
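The contrast between the two options can be illustrated with a toy Python model. This is not DMP's implementation: it only applies the formula from this article with an assumed integer truncation, and both function names are hypothetical.

```python
def timebound_retries(io_timeout, last_io_time):
    """Retry count recalculated from the last DMP I/O time (toy model)."""
    if last_io_time <= 0:
        raise ValueError("last_io_time must be positive")
    # Assumed truncation: an I/O slower than io_timeout yields zero retries.
    return int(io_timeout // last_io_time)

def fixed_retries(remaining):
    """Fixed retry: just decrement, with no recalculation (toy model)."""
    return max(remaining - 1, 0)

print(timebound_retries(300, 400))       # slower than io_timeout: no retry
print(timebound_retries(300, 70))        # a 70 second I/O
print(timebound_retries(300, 0.000167))  # a very fast failure: huge count
print(fixed_retries(5))                  # fixed retry simply counts down
```

Because the timebound count is recomputed after every I/O, a huge count from one very fast failure shrinks again as soon as a later I/O takes longer.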

 


Applies To

Red Hat Enterprise Linux 5 / SF 5.1SP1

Issue/Introduction

When DMP encounters an I/O error, it retries the I/O under certain conditions and logs messages like those shown under Error Message in /etc/vx/dmpevents.log.