How the event messages recorded in the /etc/vx/dmpevents.log file can be written out of chronological order

book

Article ID: 100014966

calendar_today

Updated On:

Description

Error Message


DMP event timestamps example: (messages file vs dmpevents.log)


In this example, the HBA link events messages recorded at 20:52:36.000 in the dmpevents.log file supercede the DMP I/O analysis related messages documented at 20:52:32.099.


/etc/vx/dmpevents.log

Sat Jan 17 20:52:22.000: event received from the adapter port 10:00:00:90:fa:1c:aa:c2, fabricportid = 0x   0 event type is port online
Sat Jan 17 20:52:26.000: event received from the adapter port 10:00:00:90:fa:1c:b6:1b, fabricportid = 0x   0 event type is port offline
Sat Jan 17 20:52:28.000: event received from the adapter port 10:00:00:90:fa:1c:b6:1b, fabricportid = 0x   0 event type is port online
.
Sat Jan 17 20:52:36.000: HBA_EVENT_LINK_DOWN event received on the hba 1 Emulex-42C2071-2 fcid = 0x0
Sat Jan 17 20:52:36.000: HBA_EVENT_LINK_UP event received on the hba 1 Emulex-42C2071-2 fcid = 0x0
Sat Jan 17 20:52:36.000: HBA_EVENT_LINK_UP event received on the hba 2 Emulex-42C2071-3 fcid = 0x0

.
Sat Jan 17 20:52:32.099: I/O analysis done as DMP_IOTIMEOUT on Path sdjr belonging to Dmpnode emc1_07f0
Sat Jan 17 20:52:32.099: I/O error occurred (errno=0x206) on Dmpnode emc1_07f0
Sat Jan 17 20:52:32.100: I/O analysis done as DMP_IOTIMEOUT on Path sdjs belonging to Dmpnode emc1_07f1
Sat Jan 17 20:52:32.100: I/O analysis done as DMP_IOTIMEOUT on Path sdjs belonging to Dmpnode emc1_07f1
Sat Jan 17 20:52:32.100: I/O analysis done as DMP_IOTIMEOUT on Path sdjs belonging to Dmpnode emc1_07f1
Sat Jan 17 20:52:32.100: I/O error occurred (errno=0x206) on Dmpnode emc1_07f1
Sat Jan 17 20:52:32.100: I/O error occurred (errno=0x206) on Dmpnode emc1_07f1
Sat Jan 17 20:52:32.100: I/O analysis done as DMP_IOTIMEOUT on Path sdjq belonging to Dmpnode emc1_07ef
Sat Jan 17 20:52:32.100: I/O error occurred (errno=0x206) on Dmpnode emc1_07f1

vs

/var/log/messages:

Jan 17 20:52:11 BARNEY kernel: lpfc 0000:95:00.0: 2:(0):0713 SCSI layer issued Device Reset (3, 1) return x2002
Jan 17 20:52:16 BARNEY kernel: lpfc 0000:11:00.1: 1:(0):0713 SCSI layer issued Device Reset (1, 83) return x2002
Jan 17 20:52:22 BARNEY kernel: lpfc 0000:95:00.0: 2:1303 Link Up Event x1 received Data: x1 xf7 x10 x0 x0 x0 0
Jan 17 20:52:22 BARNEY kernel: lpfc 0000:95:00.0: 2:3172 SCSI layer issued Host Reset Data: x2002
Jan 17 20:52:27 BARNEY kernel: lpfc 0000:11:00.1: 1:3172 SCSI layer issued Host Reset Data: x2002
Jan 17 20:52:28 BARNEY kernel: lpfc 0000:11:00.1: 1:1303 Link Up Event x1 received Data: x1 xf7 x10 x0 x0 x0 0

 

NOTE: DMP is recording the following message using the event time; this may be later than the current system time in which the order DMP populates the dmpevents.log file.

Sat Jan 17 20:52:32.099: I/O analysis done as DMP_IOTIMEOUT on Path sdjr belonging to Dmpnode emc1_07f0

Cause


The messages recorded in the dmpevents.log file are logged from different DMP threads, so you may see a chronological time mismatch at times as the timestamp used, can be either the current time or the event time, however, the right content is available to aid debugging.
 

There are two possibilities why messages can be printed out of chronological sequence.

(1) A thread notes the event time as it happens, then does some more processing before calling the routine to log the actual related message.

In this case another thread having an event time after the first one might end up logging its message first.

(2) A thread notes the event time as it happens, but gets pre-empted for any reason.

In this case also another thread possibly having an event time after the first might log its message before the first one.


Having the exact time of the event can be more helpful when performing code analysis, than the time the message is logged.


NOTE: This can make the dmpevents.log file more difficult to read at times.
 

Resolution


As the DMP event messages are logged using both the event time (i.e. when the event was raised) and the current system time, the chronological ordering can be disordered.

Example

The message “I/O analysis done as DMP_IOTIMEOUT on...” is written to the dmpevents.log with the timestamp based on the vxesd event time.
 

The timestamp can tell the time the event happened, compared to the message sequence stating the time order.

The dmpevents.log messages could be adapted to embed the current time into the message header for all messages, and log the event time into the message body.
This would aid with maintaining the chronological order of the recorded messages, whilst recording the event time within the message body. The downside would be the additional overhead to restructure the message order and content layout.

 

 

 

Issue/Introduction

This article attempts to explain how and why DMP related event messages can be printed out of order (a message having a newer timestamp gets printed before an older one).
The eventsource daemon (vxesd) is a Volume Manager (VxVM) component process, first introduced in 4.0 for Solaris and 5.0 for all other Unix/Linux plaforms. In layman terms, vxesd is a 'helper' process that receives notifications of device related events which are used to take appropriate actions.

The benefits of vxesd include:
  • SAN Fabric event monitoring & pro-active error detection (SAN event)
  • DMP event logging (DMP event)
  • Automated device discovery (OS event)
  • iSCSI topology discovery
Traditionally DMP has taken a reactive approach to failing paths by only disabling paths when active I/O have failed on the storage sub system.
By using the Storage Networking Industry Association (SNIA) HBA API library, vxesd is able to receive SAN fabric events via the HBA.

This information allows DMP to take a pro-active role by checking 'suspect' devices from the SAN events, even if there is no active I/O.

DMP notifies vxesd of major events, which allows vxesd to log the event in a log file (/etc/vx/dmpevents.log).

These events include:
  • Marking paths/dmpnodes enabled
  • Marking paths/dmpnodes disabled
  • Throttling of paths
  • I/O error analysis
  • HBA/SAN events
The logfile is located in /var/adm/vx/dmpevents.log, but, is symbolically linked to /etc/vx/dmpevents.log.