Disk Label and Data corruption can occur when "dmpfs" (temporary file system) is un-mounted

Article ID: 100015161

Description

Error Message

The following errors appear in /var/adm/messages:

scsi:  Label says 71131264 blocks; Drive says 4194304 blocks

scsi: WARNING: /pci@1d,700000/SUNW,qlc@1,1/fp@0,0/ssd@w500507680230e09f,1 (ssd61):
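
To scan a log for this symptom, the label and drive block counts can be pulled out of each message and compared. The following is a minimal sketch only: label_mismatches is a hypothetical helper name, and the pattern assumes the message wording matches the "Label says ... Drive says ..." line shown above.

```shell
# Hypothetical helper (not a Veritas or Solaris tool): reads log text on
# stdin and prints the block counts from every
# "Label says N blocks; Drive says M blocks" message where N differs from M.
label_mismatches() {
    sed -n 's/.*Label says \([0-9][0-9]*\) blocks; Drive says \([0-9][0-9]*\) blocks.*/\1 \2/p' |
    while read -r label drive; do
        if [ "$label" -ne "$drive" ]; then
            echo "label=$label drive=$drive"
        fi
    done
}

# Usage (not run here):
#   label_mismatches < /var/adm/messages
```

The helper only reports mismatches; it does not identify which device the message belongs to, so the surrounding WARNING lines still need to be read by hand.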


The VxVM command vxdisk list also displays incorrect information:

bash-3.2# vxdisk list

DEVICE       TYPE            DISK         GROUP        STATUS

disk_0       auto:SVM        -            -            SVM

disk_1       auto:cdsdisk    -            -            online invalid   <<<<<<  was an SVM disk

disk_2       auto:cdsdisk    -            -            online invalid

storwizev70000_002eca auto:cdsdisk    -            -            online

storwizev70000_002ec6 auto:cdsdisk    storwizev70000_002ec6  testdg      online

storwizev70001_002ec9 auto:cdsdisk    storwizev70001_002ec9  tempdg       online

storwizev70002_002ec8 auto            -            -            error  <<<<<< disk group goes offline and LUN goes into error
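
On a host with many LUNs, rows like the ones marked above can be filtered out of the listing. The following is a rough sketch: flag_suspect_disks is a hypothetical helper name, and it assumes the five-column layout (DEVICE, TYPE, DISK, GROUP, STATUS) shown in the output above. A status of "online invalid" is not by itself proof of corruption, so the result is only a list of devices that may merit a closer look.

```shell
# Hypothetical helper (not part of VxVM): reads `vxdisk list` output on
# stdin and prints the device name and status of every row whose STATUS
# column contains "error" or "invalid".
flag_suspect_disks() {
    awk '$1 != "DEVICE" && NF >= 5 {
        # STATUS may be more than one word, so join fields 5..NF
        status = $5
        for (i = 6; i <= NF; i++) status = status " " $i
        if (status ~ /error|invalid/) print $1 ": " status
    }'
}

# Usage (not run here):
#   vxdisk list | flag_suspect_disks
```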


Data and applications go offline immediately after the following commands are run:

umount /dev/vx/rdmp

umount /dev/vx/dmp

vxdisk -o alldgs list 

Cause

Both /dev/vx/dmp and /dev/vx/rdmp are mounted as temporary file systems (tmpfs) in swap space during the boot cycle. They are mounted by the vxvm-sysboot startup script. VxVM then builds the DMP device entries in the mounted dmp directories in swap space. When /dev/vx/[r]dmp are unmounted from swap, the stale DMP device entries in the underlying UFS directories are exposed.

When a command such as vxdisk -o alldgs list is then run, VxVM queries the device for information using the major/minor number stored in the VxVM vxconfigd/kernel state. Because the information in the UFS file system directory differs from the information in the kernel, VxVM opens the wrong dmpnode.

Any of the following can then happen:

1) VxVM flushes the wrong label to disk

2) The VxVM auto-adjust code is triggered and modifies the label

3) VxVM sends I/O to the wrong dmpnode

There is no cause for concern if the /dev/vx/[r]dmp directories on the root UFS file system contain no entries. These directories are meant to be empty; the dmpnode device tree is meant to be written to temporary swap space on every reboot.

Resolution

A supported hotfix has been made available for this issue. Please contact Veritas Technical Support to obtain this fix. This hotfix has not yet gone through extensive QA testing. Consequently, if you are not adversely affected by this problem and have a satisfactory temporary workaround in place, we recommend that you wait for the public release of this hotfix.

Veritas Technologies LLC currently plans to address this issue by way of a patch or hotfix to the current version of the software. Please note that Veritas Technologies LLC reserves the right to remove any fix from the targeted release if it does not pass quality assurance tests. Veritas’ plans are subject to change and any action taken by you based on the above information or your reliance upon the above information is made at your own risk.

As a workaround, the manual steps below can be performed:

1)  cp /etc/vfstab /etc/vfstab.preVTOC
        Comment out any Veritas file systems in /etc/vfstab so that they do not attempt to mount (not applicable if all Veritas file systems are controlled by Veritas Cluster Server).
2)  Have operators stop applications and background databases, and halt zones if applicable
        Freeze the cluster so that it does not start on bootup
        Shut down the cluster on the problem node

3)  Stop VxVM from starting on reboot
        touch /etc/vx/reconfig.d/state.d/install-db  # This command prevents the tmpfs from mounting
4)  Reboot the system
        shutdown -y -g0 -i6
5)  Check whether dmp is mounted on swap
        mount -v | grep dmp
    After this reboot, entries such as the following should NOT appear:
        swap on /dev/vx/dmp type tmpfs read/write/setuid/devices/rstchown/xattr/dev=6100002 on Mon Feb  9 20:53:06 2015
        swap on /dev/vx/rdmp type tmpfs read/write/setuid/devices/rstchown/xattr/dev=6100003 on Mon Feb  9 20:53:06 2015

    If DMP is mounted, STOP and engage Veritas Technical Support.
 
6)  Check the directories for dmpnode information
        ls -al /dev/vx/rdmp/
        ls -al /dev/vx/dmp/
    No entries (other than . and ..) should be in these directories.
 
7)   Remove the stale entries
        rm /dev/vx/rdmp/*
        rm /dev/vx/dmp/*

 
8)  Remove the file that prevents VxVM from starting
        rm /etc/vx/reconfig.d/state.d/install-db

9)  Copy back /etc/vfstab if it was modified
10) Reboot the system to bring it back up
        shutdown -y -g0 -i6
 
11) Start the cluster; unfreeze the cluster

12) Check that everything is mounted
13) Confirm the two tmpfs mount points are present
        mount -v | grep dmp
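
The checks in steps 5, 6 and 13 can be sketched as two small shell helpers. This is an illustration under stated assumptions, not a Veritas-supplied script: dmp_tmpfs_mounts and count_entries are hypothetical names, and the mount parser assumes the `mount -v` line format shown in the step 5 example ("swap on /dev/vx/dmp type tmpfs ...").

```shell
# Hypothetical helper: reads `mount -v` output on stdin and prints each of
# /dev/vx/dmp and /dev/vx/rdmp that is currently mounted as tmpfs.
dmp_tmpfs_mounts() {
    awk '$2 == "on" && $4 == "type" && $5 == "tmpfs" &&
         ($3 == "/dev/vx/dmp" || $3 == "/dev/vx/rdmp") { print $3 }'
}

# Hypothetical helper: prints the number of entries in the directory given
# as $1, including hidden entries but excluding "." and "..".
count_entries() {
    ls -A "$1" | wc -l | tr -d ' '
}

# Usage (not run here):
#   mount -v | dmp_tmpfs_mounts     # steps 5 and 13
#   count_entries /dev/vx/dmp       # step 6
#   count_entries /dev/vx/rdmp      # step 6
```

At step 5 the first helper is expected to print nothing; at step 13 it is expected to print both mount points. At step 6, a non-zero count from the second helper means stale entries are present in the underlying directory.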

 

 

Applies To

This applies to systems where the VxVM product has been upgraded and stale data exists in the UFS file system directories /dev/vx/[r]dmp. Fresh installations of the product are not affected.

Issue/Introduction

The /dev/vx/[r]dmp directories, where the DMP device entries are created, are mounted as tmpfs (swap) during the boot cycle. They are unmounted during the VxVM upgrade process, when the services are stopped, and are not remounted during the completion phase of the product upgrade. As a result, the DMP device entries are created on the UFS file system, in the root file system directories /dev/vx/[r]dmp. These entries become stale over several boot cycles. If an administrator or user later unmounts the /dev/vx/[r]dmp directories, VxVM can access these stale entries from the UFS directories, which can lead to VTOC or data corruption.

Additional Information

ETrack: 3727939