A rare condition can corrupt a page of data when using the VxFS Cached ODM or Cached QIO features on Solaris or AIX

Article ID: 100014499

Resolution

Veritas recommendation:

If you are using either Cached QIO or Cached ODM, disable these features using the vxtunefs command and remove the corresponding tunable settings from the /etc/vx/tunefstab file.

 

How to determine if the Cached ODM or Cached QIO features are in use:

The Cached ODM and Cached QIO features are disabled by default. To use either of these features you need to enable it explicitly using the vxtunefs command or by adding an appropriate entry to the /etc/vx/tunefstab file. Note that Cached QIO and Cached ODM have exactly the same enablement mechanism, so enabling one enables the other.

You can determine if these features are enabled using the following command for a file system that is currently mounted (example only):

# vxtunefs | grep qio_cache_enable

qio_cache_enable=1

 

If the output contains "qio_cache_enable=1", both Cached ODM and Cached QIO are enabled for the file system.

You can determine if either of these features is automatically enabled at mount time using the following command (example only):

# grep cache_enable /etc/vx/tunefstab

/dev/vx/dsk/datadg/db03  read_pref_io=128k,odm_cache_enable=1

/dev/vx/dsk/datadg/db04  read_pref_io=256k,qio_cache_enable=1

 

If the output contains either "odm_cache_enable=1" or "qio_cache_enable=1", then Cached ODM and Cached QIO are enabled at mount time for the corresponding file system.

Please refer to the man pages for vxtunefs(1M) and tunefstab(4) for more details.
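
The tunefstab check above can also be scripted. The following is a minimal sketch, not a Veritas-supplied tool: the function name list_cached_io_entries is hypothetical, and the sketch assumes each entry has the form "device tunable=value,tunable=value" with no spaces inside the option list, as in the example lines. On Solaris, a POSIX awk such as nawk or /usr/xpg4/bin/awk may be needed in place of the legacy awk.

```shell
#!/bin/sh
# Hypothetical helper: print the device of every tunefstab entry that
# sets odm_cache_enable=1 or qio_cache_enable=1.
# Assumes the entry format "device  opt=val,opt=val" (no spaces in options).
list_cached_io_entries() {
    tunefstab="${1:-/etc/vx/tunefstab}"

    # Nothing to report if the file does not exist or is unreadable.
    [ -r "$tunefstab" ] || return 0

    awk '
        /^[ \t]*#/ { next }    # skip comment lines
        NF >= 2 && $2 ~ /(^|,)(odm|qio)_cache_enable=1(,|$)/ { print $1 }
    ' "$tunefstab"
}
```

Calling list_cached_io_entries with no argument reads /etc/vx/tunefstab. Empty output means that no entry enables either feature at mount time; it says nothing about file systems that are already mounted, which still need to be checked with vxtunefs.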

 

How to disable these features:

To disable Cached ODM and Cached QIO for a file system that is currently mounted, execute the command:

# vxtunefs -s -o qio_cache_enable=0

 

To prevent these features from being automatically enabled at mount time, edit the /etc/vx/tunefstab file and remove "odm_cache_enable=1" or "qio_cache_enable=1" from any line on which they appear, for example:

# grep cache_enable /etc/vx/tunefstab

/dev/vx/dsk/datadg/db03  read_pref_io=128k,odm_cache_enable=1

/dev/vx/dsk/datadg/db04  qio_cache_enable=1,read_pref_io=256k

 

After editing /etc/vx/tunefstab, the example lines would appear as follows:

/dev/vx/dsk/datadg/db03  read_pref_io=128k

/dev/vx/dsk/datadg/db04  read_pref_io=256k
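
Where many entries need the same change, the edit can be sketched as a small filter. This is an illustrative sketch only: the function name strip_cached_io_tunables is hypothetical, it assumes the "device tunable=value,tunable=value" entry format shown in the examples, and it assumes that an entry left with no tunables at all can be dropped. It writes the result to standard output and does not modify the input file.

```shell
#!/bin/sh
# Hypothetical helper: print a copy of a tunefstab file with
# odm_cache_enable=1 and qio_cache_enable=1 removed from every entry.
# Assumes the entry format "device  opt=val,opt=val" (no spaces in options).
strip_cached_io_tunables() {
    awk '
        # Pass comments, blank lines, and single-field lines through unchanged.
        /^[ \t]*#/ || NF < 2 { print; next }
        {
            n = split($2, opts, ",")
            kept = ""
            changed = 0
            for (i = 1; i <= n; i++) {
                if (opts[i] == "odm_cache_enable=1" || opts[i] == "qio_cache_enable=1") {
                    changed = 1
                    continue
                }
                kept = (kept == "" ? opts[i] : kept "," opts[i])
            }
            if (!changed)
                print                   # untouched entry: keep the line as written
            else if (kept != "")
                print $1 "  " kept      # rewritten entry without the cache tunables
            # Assumption: an entry left with no tunables is dropped.
        }
    ' "$1"
}
```

A cautious way to use it is to write the output to a scratch file, for example strip_cached_io_tunables /etc/vx/tunefstab > /tmp/tunefstab.new, compare the two files with diff, and only then replace the original after saving a backup copy.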

 

Is there a patch available:

Veritas are developing patches for this issue. If you require a patch, please contact Veritas Technical Support.

 

 

Applies To

Impacted product releases: 

Storage Foundation (SF) 4.x, 5.x & 6.x

Storage Foundation Cluster File System (SFCFS) 4.x, 5.x & 6.x

Storage Foundation for Oracle RAC (SFRAC) 4.x, 5.x & 6.x

Storage Foundation for Oracle (SFORA) 4.x, 5.x & 6.x

 

The issue only occurs when using the Cached QIO or Cached ODM features of VERITAS File System (VxFS). 

Cached ODM was first introduced in the 5.1 release.

 

Impacted operating system platforms:

AIX and Solaris (SPARC and x86).

 

Issue/Introduction

Veritas have identified a rare condition that can corrupt a page of data in memory, which may then be written to disk, when using either the VxFS Cached Oracle Disk Manager (CODM) or Cached QuickIO (CQIO) feature on the Solaris or AIX operating system platforms. It is extremely unlikely that a customer has encountered this issue, and so far Veritas are unaware of any instance of it being reported by a customer. Furthermore, the issue can occur only if the system is under severe page cache pressure.

Cached ODM (Oracle Disk Manager)

For the issue to occur with Cached ODM, specific conditions are required. The issue does not exist in Cached ODM if the Oracle block size is equal to or a multiple of the platform PAGESIZE, with the following exceptions:

1. If Cached ODM is enabled for the Oracle database transaction (redo) logs, the issue still exists regardless of the Oracle block size. In this case, however, only the redo log can become corrupted.

2. If Oracle files are accessed concurrently through Cached ODM and through POSIX system calls performing file system buffered read I/O (that is, buffered read() system calls) outside the normal Oracle instance I/O (that is, ODM I/O), corruption can occur. For example:

a. Using tar to back up the database files while the Oracle instance is updating the database, with Cached ODM enabled for the database files, can corrupt the database files.

b. Running replication tools such as Oracle GoldenGate while using Cached ODM for the redo logs can corrupt the redo logs.

The default Oracle block size is 8 KB, and it can be configured as small as 2 KB. The default block size is usually the one chosen by database administrators.

The Solaris SPARC PAGESIZE is 8 KB.

The Solaris x86 PAGESIZE is 4 KB.

The AIX PAGESIZE is 4 KB.
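
The block size condition described above can be checked with simple arithmetic. The following is a minimal sketch, assuming a POSIX shell and sizes given in bytes; the function name block_size_is_page_multiple is hypothetical. It tests only whether the Oracle block size is equal to or a multiple of PAGESIZE and does not account for the two exceptions listed above (Cached ODM on redo logs, and concurrent buffered reads of the same files).

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if the Oracle block size is
# equal to or a multiple of the page size. Both values are in bytes.
# If no page size is given, the platform value from getconf is used.
block_size_is_page_multiple() {
    block_size="$1"
    page_size="${2:-$(getconf PAGESIZE)}"

    [ $((block_size % page_size)) -eq 0 ]
}
```

For example, block_size_is_page_multiple 8192 4096 succeeds (an 8 KB block on a 4 KB page platform), while block_size_is_page_multiple 2048 4096 fails, which is the case in which the article says the issue can occur with Cached ODM on database files.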