Asynchronous, sequential writes that extend the size of a file create dirty memory pages. The new extents can be allocated later, when background processing flushes the dirty pages to disk, rather than in the same context as the write I/O. The delayed allocation (dalloc) feature does this: it defers extent allocation until the dirty pages are flushed, which allows VxFS to accept a write to a file before the allocation has happened.
When the dalloc feature is enabled on VxFS, an internal miscalculation can prevent part of the in-memory data from reaching the disk. The result is a lost write, which causes silent data corruption.
On Solaris, the corruption has no visible symptoms. It can be traced only if backups of the files are available for comparison.
On AIX and Linux, a file generated through asynchronous, sequential write operations appears intact if it is read immediately after it is created. However, if the file is read again after a significant time gap, part of the file is replaced with zeros.
This is because an asynchronous write is a buffered write. The written data is held in the kernel page cache, so reads return the correct data until those pages are reused. Once the pages are reused for other data, the original data is lost, and the file has a hole in place of it. Any application reading this part of the file receives zeros.
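As a rough illustration of the symptom, the following sketch tests whether a byte range of a file reads back entirely as zeros. The function name range_is_zero is hypothetical and not part of VxFS. A zero-filled range is only a hint, since a file can legitimately contain zeros, and a range that extends past the end of the file is checked only up to the end of the file.

```shell
#!/bin/sh
# range_is_zero FILE OFFSET LENGTH   (hypothetical helper, not a VxFS tool)
# Exit status 0 if every byte read from [OFFSET, OFFSET+LENGTH) is zero,
# 1 if at least one byte is non-zero, 2 if the file cannot be read.
range_is_zero() {
    file=$1 offset=$2 length=$3
    [ -r "$file" ] || return 2
    # dd copies LENGTH bytes starting at OFFSET. tr deletes the NUL bytes,
    # so wc counts only the non-zero bytes that remain.
    # bs=1 is portable but slow for very large ranges.
    nonzero=$(dd if="$file" bs=1 skip="$offset" count="$length" 2>/dev/null |
              tr -d '\000' | wc -c | tr -d ' ')
    [ "$nonzero" -eq 0 ]
}
```

For example, `range_is_zero /avol01/datafile 1344274432 131072` would test the range of a suspected hole, where the offset and length are taken from the extent map of the file.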
This happens on Storage Foundation, with VxFS 6.0.5.100, 6.1, 6.2 and subsequent hot fixes, where dalloc is enabled.
Cluster File Systems (CFS) are not at risk.
1. When the problem occurs, the extent map (space allocation map) of the file will show a hole in it. The extent map of the file can be checked with fsmap.
/opt/VRTS/bin/fsmap -a <file>
Example:
# /opt/VRTS/bin/fsmap -a /avol01/datafile
Volume   Extent Type   File Offset   Extent Size   File
....
avol01   Data          1310720000    8626176       /avol01/datafile
avol01   Data          1319346176    24928256      /avol01/datafile
-        <Hole>        1344274432    131072        /avol01/datafile
avol01   Data          1344405504    8626176       /avol01/datafile
avol01   Data          1353031680    24928256      /avol01/datafile
2. Using the file system debugger fsdb.
First, get the inode number of the file, using the command ls -li.
# ls -li /avol01/datafile
2214 -rw-r--r--. 1 root root 1377959936 Aug 23 16:18 /avol01/datafile
Check the extent map with fsdb.
# echo '2214i.mapall' | /opt/VRTS/bin/fsdb /dev/vx/rdsk/adg/avol01
offset       device   block   length   plength
....
1310720000   0        24344   8424     8424
1319346176   0        32768   24344    24344
1344274432   -        HOLE    128
1344405504   0        57112   8424     8424
1353031680   0        65536   24344    24344
When an application reads the part of the file that corresponds to the hole in the extent map, VxFS returns zeros for that part of the file. This is by design.
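The fsmap check above can be scripted. The following is a minimal sketch, assuming the fsmap -a output uses whitespace-separated columns and reports a hole with <Hole> in the Extent Type column. The function name list_holes is hypothetical. File names that contain spaces are not handled.

```shell
#!/bin/sh
# list_holes (hypothetical helper): read `fsmap -a` output on stdin and
# print "offset size file" for every extent reported as a hole.
# Assumed column layout: Volume, Extent Type, File Offset, Extent Size, File.
list_holes() {
    awk '$2 == "<Hole>" { print $3, $4, $5 }'
}
```

It would be used as `/opt/VRTS/bin/fsmap -a /avol01/datafile | list_holes`. A hole is not proof of a lost write on its own, because sparse files contain holes by design, so any hit still has to be judged against how the file was written.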
To disable dalloc on a mounted file system, set the tunable with vxtunefs:
# vxtunefs -s -o dalloc_enable=0 $MOUNT_POINT
Example:
# vxtunefs -s -o dalloc_enable=0 /testmnt1
To have the setting applied each time the file system is mounted, add an entry for the block device to /etc/vx/tunefstab:
/dev/vx/dsk/$DISKGROUP/$VOLUME dalloc_enable=0
Example:
# cat /etc/vx/tunefstab
/dev/vx/dsk/testdg/testvol1 dalloc_enable=0
To apply the setting to all file systems, use a system_default entry instead of a block device entry:
# cat /etc/vx/tunefstab
system_default dalloc_enable=0
Verify the current value:
# vxtunefs /testmnt1 | grep "dalloc_enable"
dalloc_enable = 0
# vxtunefs /testmnt1 |grep "dalloc_enable"
dalloc_enable = 1
# vxtunefs -s -o dalloc_enable=0 /testmnt1
UX:vxfs vxtunefs: INFO: V-3-22525: Parameters successfully set for /testmnt1
# vxtunefs /testmnt1 |grep "dalloc_enable"
dalloc_enable = 0
# cat /etc/vx/tunefstab
/dev/vx/dsk/testdg/testvol1 dalloc_enable=0
# mount -F vxfs /dev/vx/dsk/testdg/testvol1 /testmnt1/
# vxtunefs /testmnt1/ | grep "dalloc_enable"
dalloc_enable = 0
Example using multiple file systems:
# mount -F vxfs /dev/vx/dsk/testdg/testvol1 /testmnt1/
# vxtunefs /testmnt1/ | grep "dalloc_enable"
dalloc_enable = 1
# mount -F vxfs /dev/vx/dsk/testdg/testvol2 /testmnt2
# vxtunefs /testmnt2/ | grep "dalloc_enable"
dalloc_enable = 1
# cat /etc/vx/tunefstab
/dev/vx/dsk/testdg/testvol1 dalloc_enable=0
/dev/vx/dsk/testdg/testvol2 dalloc_enable=0
# umount /testmnt1/
# umount /testmnt2/
# mount -F vxfs /dev/vx/dsk/testdg/testvol1 /testmnt1/
# mount -F vxfs /dev/vx/dsk/testdg/testvol2 /testmnt2/
# vxtunefs /testmnt1/ | grep "dalloc_enable"
dalloc_enable = 0
# vxtunefs /testmnt2/ | grep "dalloc_enable"
dalloc_enable = 0
This is helpful if you need to apply the tuning system-wide. If the system_default is specified instead of block_device, the tunable setting will be applied to all of the block devices when mounting the file systems.
1. Create the /etc/vx/tunefstab file if it does not already exist.
# touch /etc/vx/tunefstab
# ls -l /etc/vx/tunefstab
-rw-r--r-- 1 root system 75 Aug 21 17:11 /etc/vx/tunefstab
system_default dalloc_enable=0
# vxtunefs /testmnt1/ | grep "dalloc_enable"
dalloc_enable = 0
# cat /etc/vx/tunefstab
system_default dalloc_enable=0
# mount -F vxfs /dev/vx/dsk/testdg/testvol1 /testmnt1/
# mount -F vxfs /dev/vx/dsk/testdg/testvol2 /testmnt2/
# vxtunefs /testmnt1/ | grep "dalloc_enable"
dalloc_enable = 0
# vxtunefs /testmnt2/ | grep "dalloc_enable"
dalloc_enable = 0
Note: The system_default setting is overridden if a block_device is also explicitly specified.
For example, if we want to change the system-wide default for dalloc_enable to 0, but do not want to set it for /dev/vx/dsk/testdg/testvol3, then we can achieve this as follows:
# cat /etc/vx/tunefstab
system_default dalloc_enable=0
/dev/vx/dsk/testdg/testvol3 dalloc_enable=1
# mount -F vxfs /dev/vx/dsk/testdg/testvol1 /testmnt1/
# mount -F vxfs /dev/vx/dsk/testdg/testvol2 /testmnt2/
# mount -F vxfs /dev/vx/dsk/testdg/testvol3 /testmnt3/
# vxtunefs /testmnt1/ | grep "dalloc_enable"
dalloc_enable = 0
# vxtunefs /testmnt2/ | grep "dalloc_enable"
dalloc_enable = 0
# vxtunefs /testmnt3/ | grep "dalloc_enable"
dalloc_enable = 1
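The override rule can be illustrated with a small sketch that reports which dalloc_enable value a tunefstab file would give a particular block device. The function name effective_dalloc is hypothetical. The sketch assumes each entry is a device name (or system_default) followed by a comma-separated list of tunable=value pairs, and it only reads the file; it does not query the mounted file system.

```shell
#!/bin/sh
# effective_dalloc TUNEFSTAB DEVICE   (hypothetical helper)
# Print the dalloc_enable value that applies to DEVICE: an entry for the
# device itself wins over a system_default entry. Print nothing if neither
# entry sets dalloc_enable.
effective_dalloc() {
    awk -v dev="$2" '
        # Skip blank lines and comment lines.
        NF == 0 || $1 ~ /^#/ { next }
        $1 == dev || $1 == "system_default" {
            # Assumption: field 2 is a comma-separated list of tunables.
            n = split($2, opts, ",")
            for (i = 1; i <= n; i++) {
                if (opts[i] ~ /^dalloc_enable=/) {
                    sub(/^dalloc_enable=/, "", opts[i])
                    if ($1 == dev) specific = opts[i]
                    else fallback = opts[i]
                }
            }
        }
        END {
            if (specific != "") print specific
            else if (fallback != "") print fallback
        }
    ' "$1"
}
```

With the tunefstab shown above, `effective_dalloc /etc/vx/tunefstab /dev/vx/dsk/testdg/testvol3` prints 1, while the same call for testvol1 prints 0. The value reported by vxtunefs for the mounted file system remains the authoritative check.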
Due to a defect, some in-memory data is not flushed to the disk when the dalloc feature is used.
The problem is fixed in the following patch and hotfix releases.
| Platform | Affected VxFS Versions | Required Public GA VxFS patches | Private Hot-fixes |
|---|---|---|---|
| Solaris 10 | 6.0.5.100 | 6.0.5.400 | 6.0.5.105 |
| Solaris 10 | 6.1 | 6.1.1.400 (to be released) | 6.1.1.103 |
| Solaris 10 | 6.2 | 6.2.1 | 6.2.0.101 |
| Solaris 11 | 6.0.5.100 | 6.0.5.400 | 6.0.5.102 |
| Solaris 11 | 6.1 | 6.1.1.400 (to be released) | 6.1.1.101 |
| Solaris 11 | 6.2 | 6.2.1 | 6.2.0.101 |
| Linux | 6.0.5.100 | 6.0.5.400 | 6.0.5.206 (RHEL6 only) |
| Linux | 6.1 | 6.1.1.400 (to be released) | 6.1.1.009 |
| Linux | 6.2 | 6.2.1 | - |
| AIX | 6.0.5.100 | 6.0.5.400 | 6.0.5.101 |
| AIX | 6.1 | 6.1.1.400 (to be released); 6.2 (fix is provided in VxFS 6.2 on AIX) | - |
| HP-UX 11.31 | 6.0.5.100 | 6.0.5.400 | - |
The public patches can be downloaded from the Veritas Services and Operations Readiness Tools (SORT) website.
https://sort.veritas.com/patch/
Note: The Veritas Storage Foundation High Availability (SFHA) patch version can differ from the Veritas File System (VRTSvxfs) version. Sometimes the VxFS patch included in an SFHA patch set has a lower version than the SFHA patch set itself. For example, in the SFHA patch version 6.0.5.300 for the RHEL 6.7 platform, the VxFS patch version is actually 6.0.5.200. If an SFHA patch set is installed, check the actual version of the included VxFS patch.
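Because the VxFS version, not the SFHA version, decides whether the fix is present, a numeric comparison of dotted version strings can help. The sketch below is a minimal example; the function name version_ge is hypothetical, and it assumes both arguments are purely numeric dotted versions, obtained however your platform's package tool reports the VRTSvxfs version.

```shell
#!/bin/sh
# version_ge A B   (hypothetical helper)
# Exit status 0 if dotted version A is greater than or equal to B,
# comparing component by component as decimal numbers.
version_ge() {
    a=$1 b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        # Take the leading component of each version string.
        x=${a%%.*} y=${b%%.*}
        # Drop the consumed component and its dot.
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
        # Strip leading zeros (for example 009); a missing or all-zero
        # component counts as 0.
        x=$(printf '%s' "${x:-0}" | sed 's/^0*//'); x=${x:-0}
        y=$(printf '%s' "${y:-0}" | sed 's/^0*//'); y=${y:-0}
        [ "$x" -gt "$y" ] && return 0
        [ "$x" -lt "$y" ] && return 1
    done
    return 0
}
```

For the example in the note, `version_ge 6.0.5.200 6.0.5.400` fails, which shows that a system with the SFHA 6.0.5.300 patch set on RHEL 6.7 does not yet have the 6.0.5.400 VxFS patch.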