Introduction
Using vxtrace to determine I/O characteristics
Sequential I/O
Random I/O
Determining the current stripe unit size
Matching the stripe size to the file system allocation unit size
Striping data across multiple spindles (physical disks) allows I/O to be processed in parallel, which can increase performance. However, the traditional advantages of software-based stripe sets are sometimes outweighed by changes and improvements to modern storage hardware. Today, disk arrays typically provide their own hardware-based striping, which should be taken into consideration to avoid layering multiple RAID implementations that may conflict with each other. Different applications, such as databases or file servers, have dissimilar I/O characteristics that are affected by striping in varying ways.
In theory, as more spindles are added to a stripe set, more I/O is processed in parallel, potentially improving performance. However, the increase in parallel processing must be weighed against the increasing amount of movement that is the result of fragmenting I/O across multiple columns. As columns are added, one eventually encounters a "diminishing return" where adding further columns no longer provides a significant improvement in I/O, or is not worth the increased risk of a hardware failure. Every spindle that is added to a stripe set increases the chance that a single hardware failure will cause the entire volume to fail.
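The effect of column count on failure risk can be illustrated with a small calculation. The following is only a sketch: it assumes that disks fail independently and with the same probability, which real hardware does not guarantee, and the function name and the 2% figure are hypothetical values chosen for illustration.

```python
def stripe_failure_probability(per_disk_failure_prob: float, columns: int) -> float:
    """Probability that at least one disk in a stripe set fails.

    Assumption (not from the article): failures are independent and every
    disk has the same failure probability over the period of interest.
    A non-redundant striped volume fails if any one of its columns fails.
    """
    if not 0.0 <= per_disk_failure_prob <= 1.0:
        raise ValueError("per_disk_failure_prob must be between 0 and 1")
    if columns < 1:
        raise ValueError("columns must be at least 1")
    # P(at least one failure) = 1 - P(no disk fails)
    return 1.0 - (1.0 - per_disk_failure_prob) ** columns


if __name__ == "__main__":
    # Hypothetical per-disk failure probability of 2% over some period.
    for columns in (1, 2, 4, 8, 16):
        risk = stripe_failure_probability(0.02, columns)
        print(f"{columns:2d} columns: {risk:.4f}")
```

Under these assumptions the risk grows with every added column, while the performance gain from each additional column tends to shrink.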
Note: Do not assume that a larger number of columns will provide better performance than a smaller number, or that a certain stripe unit size will have superior performance when compared to a different stripe unit size, or even that a striped volume will actually have superior performance when compared to a concatenated volume.
There are too many variables involved in performance for such assumptions to be true for all cases and there is no substitute for testing. Before putting a volume into production, use benchmarking tools to test I/O performance, in different layouts, in a manner that is representative of the intended production environment. This is the only reliable method to determine which layout provides the best performance.
Vxtrace can be used to analyze the characteristics of I/O that is being written to a volume (Figure 1). This is useful for distinguishing random I/O from sequential I/O, the typical length (in sectors) of each I/O transaction, and how the I/O is being fragmented across multiple columns. The optimal stripe unit size ultimately depends on the characteristics of the I/O that is generated by the application.
Finding the typical I/O length is important for determining an appropriate stripe unit size.
Note: The vxtrace excerpts in this article are very brief to improve readability. Reviewing a larger sample is recommended in order to include data that is representative of the production environment.
Figure 1 - Using vxtrace to gather information about I/O to a volume
Syntax: vxtrace -t
Example, with typical output:
# vxtrace -t 10 -g datadg -o dev,disk engvol > /tmp/vxtrace.engvol
Figure 2 shows an example of sequential I/O, as observed by vxtrace. Notice that the starting block for each I/O appears to increment slightly from the previous operation. Also notice that the I/O length is usually 384 sectors.
For sequential I/O, optimal performance is generally achieved if I/O transactions are more frequently spread across multiple columns. This can be accomplished by using a stripe unit size that is smaller than the typical I/O length.
Figure 2 - An example of vxtrace output showing sequential I/O
53595 START write vdev vol1 block 5785984 len 384 concurrency 1 pid 5855
53596 START write disk disk_5 op 53598 block 1994368 len 128
53597 START write disk disk_3 op 53598 block 1994496 len 128
53598 START write disk disk_4 op 53598 block 1994496 len 128
53595 END write vdev vol1 block 5785984 len 384
53596 END write disk disk_5 op 53598 block 1994368 len 128
53597 END write disk disk_3 op 53598 block 1994496 len 128
53598 END write disk disk_4 op 53598 block 1994496 len 128
53603 START write vdev vol1 block 5786752 len 384 concurrency 1 pid 5855
53604 START write disk disk_5 op 53606 block 1994624 len 128
53605 START write disk disk_3 op 53606 block 1994752 len 128
53606 START write disk disk_4 op 53606 block 1994752 len 128
53603 END write vdev vol1 block 5786368 len 384
53604 END write disk disk_5 op 53602 block 1994496 len 128
53605 END write disk disk_3 op 53602 block 1994624 len 128
53606 END write disk disk_4 op 53602 block 1994624 len 128
53611 START write vdev vol1 block 5786752 len 384 concurrency 1 pid 5855
53612 START write disk disk_5 op 53606 block 1994624 len 128
53613 START write disk disk_3 op 53606 block 1994752 len 128
53614 START write disk disk_4 op 53606 block 1994752 len 128
53615 START write vdev vol1 block 5787136 len 64 concurrency 2 pid 5855
53616 START write disk disk_5 op 53610 block 1994752 len 64
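For a larger trace, the typical I/O length and the spacing between starting blocks can be summarized with a short script rather than read by eye. The sketch below assumes that each volume-level record in the saved trace file follows the "START ... vdev ... block N len M" layout shown in Figure 2; the function names are hypothetical, and the pattern may need adjusting for other vxtrace versions or options.

```python
import re
from collections import Counter

# Matches volume-level (vdev) START records in the layout seen in Figure 2.
# Assumption: one record per line, with "block" and "len" fields in this order.
VDEV_START = re.compile(
    r"^\s*\d+\s+START\s+(?:read|write)\s+vdev\s+\S+\s+block\s+(\d+)\s+len\s+(\d+)"
)


def parse_vdev_starts(lines):
    """Return a list of (starting_block, length) for each vdev START record."""
    ios = []
    for line in lines:
        match = VDEV_START.match(line)
        if match:
            ios.append((int(match.group(1)), int(match.group(2))))
    return ios


def typical_length(ios):
    """Return the most frequently occurring I/O length, in sectors."""
    if not ios:
        raise ValueError("no vdev START records found")
    return Counter(length for _, length in ios).most_common(1)[0][0]


def start_block_deltas(ios):
    """Return the difference between successive starting blocks.

    Small, mostly non-negative deltas suggest sequential I/O;
    large, erratic deltas suggest random I/O.
    """
    return [later[0] - earlier[0] for earlier, later in zip(ios, ios[1:])]


# A few records copied from Figure 2.
SAMPLE = [
    "53595 START write vdev vol1 block 5785984 len 384 concurrency 1 pid 5855",
    "53596 START write disk disk_5 op 53598 block 1994368 len 128",
    "53603 START write vdev vol1 block 5786752 len 384 concurrency 1 pid 5855",
    "53611 START write vdev vol1 block 5786752 len 384 concurrency 1 pid 5855",
    "53615 START write vdev vol1 block 5787136 len 64 concurrency 2 pid 5855",
]

if __name__ == "__main__":
    ios = parse_vdev_starts(SAMPLE)
    print("typical length:", typical_length(ios))
    print("start block deltas:", start_block_deltas(ios))
```

To analyze a real capture, the lines of the saved trace file (for example /tmp/vxtrace.engvol from Figure 1) could be passed to parse_vdev_starts in place of SAMPLE.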
Figure 3 shows an example of random I/O. Notice that the starting block varies significantly. The I/O lengths also vary in this sample, but tend to be lower than those in Figure 2.
For random I/O, optimal performance is generally achieved by containing each I/O transaction within a single column. To accomplish this, the stripe unit size should be larger than the average I/O size.
Figure 3 - An example of vxtrace output showing random I/O
43024 START write vdev vol1 block 33778 len 94 concurrency 1 pid 2202
43025 START write disk disk_5 op 43024 block 77042 len 14
43026 START write disk disk_3 op 43024 block 77056 len 80
43025 END write disk disk_5 op 43024 block 77042 len 14 time 3
43026 END write disk disk_3 op 43024 block 77056 len 80 time 3
43024 END write vdev vol1 op 43024 block 33778 len 94 time 3
43027 START write vdev vol1 block 1104 len 1 concurrency 1 pid 2203
43028 START write disk disk_5 op 43027 block 66128 len 1
43028 END write disk disk_5 op 43027 block 66128 len 1 time 2
43027 END write vdev vol1 op 43027 block 1104 len 1 time 2
43028 START write vdev vol1 block 1631 len 59 concurrency 1 pid 2202
43029 START write disk disk_3 op 43037 block 66399 len 33
43030 START write disk disk_4 op 43037 block 66304 len 26
43029 END write disk disk_3 op 43037 block 66399 len 33 time 3
43030 END write disk disk_4 op 43037 block 66304 len 26 time 3
43028 END write vdev vol1 op 43037 block 1631 len 59 time 3
43040 START write vdev vol1 block 36080 len 16 concurrency 1 pid 2203
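Whether a given I/O stays within one column depends on where it starts relative to a stripe unit boundary, not only on its length. The sketch below estimates, for a candidate stripe unit size, what fraction of the volume-level I/Os from Figure 3 would fall inside a single stripe unit. The 128-sector stripe unit is an assumption inferred from the way the I/Os in the figure are split across disks, and the function names are hypothetical.

```python
def stripe_units_touched(block: int, length: int, stripe_unit: int) -> int:
    """Number of stripe units an I/O covers on the volume.

    block, length and stripe_unit must all use the same unit (sectors here).
    An I/O that touches exactly one stripe unit is served by a single column.
    """
    if length < 1 or stripe_unit < 1:
        raise ValueError("length and stripe_unit must be positive")
    first_unit = block // stripe_unit
    last_unit = (block + length - 1) // stripe_unit
    return last_unit - first_unit + 1


def contained_fraction(ios, stripe_unit: int) -> float:
    """Fraction of I/Os that fit entirely within one stripe unit."""
    if not ios:
        raise ValueError("no I/Os supplied")
    contained = sum(
        1 for block, length in ios
        if stripe_units_touched(block, length, stripe_unit) == 1
    )
    return contained / len(ios)


# (starting block, length) of the volume-level writes in Figure 3.
RANDOM_IOS = [(33778, 94), (1104, 1), (1631, 59), (36080, 16)]

if __name__ == "__main__":
    # 128 sectors is inferred from the figure; 256 and 512 are candidates.
    for stripe_unit in (128, 256, 512):
        fraction = contained_fraction(RANDOM_IOS, stripe_unit)
        print(f"stripe unit {stripe_unit}: {fraction:.2f} contained")
```

A sample this small is not representative. The same calculation applied to a longer trace gives a rough indication of how a larger stripe unit might reduce fragmentation, but it does not replace benchmarking.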
Use vxprint to determine the current stripe unit size (Figure 4).
Figure 4 shows volume "mgmtvol" with the following characteristics:
Figure 4
Syntax: vxprint -htv
Example, with typical output:
# vxprint -htv mgmtvol
A best practice is to set the stripe width to a multiple of the filesystem allocation unit size. For example, if the filesystem block size is 4KB, a stripe width of 384 would be a valid multiple because the quotient of 384 and 4 is an integer, provided that both values are expressed in the same unit. Recall that the stripe width is the product of the stripe unit size and the number of columns.
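The arithmetic above can be checked with a few lines of code. This is a minimal sketch with hypothetical function names; it assumes that the stripe unit size and the filesystem block size have already been converted to the same unit (kilobytes in this example), which the commands in this article do not do automatically.

```python
def stripe_width(stripe_unit_size: int, columns: int) -> int:
    """Stripe width = stripe unit size multiplied by the number of columns."""
    if stripe_unit_size < 1 or columns < 1:
        raise ValueError("stripe_unit_size and columns must be positive")
    return stripe_unit_size * columns


def is_multiple_of_block_size(width: int, fs_block_size: int) -> bool:
    """True if the stripe width is a whole multiple of the filesystem block size.

    Both arguments must be expressed in the same unit.
    """
    if fs_block_size < 1:
        raise ValueError("fs_block_size must be positive")
    return width % fs_block_size == 0


if __name__ == "__main__":
    # Hypothetical layout: a 128KB stripe unit across 3 columns, 4KB block size.
    width = stripe_width(128, 3)
    print("stripe width:", width)
    print("multiple of 4KB block size:", is_multiple_of_block_size(width, 4))
```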
Use fstyp to determine the filesystem block size (Figure 5).
Figure 5
Syntax: fstyp -t|F vxfs -v
Example, with typical output:
# fstyp -t vxfs -v /dev/vx/rdsk/datadg/mgmtvol
This article is part of a set on troubleshooting volume performance. To start at the beginning, see: https://www.veritas.com/docs/000087750