Performing Dynamic LUN Resizing with the vxdisk resize command can cause data corruption on RHEL 8.5

Article ID: 100053220

Description

Error Message

The following is reported in the Oracle alert log:

Corrupt block seq: 6996 blocknum=28.
Bad header found during reread
Data in bad block - seq:64. bno:0. time:257
beg:0 cks:0
calculated check value: 22617

2022-04-29T23:20:07.326575+02:00
ALTER SYSTEM ARCHIVE LOG
2022-04-29T23:20:07.337007+02:00
Thread 1 cannot allocate new log, sequence 7247
Checkpoint not complete
Current log# 1 seq# 7246 mem# 0: /xxxx/u50/oradata/redoT1_G1_M1.log
Current log# 1 seq# 7246 mem# 1: /xxxx/u51/oradata/redoT1_G1_M2.log
2022-04-29T23:20:08.623999+02:00
Thread 1 advanced to log sequence 7247 (LGWR switch), current SCN: 16621384
Current log# 2 seq# 7247 mem# 0: /xxxx/u50/oradata/redoT1_G2_M1.log
Current log# 2 seq# 7247 mem# 1: /xxxx/u51/oradata/redoT1_G2_M2.log
2022-04-29T23:20:08.648023+02:00
NET (PID:579889): Archived Log entry 7244 added for T-1.S-7246 ID 0x6165152a LAD:1
2022-04-29T23:20:10.374402+02:00

***
Corrupt block seq: 6996 blocknum=28.
Bad header found during backing up archived log
Data in bad block - seq:0. bno:0. time:0
beg:0 cks:0
calculated check value: 0

Reread of seq=6996, blocknum=28, file=/xxxx/u99/archlog/arch_DM2P_1_6996_1100604970.log, found different data

RMAN backups also fail:

RMAN-03002: failure of backup plus archivelog command at 05/02/2022 15:09:53
RMAN-06059: expected archived log not found, loss of archived log compromises recoverability
ORA-19625: error identifying file /xxxx/U99/ARCHLOG/ARCH-xxxx_8068_1100604970.log
ORA-27048: skgfifi: file header information is invalid
Additional information: 2

Errors similar to the following are written in the messages log:

Apr 28 21:44:40 chz1lhdbcnp01 kernel: handle_bad_sector: 22 callbacks suppressed
Apr 28 21:44:40 chz1lhdbcnp01 kernel: attempt to access beyond end of device#012VxDMP17: rw=0, want=109503553, limit=106954752
Apr 28 21:44:40 chz1lhdbcnp01 kernel: attempt to access beyond end of device#012VxDMP17: rw=0, want=109503809, limit=106954752
Apr 28 21:44:40 chz1lhdbcnp01 kernel: attempt to access beyond end of device#012VxDMP17: rw=0, want=109504065, limit=106954752
Apr 28 21:44:40 chz1lhdbcnp01 kernel: attempt to access beyond end of device#012VxDMP17: rw=0, want=109504321, limit=106954752
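To gauge how far past the end of the device these I/O requests reach, the want and limit values can be extracted from the messages log. The following is a minimal Python sketch, not part of the original article. It assumes both values are counted in 512-byte sectors; the function name find_overruns and the sample list are illustrative only.

```python
import re

# Matches the want/limit pair in "attempt to access beyond end of device" messages.
PATTERN = re.compile(
    r"attempt to access beyond end of device.*?want=(\d+), limit=(\d+)"
)

SECTOR_BYTES = 512  # assumption: want and limit are 512-byte sectors


def find_overruns(lines):
    """Return (want, limit, overshoot_in_sectors) for each matching kernel line."""
    results = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            want, limit = int(match.group(1)), int(match.group(2))
            if want > limit:
                results.append((want, limit, want - limit))
    return results


# One line copied from the messages log excerpt above.
sample = [
    "Apr 28 21:44:40 chz1lhdbcnp01 kernel: attempt to access beyond end of "
    "device#012VxDMP17: rw=0, want=109503553, limit=106954752",
]

for want, limit, over in find_overruns(sample):
    limit_gib = limit * SECTOR_BYTES / 2**30
    print(f"want={want} limit={limit} overshoot={over} sectors "
          f"(device limit is {limit_gib:.1f} GiB)")
```

A request whose want value exceeds limit targets a region the kernel still considers outside the device, which is consistent with the stale capacity described under Cause.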

Cause

After the disk resize, the OS fails to re-read the partition table and continues to use stale capacity metadata. In the fdisk output, the first line reports the old, incorrect disk capacity, while the geometry line and the partition entries reflect the correct new size. For example:

# fdisk -l

Disk /dev/sdb: 25 GiB, 26843545600 bytes, 52428800 sectors
Geometry: 16 heads, 81 sectors/track, 64724 cylinders
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 4194304 bytes
Disklabel type: sun

Device     Start      End  Sectors Size Id Type       Flags
/dev/sdb3      0 83882303 83882304  40G  5 Whole disk    u
/dev/sdb8      0 83882303 83882304  40G  f unknown       u
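The mismatch in the output above can be checked arithmetically: the geometry line gives 16 × 81 × 64724 = 83882304 sectors (about 40 GiB), while the first line still reports 52428800 sectors (25 GiB). The following is a minimal Python sketch of that comparison, not part of the original article. It assumes fdisk -l output in the layout shown above; the function name capacity_mismatch and the returned keys are illustrative.

```python
import re

# Sample copied from the fdisk output above.
FDISK_SAMPLE = """\
Disk /dev/sdb: 25 GiB, 26843545600 bytes, 52428800 sectors
Geometry: 16 heads, 81 sectors/track, 64724 cylinders
Units: sectors of 1 * 512 = 512 bytes
Disklabel type: sun

Device     Start      End  Sectors Size Id Type       Flags
/dev/sdb3      0 83882303 83882304  40G  5 Whole disk    u
/dev/sdb8      0 83882303 83882304  40G  f unknown       u
"""


def capacity_mismatch(fdisk_text):
    """Compare the reported disk size with the geometry and partition ends."""
    disk = re.search(r"^Disk \S+:.*,\s*(\d+) sectors", fdisk_text, re.M)
    geom = re.search(
        r"^Geometry: (\d+) heads, (\d+) sectors/track, (\d+) cylinders",
        fdisk_text, re.M,
    )
    # Partition rows: device, start, end, sectors
    ends = [
        int(m.group(1))
        for m in re.finditer(r"^/dev/\S+\s+\d+\s+(\d+)\s+\d+", fdisk_text, re.M)
    ]
    reported = int(disk.group(1))
    geometry = int(geom.group(1)) * int(geom.group(2)) * int(geom.group(3))
    max_end = max(ends) if ends else 0
    return {
        "reported_sectors": reported,
        "geometry_sectors": geometry,
        "max_partition_end": max_end,
        # Stale if a partition ends at or beyond the reported capacity.
        "stale": max_end >= reported,
    }


print(capacity_mismatch(FDISK_SAMPLE))
```

If stale is true, the label describes a larger disk than the capacity reported on the first line, which matches the symptom described here.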

Resolution

Red Hat has acknowledged that this is a regression in RHEL 8.5, which is fixed in RHEL 8.6. See the following Red Hat technote:

https://access.redhat.com/solutions/6959453

The following private Red Hat Bugzilla (RHBZ) entry also covers this issue:

2088479 - Change in disk capacity change trigger in RHEL8.5

Issue/Introduction

Corruption is reported in an Oracle Database after the file system has been resized using vxdisk resize and vxresize.