The Problem
I recently ran into an issue when performing an RMAN duplicate on a server with a fresh installation of OEL 6.4 x86-64 using a brand new SAN (NetApp Data ONTAP 8.2). The database version is Enterprise Edition 11.2.0.3.
The datafiles restored without an issue, but when we got to the archived redo logs I hit an ORA-00600. Specifically, it was this error:
ORA-00600: internal error code, arguments: [kfk_verify_io9]
At first I chalked it up as a fluke, so I ran the RMAN duplicate again. The datafiles restored just fine, but once again I hit the ORA-00600 error when I got to the archived redo logs. A quick trip to MOS turned up four documents that pointed to a similar issue, though none was definitive. I opened an SR with Oracle Support to get them looking into the issue while I continued to troubleshoot. Those MOS documents mention 4K sector size disks, and around the same time Oracle Support came back asking about the same thing. Let’s see what we find…
Observations
The logical and physical block sizes of the disk behind that particular disk group are as follows:
$ cat /sys/block/dm-19/queue/logical_block_size
512
$ cat /sys/block/dm-19/queue/physical_block_size
4096
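The same comparison can be wrapped in a small shell function so it is easy to repeat for each device. This is only a sketch: the function name `check_sector_sizes` and its optional second argument (an alternative sysfs directory, handy for trying it out away from real hardware) are my own additions, not part of any Oracle or Linux tool.

```shell
# Compare the logical and physical block sizes reported for one block device.
# $1: device name as it appears under the sysfs block directory (e.g. dm-19)
# $2: sysfs block directory (hypothetical convenience argument, defaults to /sys/block)
check_sector_sizes() {
    dev=$1
    sysfs=${2:-/sys/block}
    queue="$sysfs/$dev/queue"

    # Bail out if the kernel does not expose both attributes for this device.
    if [ ! -r "$queue/logical_block_size" ] || [ ! -r "$queue/physical_block_size" ]; then
        echo "$dev: block size attributes not readable" >&2
        return 2
    fi

    logical=$(cat "$queue/logical_block_size")
    physical=$(cat "$queue/physical_block_size")

    if [ "$logical" = "$physical" ]; then
        echo "$dev: logical=$logical physical=$physical (match)"
    else
        # Differing sizes are the situation described in this post.
        echo "$dev: logical=$logical physical=$physical (MISMATCH)"
        return 1
    fi
}
```

Calling `check_sector_sizes dm-19` on the server above would report the 512/4096 mismatch and return a non-zero status.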
Checking the ASM disk shows us a 4K sector size.
SQL> select name, sector_size from v$asm_disk;

NAME                           SECTOR_SIZE
------------------------------ -----------
DATA01                                4096
While waiting for Oracle Support to come back with information, I found MOS Doc # 1500460.1. After reading the document, I found two possible workarounds that would work for us.
1) We can set the ASMLIB config file to use the logical block size of the disk instead of the physical block size.
$ /usr/sbin/oracleasm configure -b
or
2) On the SAN itself we issue the NetApp command to disable reporting the physical block size of the LUN.
> lun set report-physical-size <path> disable
A short while later Oracle Support came back with information that there are some known bugs with 4K sector size disks when using ASM with Oracle 11.2.0.3. Their suggestion was a straightforward “make sure the disk is both logically and physically presented as 512 bytes”.
Solution
We go with option 1 and make the change in the ASMLib config file (located in /etc/sysconfig/oracleasm). Let’s do that now.
1) Shut down ASM.
$ sqlplus "/ as sysasm"
SQL> shutdown normal;
ASM diskgroups dismounted
ASM instance shutdown
SQL>
2) Change the ASMLIB config file to only look at the logical block size.
$ sudo /usr/sbin/oracleasm configure -b
Writing Oracle ASM library driver configuration: done
$ cat /etc/sysconfig/oracleasm
#
# This is a configuration file for automatic loading of the Oracle
# Automatic Storage Management library kernel driver. It is generated
# By running /etc/init.d/oracleasm configure. Please use that method
# to modify this file
#

# ORACLEASM_ENABLED: 'true' means to load the driver on boot.
ORACLEASM_ENABLED=true

# ORACLEASM_UID: Default user owning the /dev/oracleasm mount point.
ORACLEASM_UID=oragrid

# ORACLEASM_GID: Default group owning the /dev/oracleasm mount point.
ORACLEASM_GID=asmdba

# ORACLEASM_SCANBOOT: 'true' means scan for ASM disks on boot.
ORACLEASM_SCANBOOT=true

# ORACLEASM_SCANORDER: Matching patterns to order disk scanning
ORACLEASM_SCANORDER="mpath dm"

# ORACLEASM_SCANEXCLUDE: Matching patterns to exclude disks from scan
ORACLEASM_SCANEXCLUDE="sd"

# ORACLEASM_USE_LOGICAL_BLOCK_SIZE: 'true' means use the logical block size
# reported by the underlying disk instead of the physical. The default
# is 'false'
ORACLEASM_USE_LOGICAL_BLOCK_SIZE=true
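Before restarting anything, it can be worth confirming that the setting really landed in the file. The helper below is a minimal sketch of such a check; the function name `asmlib_uses_logical_block_size` is hypothetical, and it assumes the config file has the `KEY=value` layout shown above.

```shell
# Succeed only if the ASMLib config file enables the logical block size setting.
# $1: path to the config file (defaults to /etc/sysconfig/oracleasm)
asmlib_uses_logical_block_size() {
    conf=${1:-/etc/sysconfig/oracleasm}
    # Anchor at the start of the line so the explanatory '#' comment lines
    # that mention the variable name are not counted as a match.
    grep -Eq '^[[:space:]]*ORACLEASM_USE_LOGICAL_BLOCK_SIZE="?true"?[[:space:]]*$' "$conf"
}
```

For example, `asmlib_uses_logical_block_size && echo "logical block size in use"` prints the message only when the setting is true.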
3) Let’s restart ASMLib.
$ sudo /etc/init.d/oracleasm restart
Dropping Oracle ASMLib disks: [ OK ]
Shutting down the Oracle ASMLib driver: [ OK ]
Initializing the Oracle ASMLib driver: [ OK ]
Scanning the system for Oracle ASMLib disks: [ OK ]
4) We’ll need to recreate the disks, since they retain the 4K sector size in the disk header.
$ sudo oracleasm deletedisk DATA01
Clearing disk header: done
Dropping disk: done
$ sudo oracleasm createdisk DATA01 /dev/mapper/mpathdp1
Writing disk header: done
Instantiating disk: done
5) Let’s start up the ASM instance and create the diskgroup again.
SQL> startup
ASM instance started

Total System Global Area   71303168 bytes
Fixed Size                  1069292 bytes
Variable Size              45068052 bytes
ASM Cache                  25165824 bytes
ASM disk groups mounted

SQL> select name, sector_size from v$asm_disk;

NAME                           SECTOR_SIZE
------------------------------ -----------
DATA01                                 512

SQL> CREATE DISKGROUP DATA EXTERNAL REDUNDANCY
  DISK 'ORCL:DATA01' NAME DATA01
  ATTRIBUTE 'au_size'='4M', 'compatible.asm' = '11.2', 'compatible.rdbms' = '11.2';

Diskgroup created.

SQL> select name, sector_size from v$asm_diskgroup;

NAME                           SECTOR_SIZE
------------------------------ -----------
DATA                                   512
Running kfed against the disk shows that the sector size in the disk header is also 512 bytes.
$ kfed read DATA01
kfdhdb.secsize: 512 ; 0x0b8: 0x0200
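With more than one disk to check, picking the value out of the kfed output by eye gets tedious. Here is a rough sketch of a filter that does it, assuming the `kfdhdb.secsize:` line always has the layout shown above; the function name `kfed_secsize` is made up for this example.

```shell
# Read kfed output on stdin and print the decimal kfdhdb.secsize value.
# Returns non-zero if no kfdhdb.secsize line is present in the input.
kfed_secsize() {
    awk '
        $1 == "kfdhdb.secsize:" { print $2; found = 1; exit }
        END { exit !found }
    '
}
```

It would be used as `kfed read DATA01 | kfed_secsize`, which prints just the sector size.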
After taking care of all these steps and getting the ASM disks/diskgroups at a 512 byte sector size, our RMAN duplicate is 100% successful!