

Note: All posts take a practical approach and avoid lengthy theory. Everything has been tested on development servers. Please don’t try any post on a production server until you are sure of it.

Thursday, March 27, 2014

Restoring OCR and Vote (11gR2 Linux)

Restoring OCR from backup

Testing Scenario
The OCR is located on the +OCR_VOTE diskgroup, which was created with external redundancy.
We simulate the case where the OCR and voting files are corrupted, or the diskgroup that holds them has a problem.

Process:
1- Stop the cluster on both nodes and delete the ASM disk used for OCR and vote

[root@rac1 bin]# ./crsctl stop cluster -all

We will drop ASMDSK5, which is used by +OCR_VOTE, to test our scenario.

[root@rac1 sbin]# oracleasm deletedisk ASMDSK5
Clearing disk header: done
Dropping disk: done


[root@rac2 sbin]# oracleasm deletedisk ASMDSK5
Clearing disk header: done
Dropping disk: done


We will repartition /dev/sdf, the device behind ASMDSK5 (used by the +OCR_VOTE diskgroup),
so that the data on it, including the OCR and voting files, is lost.

[root@rac1 sbin]# fdisk /dev/sdf

2- Recreate ASMDSK5 on both nodes

[root@rac1 sbin]# oracleasm createdisk ASMDSK5 /dev/sdf1
[root@rac2 sbin]# oracleasm createdisk ASMDSK5 /dev/sdf1
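Before going further it can be worth confirming that the recreated label is visible on each node. The helper below is only a minimal sketch: the function name asm_disk_listed is hypothetical, and it assumes `oracleasm listdisks` prints one disk label per line.

```shell
# asm_disk_listed LABEL
# Reads `oracleasm listdisks` output on stdin (assumed: one label per line)
# and succeeds only if LABEL appears as a complete line.
asm_disk_listed() {
    grep -qxF -- "$1"
}

# Usage on each node, as root:
#   oracleasm listdisks | asm_disk_listed ASMDSK5 && echo "ASMDSK5 is visible"
```

The whole-line match means ASMDSK50 would not be mistaken for ASMDSK5.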

3- Start the cluster in exclusive mode
After a reboot the cluster processes will not all start, because the OCR cannot be located and read.
To do the maintenance we stop the cluster and restart it in exclusive mode.

If we try to stop the whole cluster, some services that have already started may not stop.
Because not all processes are stopped, disable cluster autostart and reboot the server to
clean up any pending processes.

[root@rac1 bin]# ./crsctl disable crs
CRS-4621: Oracle High Availability Services autostart is disabled.

[root@rac1 bin]# reboot

After the reboot the cluster does not start, because we disabled autostart. Start it in exclusive mode (as the root user)

[root@rac1 bin]# ./crsctl start crs -excl
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.gipcd' on 'rac1'
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac1'
CRS-2676: Start of 'ora.gipcd' on 'rac1' succeeded
CRS-2676: Start of 'ora.mdnsd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac1'
CRS-2676: Start of 'ora.gpnpd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac1'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac1'
CRS-2679: Attempting to clean 'ora.diskmon' on 'rac1'
CRS-2681: Clean of 'ora.diskmon' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.diskmon' on 'rac1'
CRS-2676: Start of 'ora.diskmon' on 'rac1' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'rac1'
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'rac1'
CRS-2676: Start of 'ora.ctssd' on 'rac1' succeeded
CRS-2676: Start of 'ora.drivers.acfs' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac1'
CRS-2676: Start of 'ora.asm' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'rac1'
CRS-2676: Start of 'ora.crsd' on 'rac1' succeeded
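The startup log above is long, and a resource that never reports success is easy to miss. The sketch below pairs each CRS-2672 (attempting to start) line with its CRS-2676 (start succeeded) line and prints whatever is left over. The function name pending_starts is hypothetical, and the parsing assumes the message layout shown in the log above.

```shell
# pending_starts
# Reads `crsctl start crs -excl` output on stdin and prints, as
# resource@node, every resource that has a CRS-2672 "Attempting to start"
# line but no matching CRS-2676 "Start of ... succeeded" line.
pending_starts() {
    awk -F"'" '
        /^CRS-2672:/ { attempted[$2 "@" $4] = 1 }
        /^CRS-2676:/ { delete attempted[$2 "@" $4] }
        END { for (r in attempted) print r }
    ' | sort
}

# Usage, as root (start.log is just an example file name):
#   ./crsctl start crs -excl 2>&1 | tee start.log
#   pending_starts < start.log
```

Empty output means every attempted start was followed by a success message.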



NOTE: Alternatively, you can stop the stack on each node and start it in exclusive mode with the commands below
# crsctl stop crs -f
# crsctl start crs -excl -nocrs

The '-nocrs' option, introduced with 11.2.0.2, prevents the start of the ora.crsd resource. On those releases it is
vital that this option is specified; otherwise the failure to start the ora.crsd resource will tear down
ora.cluster_interconnect.haip, which in turn will cause ASM to crash.
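Since '-nocrs' depends on the release, a small version check can decide which start command to use. This is a sketch only: the function name supports_nocrs is hypothetical, and the version string has to be supplied by you. For example, the SQL*Plus banner later in this post shows 11.2.0.1.0, which predates the option.

```shell
# supports_nocrs VERSION
# Succeeds if the dotted VERSION (e.g. 11.2.0.3.0) is at least 11.2.0.2,
# the release in which the -nocrs option was introduced.
supports_nocrs() {
    printf '%s\n' "$1" | awk -F. '{
        split("11.2.0.2", need, ".")
        for (i = 1; i <= 4; i++) {
            have = ($i == "") ? 0 : $i + 0   # missing components count as 0
            if (have > need[i] + 0) exit 0
            if (have < need[i] + 0) exit 1
        }
        exit 0
    }'
}

# Usage:
#   if supports_nocrs 11.2.0.3.0; then
#       crsctl start crs -excl -nocrs
#   else
#       crsctl start crs -excl
#   fi
```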

4- Create new Diskgroup for ocr and vote

Connect to SQL*Plus as the oracle user
[root@rac1 ~]# su - oracle
[oracle@rac1 ~]$ . ./grid_env
[oracle@rac1 ~]$ export ORACLE_SID=+ASM1
[oracle@rac1 ~]$ sqlplus / as sysasm


SQL*Plus: Release 11.2.0.1.0 Production on Thu Mar 27 00:46:59 2014
Copyright (c) 1982, 2009, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Real Application Clusters and Automatic Storage Management options

SQL> alter system set asm_diskgroups='DATA','FLASH';

System altered.

SQL> create diskgroup OCR_VOTE external redundancy disk '/dev/oracleasm/disks/ASMDSK5' ATTRIBUTE 'compatible.rdbms' = '11.2', 'compatible.asm' = '11.2';
Diskgroup created.

SQL> show parameter asm

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
asm_diskgroups                       string      DATA, FLASH, OCR_VOTE
asm_diskstring                       string      /dev/oracleasm/disks
asm_power_limit                      integer     1
asm_preferred_read_failure_groups    string

SQL> shutdown immediate;
ASM diskgroups volume disabled
ASM diskgroups dismounted
ASM instance shutdown

SQL> startup;
ASM instance started

Total System Global Area  284565504 bytes
Fixed Size                  1336036 bytes
Variable Size             258063644 bytes
ASM Cache                  25165824 bytes
ASM diskgroups mounted
ASM diskgroups volume enabled

SQL> select name,state from v$asm_diskgroup;

NAME                           STATE
------------------------------ -----------
DATA                           MOUNTED
FLASH                          MOUNTED
OCR_VOTE                       MOUNTED
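If you want to script the check on the diskgroup state, the query output above can be filtered for one diskgroup. A minimal sketch, assuming the two-column NAME/STATE layout shown above; the function name diskgroup_mounted is hypothetical.

```shell
# diskgroup_mounted NAME
# Reads the output of "select name,state from v$asm_diskgroup;" on stdin
# and succeeds only if diskgroup NAME is listed with state MOUNTED.
diskgroup_mounted() {
    awk -v dg="$1" '
        $1 == dg && $2 == "MOUNTED" { found = 1 }
        END { exit !found }
    '
}

# Usage, as the oracle user with the ASM environment set:
#   echo 'select name,state from v$asm_diskgroup;' |
#       sqlplus -S / as sysasm | diskgroup_mounted OCR_VOTE
```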


5- Restore OCR
First find the location of the OCR
$ cat /etc/oracle/ocr.loc 
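To use the OCR location in a script rather than reading it by eye, the value can be extracted from the file. This sketch assumes ocr.loc contains a line of the form ocrconfig_loc=<location>; the function name ocr_location is hypothetical.

```shell
# ocr_location [FILE]
# Prints the value of the ocrconfig_loc= line from FILE
# (default: /etc/oracle/ocr.loc).
ocr_location() {
    sed -n 's/^ocrconfig_loc=//p' "${1:-/etc/oracle/ocr.loc}"
}

# Usage:
#   ocr_location
```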

Locate the latest OCR backup
$ $GRID_HOME/bin/ocrconfig -showbackup

[root@rac1 bin]# ./ocrconfig -restore /u01/app/11.2.0/grid/cdata/racscan/backup_20140327_002335.ocr
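The backup path passed to -restore above has to be picked from the -showbackup listing. The sketch below selects the newest entry. It assumes each backup line has the form "node YYYY/MM/DD HH:MM:SS path" (no such listing is shown in this post, so check it against your own output), and the function name latest_ocr_backup is hypothetical.

```shell
# latest_ocr_backup
# Reads `ocrconfig -showbackup` output on stdin and prints the path of the
# most recent backup. Assumed line format: node  YYYY/MM/DD  HH:MM:SS  path
# Lines that do not match (headers, PROT- messages) are ignored.
latest_ocr_backup() {
    awk 'NF == 4 && $2 ~ /^[0-9][0-9][0-9][0-9]\/[0-9][0-9]\/[0-9][0-9]$/ {
             print $2, $3, $4
         }' |
        LC_ALL=C sort | tail -n 1 | awk '{ print $3 }'
}

# Usage, as root:
#   backup=$($GRID_HOME/bin/ocrconfig -showbackup | latest_ocr_backup)
#   echo "Would restore from: $backup"
```

Because the date and time are zero-padded, a plain text sort puts the newest backup last.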

Verify that OCR is restored using ocrcheck

[root@rac1 bin]# ./ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       2748
         Available space (kbytes) :     259372
         ID                       : 1499687051
         Device/File Name         :  +OCR_VOTE
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check succeeded
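The two lines that matter in the ocrcheck output are the integrity check and the logical corruption check. A minimal sketch that turns them into an exit status follows; the function name ocrcheck_ok is hypothetical, and the matched strings are the ones shown in the output above.

```shell
# ocrcheck_ok
# Reads `ocrcheck` output on stdin and succeeds only if both the cluster
# registry integrity check and the logical corruption check succeeded.
ocrcheck_ok() {
    awk '
        /Cluster registry integrity check succeeded/ { integrity = 1 }
        /Logical corruption check succeeded/         { logical = 1 }
        END { exit !(integrity && logical) }
    '
}

# Usage, as root:
#   ./ocrcheck | ocrcheck_ok && echo "OCR looks healthy"
```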

6- Initialize the voting disk
Replace the voting disk so that it is re-created in the +OCR_VOTE diskgroup
[root@rac1 bin]# ./crsctl replace votedisk +OCR_VOTE
Successful addition of voting disk 324b6b7134544f73bfb716c42f0f21c1.
Successful deletion of voting disk 0c1f71f3e5184f79bf79b85c77a79658.
Successfully replaced voting disk group with +OCR_VOTE.
CRS-4266: Voting file(s) successfully replaced

[root@rac1 bin]#

[root@rac1 bin]# ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   324b6b7134544f73bfb716c42f0f21c1 (/dev/oracleasm/disks/ASMDSK5) [OCR_VOTE]
Located 1 voting disk(s).
[root@rac1 bin]#
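To check the voting disk state from a script, the listing above can be reduced to a count of ONLINE entries. This sketch assumes the numbered-row layout shown above; the function name online_votedisks is hypothetical.

```shell
# online_votedisks
# Reads `crsctl query css votedisk` output on stdin and prints how many
# numbered rows are in state ONLINE.
online_votedisks() {
    awk '$1 ~ /^[0-9]+\.$/ && $2 == "ONLINE" { n++ } END { print n + 0 }'
}

# Usage, as root:
#   ./crsctl query css votedisk | online_votedisks
```

With external redundancy, as in this scenario, a single online voting disk is expected.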

7- Enable and start CRS

[root@rac1 bin]# ./crsctl enable crs
CRS-4622: Oracle High Availability Services autostart is enabled.
[root@rac1 bin]# ./crsctl start crs

Alternatively, reboot both nodes; the cluster should now start.

NOTE: You can also stop CRS on each node and start it again, as below
# crsctl stop crs -f
# crsctl start crs
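After the restart it helps to confirm that the whole stack is up before handing the cluster back. The sketch below assumes `crsctl check crs` prints one CRS- message per service, each ending in "is online" when healthy; the function name crs_all_online is hypothetical.

```shell
# crs_all_online
# Reads `crsctl check crs` output on stdin and succeeds only if there is at
# least one CRS- message and every CRS- message ends with "is online".
crs_all_online() {
    awk '
        /^CRS-/ { total++; if ($0 ~ /is online[ \t\r]*$/) up++ }
        END { exit !(total > 0 && total == up) }
    '
}

# Usage on each node, as root:
#   crsctl check crs | crs_all_online && echo "stack is online"
```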
