03 August, 2014

GI Commands : 2 -- Managing the Local and Cluster Registry

In 11gR2

There are essentially two registries, the Oracle Local Registry (OLR) and the Oracle Cluster Registry (OCR).

Let's check the Local Registry :

[root@node1 ~]# cat /etc/oracle/olr.loc
olrconfig_loc=/u01/app/grid/11.2.0/cdata/node1.olr
crs_home=/u01/app/grid/11.2.0
[root@node1 ~]# 
[root@node1 ~]# file /u01/app/grid/11.2.0/cdata/node1.olr
/u01/app/grid/11.2.0/cdata/node1.olr: data
[root@node1 ~]# 

So, like the Cluster Registry, the Local Registry is a binary file.  It is on a local filesystem on the node, not on ASM/NFS/CFS.  Each node in the cluster has its own Local Registry.
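
For comparison, the Cluster Registry locations on Linux are recorded in /etc/oracle/ocr.loc. The contents below are illustrative of the format only (additional keys appear when multiple OCR locations are configured) :

[root@node1 ~]# cat /etc/oracle/ocr.loc
ocrconfig_loc=+DATA
local_only=FALSE
[root@node1 ~]# 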

The Local Registry can be checked for consistency (corruption) using ocrcheck with the "-local" flag.  Note : As demonstrated in my previous post, the root account must be used for the check.

[root@node1 ~]# su - grid
-sh-3.2$ su 
Password: 
[root@node1 grid]# ocrcheck -local
Status of Oracle Local Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       2696
         Available space (kbytes) :     259424
         ID                       : 1388021147
         Device/File Name         : /u01/app/grid/11.2.0/cdata/node1.olr
                                    Device/File integrity check succeeded

         Local registry integrity check succeeded

         Logical corruption check succeeded

[root@node1 grid]# 
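
Beyond checking it, the OLR can also be exported to a file as a logical backup, again as root. A quick sketch (the target path here is arbitrary and the command is silent on success) :

[root@node1 grid]# ocrconfig -local -export /root/olr_node1.bkp
[root@node1 grid]# 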

Now let's look at the Cluster Registry :

[root@node1 grid]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3668
         Available space (kbytes) :     258452
         ID                       :  605940771
         Device/File Name         :      +DATA
                                    Device/File integrity check succeeded
         Device/File Name         : /fra/ocrfile
                                    Device/File integrity check succeeded
         Device/File Name         :       +FRA
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check succeeded

[root@node1 grid]# 

The Cluster Registry is distributed across two ASM DiskGroups (+DATA and +FRA) and one filesystem location (/fra/ocrfile). Yes, this is a special configuration that I've deliberately created to distribute the OCR in this manner -- 11gR2 supports up to five OCR locations, which is why ocrcheck shows five slots, with the unused ones reported as "Device/File not configured".
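
To list just the configured OCR locations without running the integrity checks, ocrcheck also accepts a -config flag. The output below is abbreviated and illustrative :

[root@node1 grid]# ocrcheck -config
Oracle Cluster Registry configuration is :
         Device/File Name         :      +DATA
         Device/File Name         : /fra/ocrfile
         Device/File Name         :       +FRA
[root@node1 grid]# 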

I cannot add the OCR to a location which is an ASM diskgroup with a lower compatible.asm setting.

[root@node1 grid]# ocrconfig -add +DATA2
PROT-30: The Oracle Cluster Registry location to be added is not accessible
PROC-8: Cannot perform cluster registry operation because one of the parameters is invalid.
ORA-15056: additional error message
ORA-17502: ksfdcre:4 Failed to create file +DATA2.255.1
ORA-15221: ASM operation requires compatible.asm of 11.1.0.0.0 or higher
ORA-06512: at line 4

[root@node1 grid]# 
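
Had I wanted +DATA2 to be eligible, the fix would be to raise that diskgroup's compatible.asm attribute from the ASM instance. A sketch (remember that compatibility attributes can be raised but never lowered) :

[grid@node1 ~]$ sqlplus / as sysasm

SQL> select name, compatibility from v$asm_diskgroup where name = 'DATA2';
SQL> alter diskgroup DATA2 set attribute 'compatible.asm' = '11.2.0.0.0';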

I now remove the filesystem copy of the OCR.

[root@node1 grid]# ocrconfig -delete /fra/ocrfile
[root@node1 grid]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3668
         Available space (kbytes) :     258452
         ID                       :  605940771
         Device/File Name         :      +DATA
                                    Device/File integrity check succeeded
         Device/File Name         :       +FRA
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check succeeded

[root@node1 grid]# 

Note, however, that ocrconfig -delete doesn't actually remove the filesystem file that I had created -- it only deregisters the location. The file itself must be removed manually.

[root@node1 grid]# ls -l /fra/ocrfile
-rw-r--r-- 1 root root 272756736 Aug  3 21:27 /fra/ocrfile
[root@node1 grid]# rm /fra/ocrfile
rm: remove regular file `/fra/ocrfile'? yes
[root@node1 grid]# 

I will now add a filesystem location for the OCR.

[root@node1 grid]# touch /fra/new_ocrfile
[root@node1 grid]# ocrconfig -add /fra/new_ocrfile
[root@node1 grid]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3668
         Available space (kbytes) :     258452
         ID                       :  605940771
         Device/File Name         :      +DATA
                                    Device/File integrity check succeeded
         Device/File Name         :       +FRA
                                    Device/File integrity check succeeded
         Device/File Name         : /fra/new_ocrfile
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check succeeded

[root@node1 grid]# ls -l /fra/new_ocrfile
-rw-r--r-- 1 root root 272756736 Aug  3 21:30 /fra/new_ocrfile
[root@node1 grid]# 
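
Incidentally, rather than a delete-then-add, 11.2 also allows replacing one OCR location with another in a single step, provided at least one other OCR location remains online. A sketch (not executed here) :

[root@node1 grid]# ocrconfig -replace /fra/ocrfile -replacement /fra/new_ocrfile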

What about OCR Backups ?  (Note : Oracle does frequent automatic backups of the OCR, but *not* of the OLR).
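
Since the OLR is not backed up automatically (it is normally backed up only during installation or upgrade), a manual OLR backup is worth taking. A sketch, again as root -- the backup should appear as a timestamped .olr file under $GRID_HOME/cdata/<hostname>/ unless the default backup location has been changed :

[root@node1 grid]# ocrconfig -local -manualbackup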
N.B. : The listing below doesn't show the full set of automatic backups you'd normally expect (4-hourly, daily and weekly) because my cluster isn't running continuously through all the days.

[root@node1 grid]# ocrconfig -showbackup

node1     2014/07/06 21:53:25     /u01/app/grid/11.2.0/cdata/rac/backup00.ocr

node1     2011/10/22 03:09:03     /u01/app/grid/11.2.0/cdata/rac/backup01.ocr

node1     2011/10/21 23:06:39     /u01/app/grid/11.2.0/cdata/rac/backup02.ocr

node1     2014/07/06 21:53:25     /u01/app/grid/11.2.0/cdata/rac/day.ocr

node1     2014/07/06 21:53:25     /u01/app/grid/11.2.0/cdata/rac/week.ocr

node1     2014/07/06 22:39:55     /u01/app/grid/11.2.0/cdata/rac/backup_20140706_223955.ocr

node1     2014/07/05 17:30:25     /u01/app/grid/11.2.0/cdata/rac/backup_20140705_173025.ocr

node1     2014/06/16 22:15:07     /u01/app/grid/11.2.0/cdata/rac/backup_20140616_221507.ocr

node1     2014/06/16 22:14:05     /u01/app/grid/11.2.0/cdata/rac/backup_20140616_221405.ocr

node1     2011/11/09 23:20:25     /u01/app/grid/11.2.0/cdata/rac/backup_20111109_232025.ocr
[root@node1 grid]# 
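
Any of these physical backups can be fed to ocrconfig -restore to recover the OCR. That requires the clusterware stack to be down on all nodes first, so the command below is a sketch only and was not executed here :

[root@node1 grid]# ocrconfig -restore /u01/app/grid/11.2.0/cdata/rac/backup00.ocr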

Let me run an additional backup from node2.

[root@node2 grid]# ocrconfig -manualbackup

node1     2014/08/03 21:37:17     /u01/app/grid/11.2.0/cdata/rac/backup_20140803_213717.ocr

node1     2014/07/06 22:39:55     /u01/app/grid/11.2.0/cdata/rac/backup_20140706_223955.ocr

node1     2014/07/05 17:30:25     /u01/app/grid/11.2.0/cdata/rac/backup_20140705_173025.ocr

node1     2014/06/16 22:15:07     /u01/app/grid/11.2.0/cdata/rac/backup_20140616_221507.ocr

node1     2014/06/16 22:14:05     /u01/app/grid/11.2.0/cdata/rac/backup_20140616_221405.ocr
[root@node2 grid]# 

We can see that the backup done today (03-Aug) is listed at the top.  Let's check the listing from node1 :

[root@node1 grid]# ocrconfig -showbackup

node1     2014/07/06 21:53:25     /u01/app/grid/11.2.0/cdata/rac/backup00.ocr

node1     2011/10/22 03:09:03     /u01/app/grid/11.2.0/cdata/rac/backup01.ocr

node1     2011/10/21 23:06:39     /u01/app/grid/11.2.0/cdata/rac/backup02.ocr

node1     2014/07/06 21:53:25     /u01/app/grid/11.2.0/cdata/rac/day.ocr

node1     2014/07/06 21:53:25     /u01/app/grid/11.2.0/cdata/rac/week.ocr

node1     2014/08/03 21:37:17     /u01/app/grid/11.2.0/cdata/rac/backup_20140803_213717.ocr

node1     2014/07/06 22:39:55     /u01/app/grid/11.2.0/cdata/rac/backup_20140706_223955.ocr

node1     2014/07/05 17:30:25     /u01/app/grid/11.2.0/cdata/rac/backup_20140705_173025.ocr

node1     2014/06/16 22:15:07     /u01/app/grid/11.2.0/cdata/rac/backup_20140616_221507.ocr

node1     2014/06/16 22:14:05     /u01/app/grid/11.2.0/cdata/rac/backup_20140616_221405.ocr
[root@node1 grid]# 

Yes, the backup of 03-Aug is also listed.  But, wait ! Why is it on node1 ?  Let's go back to node2 and do a filesystem listing.

[root@node2 grid]# ls -l /u01/app/grid/11.2.0/cdata/rac/backup*
ls: /u01/app/grid/11.2.0/cdata/rac/backup*: No such file or directory
[root@node2 grid]#

So, although the backup appears in the listing on both nodes, the file itself doesn't exist on node2.  Let's check node1 :

[root@node1 grid]# ls -lt /u01/app/grid/11.2.0/cdata/rac/
total 114316
-rw------- 1 root root 8024064 Aug  3 21:37 backup_20140803_213717.ocr
-rw------- 1 root root 8003584 Jul  6 22:39 backup_20140706_223955.ocr
-rw------- 1 root root 8003584 Jul  6 21:53 day.ocr
-rw------- 1 root root 8003584 Jul  6 21:53 week.ocr
-rw------- 1 root root 8003584 Jul  6 21:53 backup00.ocr
-rw------- 1 root root 8003584 Jul  5 17:30 backup_20140705_173025.ocr
-rw------- 1 root root 7708672 Jun 16 22:15 backup_20140616_221507.ocr
-rw------- 1 root root 7708672 Jun 16 22:14 backup_20140616_221405.ocr
-rw------- 1 root root 7688192 Nov  9  2011 backup_20111109_232025.ocr
-rw------- 1 root root 7667712 Nov  9  2011 backup_20111109_230940.ocr
-rw------- 1 root root 7647232 Nov  9  2011 backup_20111109_230916.ocr
-rw------- 1 root root 7626752 Nov  9  2011 backup_20111109_224725.ocr
-rw------- 1 root root 7598080 Nov  9  2011 backup_20111109_222941.ocr
-rw------- 1 root root 7593984 Oct 22  2011 backup01.ocr
-rw------- 1 root root 7593984 Oct 21  2011 backup02.ocr
[root@node1 grid]# 

Yes, *ALL* the OCR backups to date have been created on node1 -- even when the backup command was executed from node2.  node1 remains the "master" node for OCR backups as long as it is up and running.  To verify this, I shut down Grid Infrastructure on node1.

[root@node1 grid]# crsctl stop crs
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'node1'
CRS-2673: Attempting to stop 'ora.crsd' on 'node1'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'node1'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN2.lsnr' on 'node1'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'node1'
CRS-2673: Attempting to stop 'ora.racdb.new_svc.svc' on 'node1'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN3.lsnr' on 'node1'
CRS-2673: Attempting to stop 'ora.cvu' on 'node1'
CRS-2673: Attempting to stop 'ora.oc4j' on 'node1'
CRS-2673: Attempting to stop 'ora.gns' on 'node1'
CRS-2677: Stop of 'ora.cvu' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.cvu' on 'node2'
CRS-2677: Stop of 'ora.LISTENER_SCAN3.lsnr' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.scan3.vip' on 'node1'
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'node1' succeeded
CRS-2677: Stop of 'ora.scan3.vip' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.scan3.vip' on 'node2'
CRS-2677: Stop of 'ora.LISTENER_SCAN2.lsnr' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.scan2.vip' on 'node1'
CRS-2677: Stop of 'ora.scan2.vip' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.scan2.vip' on 'node2'
CRS-2676: Start of 'ora.cvu' on 'node2' succeeded
CRS-2677: Stop of 'ora.racdb.new_svc.svc' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.node1.vip' on 'node1'
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'node1'
CRS-2673: Attempting to stop 'ora.registry.acfs' on 'node1'
CRS-2673: Attempting to stop 'ora.racdb.db' on 'node1'
CRS-2677: Stop of 'ora.node1.vip' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.node1.vip' on 'node2'
CRS-2676: Start of 'ora.scan3.vip' on 'node2' succeeded
CRS-2672: Attempting to start 'ora.LISTENER_SCAN3.lsnr' on 'node2'
CRS-2676: Start of 'ora.scan2.vip' on 'node2' succeeded
CRS-2677: Stop of 'ora.registry.acfs' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.LISTENER_SCAN2.lsnr' on 'node2'
CRS-2677: Stop of 'ora.gns' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.gns.vip' on 'node1'
CRS-2677: Stop of 'ora.gns.vip' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.gns.vip' on 'node2'
CRS-2676: Start of 'ora.node1.vip' on 'node2' succeeded
CRS-2676: Start of 'ora.gns.vip' on 'node2' succeeded
CRS-2672: Attempting to start 'ora.gns' on 'node2'
CRS-2676: Start of 'ora.LISTENER_SCAN3.lsnr' on 'node2' succeeded
CRS-2676: Start of 'ora.LISTENER_SCAN2.lsnr' on 'node2' succeeded
CRS-2677: Stop of 'ora.racdb.db' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.DATA1.dg' on 'node1'
CRS-2673: Attempting to stop 'ora.DATA2.dg' on 'node1'
CRS-2673: Attempting to stop 'ora.FRA.dg' on 'node1'
CRS-2676: Start of 'ora.gns' on 'node2' succeeded
CRS-2677: Stop of 'ora.DATA1.dg' on 'node1' succeeded
CRS-2677: Stop of 'ora.DATA2.dg' on 'node1' succeeded
CRS-2677: Stop of 'ora.oc4j' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.oc4j' on 'node2'
CRS-2676: Start of 'ora.oc4j' on 'node2' succeeded
CRS-2677: Stop of 'ora.DATA.dg' on 'node1' succeeded
CRS-2677: Stop of 'ora.FRA.dg' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'node1'
CRS-2677: Stop of 'ora.asm' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.ons' on 'node1'
CRS-2677: Stop of 'ora.ons' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'node1'
CRS-2677: Stop of 'ora.net1.network' on 'node1' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'node1' has completed
CRS-2677: Stop of 'ora.crsd' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.mdnsd' on 'node1'
CRS-2673: Attempting to stop 'ora.crf' on 'node1'
CRS-2673: Attempting to stop 'ora.ctssd' on 'node1'
CRS-2673: Attempting to stop 'ora.evmd' on 'node1'
CRS-2673: Attempting to stop 'ora.asm' on 'node1'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'node1'
CRS-2677: Stop of 'ora.crf' on 'node1' succeeded
CRS-2677: Stop of 'ora.asm' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'node1'
CRS-2677: Stop of 'ora.evmd' on 'node1' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'node1' succeeded
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'node1' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'node1'
CRS-2677: Stop of 'ora.drivers.acfs' on 'node1' succeeded
CRS-2677: Stop of 'ora.cssd' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'node1'
CRS-2673: Attempting to stop 'ora.diskmon' on 'node1'
CRS-2677: Stop of 'ora.gipcd' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'node1'
CRS-2677: Stop of 'ora.gpnpd' on 'node1' succeeded
CRS-2677: Stop of 'ora.diskmon' on 'node1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'node1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@node1 grid]# 

So, all the Grid Infrastructure services are down on node1.  I will now run an OCR backup from node2 and verify its location.

[root@node2 grid]# ocrconfig -manualbackup

node2     2014/08/03 21:49:02     /u01/app/grid/11.2.0/cdata/rac/backup_20140803_214902.ocr

node1     2014/08/03 21:37:17     /u01/app/grid/11.2.0/cdata/rac/backup_20140803_213717.ocr

node1     2014/07/06 22:39:55     /u01/app/grid/11.2.0/cdata/rac/backup_20140706_223955.ocr

node1     2014/07/05 17:30:25     /u01/app/grid/11.2.0/cdata/rac/backup_20140705_173025.ocr

node1     2014/06/16 22:15:07     /u01/app/grid/11.2.0/cdata/rac/backup_20140616_221507.ocr
[root@node2 grid]# ls -l /u01/app/grid/11.2.0/cdata/rac/backup*
-rw------- 1 root root 8024064 Aug  3 21:49 /u01/app/grid/11.2.0/cdata/rac/backup_20140803_214902.ocr
[root@node2 grid]# 

Yes, this time the backup was created on node2.
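
This behaviour -- backups accumulating on whichever node is currently the master for OCR backups -- is why it is commonly recommended to point the backup location at shared storage, so that backups survive the loss of any single node. A sketch, assuming /fra is visible to all nodes :

[root@node2 grid]# ocrconfig -backuploc /fra/ocrbackups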

Question : Would there have been a way to create a backup on node2 without shutting down node1 ?

.
.
.

3 comments:

Maaz Anjum said...

Hemant,

This is a great post. On several occasions while teaching customers about RAC, specifically GI, I'm asked how the backups for the OCR are set up. Once I explain how it works, the next question is usually about the master node.

Your post shows how a new "master" node is automatically configured when the current one is unavailable - something I will point my future students to for sure!

Cheers,
Maaz

Anonymous said...

ocrconfig -export ?
or evict master node manually and rejoin again ?

-Nitin

EXADATASUPPORT said...

Great post .. Thank you

Krishna