In 11gR2
There are essentially two registries : the Local Registry (OLR) and the Cluster Registry (OCR).
Let's check the Local Registry :
[root@node1 ~]# cat /etc/oracle/olr.loc
olrconfig_loc=/u01/app/grid/11.2.0/cdata/node1.olr
crs_home=/u01/app/grid/11.2.0
[root@node1 ~]#
[root@node1 ~]# file /u01/app/grid/11.2.0/cdata/node1.olr
/u01/app/grid/11.2.0/cdata/node1.olr: data
[root@node1 ~]#
So, like the Cluster Registry, the Local Registry is a binary file. It is on a local filesystem on the node, not on ASM/NFS/CFS. Each node in the cluster has its own Local Registry.
The Local Registry can be checked for consistency (corruption) using ocrcheck with the "-local" flag. Note : As demonstrated in my previous post, the root account must be used for the check.
[root@node1 ~]# su - grid
-sh-3.2$ su
Password:
[root@node1 grid]# ocrcheck -local
Status of Oracle Local Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       2696
         Available space (kbytes) :     259424
         ID                       : 1388021147
         Device/File Name         : /u01/app/grid/11.2.0/cdata/node1.olr
                                    Device/File integrity check succeeded

         Local registry integrity check succeeded

         Logical corruption check succeeded

[root@node1 grid]#
Now let's look at the Cluster Registry :
[root@node1 grid]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3668
         Available space (kbytes) :     258452
         ID                       :  605940771
         Device/File Name         :      +DATA
                                    Device/File integrity check succeeded
         Device/File Name         : /fra/ocrfile
                                    Device/File integrity check succeeded
         Device/File Name         :       +FRA
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check succeeded

[root@node1 grid]#
The Cluster Registry is distributed across two ASM DiskGroups (+DATA and +FRA) and one filesystem (/fra/ocrfile). Yes, this is a special case that I've created to distribute the OCR in this manner.
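As an aside, the configured OCR locations can also be cross-checked without running the full integrity checks. A minimal sketch (on Linux the registered locations are recorded in /etc/oracle/ocr.loc, analogous to olr.loc; output is omitted here since it would simply repeat the locations shown above) :

[root@node1 grid]# cat /etc/oracle/ocr.loc        # lists the registered OCR locations (ocrconfig_loc entries)
[root@node1 grid]# ocrcheck -config               # shows only the configured OCR locations, skipping the integrity checks
[root@node1 grid]# ocrcheck -config -local        # the same, for the Local Registry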
I cannot add an OCR copy to a location that is an ASM DiskGroup with a lower compatible.asm attribute (it must be at least 11.1.0.0.0, as the error below shows).
[root@node1 grid]# ocrconfig -add +DATA2
PROT-30: The Oracle Cluster Registry location to be added is not accessible
PROC-8: Cannot perform cluster registry operation because one of the parameters is invalid.
ORA-15056: additional error message
ORA-17502: ksfdcre:4 Failed to create file +DATA2.255.1
ORA-15221: ASM operation requires compatible.asm of 11.1.0.0.0 or higher
ORA-06512: at line 4
[root@node1 grid]#
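If I really did want +DATA2 to hold a copy of the OCR, its compatible.asm attribute would first have to be raised to at least 11.1.0.0.0 (per the error above). A minimal sketch of how that could be done, assuming the DiskGroup is named DATA2 and has space available; note that compatible.asm can only be raised, never lowered :

[root@node1 grid]# su - grid
-sh-3.2$ asmcmd setattr -G DATA2 compatible.asm 11.2.0.0.0     # raise the attribute (irreversible)
-sh-3.2$ asmcmd lsattr -l -G DATA2 compatible.asm              # confirm the new setting
-sh-3.2$ exit
[root@node1 grid]# ocrconfig -add +DATA2                       # the add should then succeed, subject to the five-location limit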
I now remove the filesystem copy of the OCR.
[root@node1 grid]# ocrconfig -delete /fra/ocrfile
[root@node1 grid]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3668
         Available space (kbytes) :     258452
         ID                       :  605940771
         Device/File Name         :      +DATA
                                    Device/File integrity check succeeded
         Device/File Name         :       +FRA
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check succeeded

[root@node1 grid]#
Note, however, that ocrconfig -delete doesn't actually remove the filesystem file that I had created ; I have to remove it myself.
[root@node1 grid]# ls -l /fra/ocrfile
-rw-r--r-- 1 root root 272756736 Aug 3 21:27 /fra/ocrfile
[root@node1 grid]# rm /fra/ocrfile
rm: remove regular file `/fra/ocrfile'? yes
[root@node1 grid]#
I will now add a filesystem location for the OCR.
[root@node1 grid]# touch /fra/new_ocrfile
[root@node1 grid]# ocrconfig -add /fra/new_ocrfile
[root@node1 grid]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3668
         Available space (kbytes) :     258452
         ID                       :  605940771
         Device/File Name         :      +DATA
                                    Device/File integrity check succeeded
         Device/File Name         :       +FRA
                                    Device/File integrity check succeeded
         Device/File Name         : /fra/new_ocrfile
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check succeeded

[root@node1 grid]# ls -l /fra/new_ocrfile
-rw-r--r-- 1 root root 272756736 Aug 3 21:30 /fra/new_ocrfile
[root@node1 grid]#
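Note that I did this as a -delete followed by an -add. If the goal is simply to move an OCR copy from one location to another, 11.2 also provides a single-step alternative; a hedged sketch (run as root, with Clusterware up) :

[root@node1 grid]# ocrconfig -replace /fra/ocrfile -replacement /fra/new_ocrfile    # swap one configured OCR location for another in one step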
What about OCR Backups ? (Note : Oracle does frequent automatic backups of the OCR, but *not* of the OLR).
N.B. : This listing doesn't show all the OCR backups you'd expect because I don't have my cluster running continuously through all the days.
[root@node1 grid]# ocrconfig -showbackup

node1     2014/07/06 21:53:25     /u01/app/grid/11.2.0/cdata/rac/backup00.ocr
node1     2011/10/22 03:09:03     /u01/app/grid/11.2.0/cdata/rac/backup01.ocr
node1     2011/10/21 23:06:39     /u01/app/grid/11.2.0/cdata/rac/backup02.ocr
node1     2014/07/06 21:53:25     /u01/app/grid/11.2.0/cdata/rac/day.ocr
node1     2014/07/06 21:53:25     /u01/app/grid/11.2.0/cdata/rac/week.ocr
node1     2014/07/06 22:39:55     /u01/app/grid/11.2.0/cdata/rac/backup_20140706_223955.ocr
node1     2014/07/05 17:30:25     /u01/app/grid/11.2.0/cdata/rac/backup_20140705_173025.ocr
node1     2014/06/16 22:15:07     /u01/app/grid/11.2.0/cdata/rac/backup_20140616_221507.ocr
node1     2014/06/16 22:14:05     /u01/app/grid/11.2.0/cdata/rac/backup_20140616_221405.ocr
node1     2011/11/09 23:20:25     /u01/app/grid/11.2.0/cdata/rac/backup_20111109_232025.ocr
[root@node1 grid]#
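Also note that ocrconfig -showbackup lists only OCR backups; as mentioned above, the OLR is never backed up automatically. If you want an OLR backup, you have to take it yourself. A minimal sketch, using the -local flag (run as root on the node whose OLR is to be backed up) :

[root@node1 grid]# ocrconfig -local -manualbackup        # take a manual backup of this node's OLR
[root@node1 grid]# ocrconfig -local -showbackup          # list the OLR backups taken so far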
Let me run an additional backup from node2.
[root@node2 grid]# ocrconfig -manualbackup

node1     2014/08/03 21:37:17     /u01/app/grid/11.2.0/cdata/rac/backup_20140803_213717.ocr
node1     2014/07/06 22:39:55     /u01/app/grid/11.2.0/cdata/rac/backup_20140706_223955.ocr
node1     2014/07/05 17:30:25     /u01/app/grid/11.2.0/cdata/rac/backup_20140705_173025.ocr
node1     2014/06/16 22:15:07     /u01/app/grid/11.2.0/cdata/rac/backup_20140616_221507.ocr
node1     2014/06/16 22:14:05     /u01/app/grid/11.2.0/cdata/rac/backup_20140616_221405.ocr
[root@node2 grid]#
We can see that the backup done today (03-Aug) is listed at the top. Let's check a listing from node1 :
[root@node1 grid]# ocrconfig -showbackup

node1     2014/07/06 21:53:25     /u01/app/grid/11.2.0/cdata/rac/backup00.ocr
node1     2011/10/22 03:09:03     /u01/app/grid/11.2.0/cdata/rac/backup01.ocr
node1     2011/10/21 23:06:39     /u01/app/grid/11.2.0/cdata/rac/backup02.ocr
node1     2014/07/06 21:53:25     /u01/app/grid/11.2.0/cdata/rac/day.ocr
node1     2014/07/06 21:53:25     /u01/app/grid/11.2.0/cdata/rac/week.ocr
node1     2014/08/03 21:37:17     /u01/app/grid/11.2.0/cdata/rac/backup_20140803_213717.ocr
node1     2014/07/06 22:39:55     /u01/app/grid/11.2.0/cdata/rac/backup_20140706_223955.ocr
node1     2014/07/05 17:30:25     /u01/app/grid/11.2.0/cdata/rac/backup_20140705_173025.ocr
node1     2014/06/16 22:15:07     /u01/app/grid/11.2.0/cdata/rac/backup_20140616_221507.ocr
node1     2014/06/16 22:14:05     /u01/app/grid/11.2.0/cdata/rac/backup_20140616_221405.ocr
[root@node1 grid]#
Yes, the backup of 03-Aug is also listed. But, wait ! Why is it on node1 ? Let's go back to node2 and do a filesystem listing.
[root@node2 grid]# ls -l /u01/app/grid/11.2.0/cdata/rac/backup*
ls: /u01/app/grid/11.2.0/cdata/rac/backup*: No such file or directory
[root@node2 grid]#
Yes, as we suspected : the backup doesn't really exist on node2. Here is the full listing of the backup directory on node1 :
[root@node1 grid]# ls -lt /u01/app/grid/11.2.0/cdata/rac/
total 114316
-rw------- 1 root root 8024064 Aug 3 21:37 backup_20140803_213717.ocr
-rw------- 1 root root 8003584 Jul 6 22:39 backup_20140706_223955.ocr
-rw------- 1 root root 8003584 Jul 6 21:53 day.ocr
-rw------- 1 root root 8003584 Jul 6 21:53 week.ocr
-rw------- 1 root root 8003584 Jul 6 21:53 backup00.ocr
-rw------- 1 root root 8003584 Jul 5 17:30 backup_20140705_173025.ocr
-rw------- 1 root root 7708672 Jun 16 22:15 backup_20140616_221507.ocr
-rw------- 1 root root 7708672 Jun 16 22:14 backup_20140616_221405.ocr
-rw------- 1 root root 7688192 Nov 9 2011 backup_20111109_232025.ocr
-rw------- 1 root root 7667712 Nov 9 2011 backup_20111109_230940.ocr
-rw------- 1 root root 7647232 Nov 9 2011 backup_20111109_230916.ocr
-rw------- 1 root root 7626752 Nov 9 2011 backup_20111109_224725.ocr
-rw------- 1 root root 7598080 Nov 9 2011 backup_20111109_222941.ocr
-rw------- 1 root root 7593984 Oct 22 2011 backup01.ocr
-rw------- 1 root root 7593984 Oct 21 2011 backup02.ocr
[root@node1 grid]#
Yes, *ALL* the OCR backups to date have been created on node1 -- even when executed from node2. node1 is still the "master" node for OCR backups as long as it is up and running. I will now shut down Grid Infrastructure on node1.
[root@node1 grid]# crsctl stop crs
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'node1'
CRS-2673: Attempting to stop 'ora.crsd' on 'node1'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'node1'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN2.lsnr' on 'node1'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'node1'
CRS-2673: Attempting to stop 'ora.racdb.new_svc.svc' on 'node1'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN3.lsnr' on 'node1'
CRS-2673: Attempting to stop 'ora.cvu' on 'node1'
CRS-2673: Attempting to stop 'ora.oc4j' on 'node1'
CRS-2673: Attempting to stop 'ora.gns' on 'node1'
CRS-2677: Stop of 'ora.cvu' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.cvu' on 'node2'
CRS-2677: Stop of 'ora.LISTENER_SCAN3.lsnr' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.scan3.vip' on 'node1'
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'node1' succeeded
CRS-2677: Stop of 'ora.scan3.vip' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.scan3.vip' on 'node2'
CRS-2677: Stop of 'ora.LISTENER_SCAN2.lsnr' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.scan2.vip' on 'node1'
CRS-2677: Stop of 'ora.scan2.vip' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.scan2.vip' on 'node2'
CRS-2676: Start of 'ora.cvu' on 'node2' succeeded
CRS-2677: Stop of 'ora.racdb.new_svc.svc' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.node1.vip' on 'node1'
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'node1'
CRS-2673: Attempting to stop 'ora.registry.acfs' on 'node1'
CRS-2673: Attempting to stop 'ora.racdb.db' on 'node1'
CRS-2677: Stop of 'ora.node1.vip' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.node1.vip' on 'node2'
CRS-2676: Start of 'ora.scan3.vip' on 'node2' succeeded
CRS-2672: Attempting to start 'ora.LISTENER_SCAN3.lsnr' on 'node2'
CRS-2676: Start of 'ora.scan2.vip' on 'node2' succeeded
CRS-2677: Stop of 'ora.registry.acfs' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.LISTENER_SCAN2.lsnr' on 'node2'
CRS-2677: Stop of 'ora.gns' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.gns.vip' on 'node1'
CRS-2677: Stop of 'ora.gns.vip' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.gns.vip' on 'node2'
CRS-2676: Start of 'ora.node1.vip' on 'node2' succeeded
CRS-2676: Start of 'ora.gns.vip' on 'node2' succeeded
CRS-2672: Attempting to start 'ora.gns' on 'node2'
CRS-2676: Start of 'ora.LISTENER_SCAN3.lsnr' on 'node2' succeeded
CRS-2676: Start of 'ora.LISTENER_SCAN2.lsnr' on 'node2' succeeded
CRS-2677: Stop of 'ora.racdb.db' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.DATA1.dg' on 'node1'
CRS-2673: Attempting to stop 'ora.DATA2.dg' on 'node1'
CRS-2673: Attempting to stop 'ora.FRA.dg' on 'node1'
CRS-2676: Start of 'ora.gns' on 'node2' succeeded
CRS-2677: Stop of 'ora.DATA1.dg' on 'node1' succeeded
CRS-2677: Stop of 'ora.DATA2.dg' on 'node1' succeeded
CRS-2677: Stop of 'ora.oc4j' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.oc4j' on 'node2'
CRS-2676: Start of 'ora.oc4j' on 'node2' succeeded
CRS-2677: Stop of 'ora.DATA.dg' on 'node1' succeeded
CRS-2677: Stop of 'ora.FRA.dg' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'node1'
CRS-2677: Stop of 'ora.asm' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.ons' on 'node1'
CRS-2677: Stop of 'ora.ons' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'node1'
CRS-2677: Stop of 'ora.net1.network' on 'node1' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'node1' has completed
CRS-2677: Stop of 'ora.crsd' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.mdnsd' on 'node1'
CRS-2673: Attempting to stop 'ora.crf' on 'node1'
CRS-2673: Attempting to stop 'ora.ctssd' on 'node1'
CRS-2673: Attempting to stop 'ora.evmd' on 'node1'
CRS-2673: Attempting to stop 'ora.asm' on 'node1'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'node1'
CRS-2677: Stop of 'ora.crf' on 'node1' succeeded
CRS-2677: Stop of 'ora.asm' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'node1'
CRS-2677: Stop of 'ora.evmd' on 'node1' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'node1' succeeded
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'node1' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'node1'
CRS-2677: Stop of 'ora.drivers.acfs' on 'node1' succeeded
CRS-2677: Stop of 'ora.cssd' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'node1'
CRS-2673: Attempting to stop 'ora.diskmon' on 'node1'
CRS-2677: Stop of 'ora.gipcd' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'node1'
CRS-2677: Stop of 'ora.gpnpd' on 'node1' succeeded
CRS-2677: Stop of 'ora.diskmon' on 'node1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'node1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@node1 grid]#
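(A quick way to confirm that the stack is really down on node1 is crsctl check crs; with Grid Infrastructure stopped it should report that it could not contact Oracle High Availability Services, instead of the usual "online" messages.)

[root@node1 grid]# crsctl check crs     # expect a "Could not contact Oracle High Availability Services" message now that the stack is down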
So, all the Grid Infrastructure services are down on node1. I will run an OCR Backup from node2 and verify its location.
[root@node2 grid]# ocrconfig -manualbackup

node2     2014/08/03 21:49:02     /u01/app/grid/11.2.0/cdata/rac/backup_20140803_214902.ocr
node1     2014/08/03 21:37:17     /u01/app/grid/11.2.0/cdata/rac/backup_20140803_213717.ocr
node1     2014/07/06 22:39:55     /u01/app/grid/11.2.0/cdata/rac/backup_20140706_223955.ocr
node1     2014/07/05 17:30:25     /u01/app/grid/11.2.0/cdata/rac/backup_20140705_173025.ocr
node1     2014/06/16 22:15:07     /u01/app/grid/11.2.0/cdata/rac/backup_20140616_221507.ocr
[root@node2 grid]# ls -l /u01/app/grid/11.2.0/cdata/rac/backup*
-rw------- 1 root root 8024064 Aug 3 21:49 /u01/app/grid/11.2.0/cdata/rac/backup_20140803_214902.ocr
[root@node2 grid]#
Yes, the backup got created on node2 now.
Question : Would there have been a way to create a backup on node2 without shutting down node1 ?
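(Two possibilities that come to mind, sketched here without claiming they are equivalent to a -manualbackup : a logical export of the OCR with ocrconfig -export, which can be run as root from any node while the cluster is up and is written locally on that node ; or redirecting the backup location itself to shared storage with ocrconfig -backuploc, so that backups taken by the master node are visible to all nodes. The paths below are purely illustrative.)

[root@node2 grid]# ocrconfig -export /tmp/ocr_node2.exp       # logical export of the OCR, written on node2
[root@node2 grid]# ls -l /tmp/ocr_node2.exp
[root@node2 grid]# ocrconfig -backuploc /shared/ocrbackup     # point OCR backups at a directory on shared storage (must already exist)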
.
.
.
3 comments:
Hemant,
This is a great post. On several occasions while teaching customers about RAC, specifically GI, I'm asked how the backups for the OCR are set up. Once I explain how it works, the next question is usually about the master node.
Your post shows how a new "master" node is automatically configured when the current one is unavailable - something I will point my future students to for sure!
Cheers,
Maaz
ocrconfig -export ?
or evict master node manually and rejoin again ?
-Nitin
Great post .. Thank you
Krishna