15 April, 2011

Standby Databases (aka "DataGuard") -1

(before I return to my series on Deterministic functions ........... )


UPDATE March 2017:  See my new series on SingleInstance DataGuard for RAC, beginning here.

This week I have had two sets of questions about standby databases -- what a coincidence ! The last time I really worked with a Standby was in 8.1.7 -- without the DataGuard "backport" from 9i. (and, before that, in 7.0). All of a sudden, I have had questions about 10g and 11 DG.

Anyways, since I had started "exploring" 11.2 DG (skipping over 9i and 10g) a few days ago, and had setup a DG environment in my VM, here are some entries from the Standby's alert.log :
Fri Apr 15 22:32:52 2011
RFS[12]: Assigned to RFS process 3778
RFS[12]: Identified database type as 'physical standby': Client is LGWR SYNC pid 2576
Primary database is in MAXIMUM AVAILABILITY mode
Standby controlfile consistent with primary
Standby controlfile consistent with primary
RFS[12]: No standby redo logfiles selected (reason:7)
Errors in file /home/oracle/app/oracle/diag/rdbms/stdby/stdby/trace/stdby_rfs_3778.trc:
ORA-16086: Redo data cannot be written to the standby redo log
Fri Apr 15 22:33:14 2011
RFS[13]: Assigned to RFS process 3790
RFS[13]: Identified database type as 'physical standby': Client is ARCH pid 2658
Fri Apr 15 22:33:34 2011
RFS[14]: Assigned to RFS process 3792
RFS[14]: Identified database type as 'physical standby': Client is ARCH pid 2645
RFS[14]: Opened log for thread 1 sequence 30 dbid 159971472 branch 748133139
Archived Log entry 24 added for thread 1 sequence 30 rlc 748133139 ID 0x988c490 dest 2:
Fri Apr 15 22:33:54 2011
RFS[15]: Assigned to RFS process 3794
RFS[15]: Identified database type as 'physical standby': Client is ARCH pid 2658
Fri Apr 15 22:34:20 2011
RFS[16]: Assigned to RFS process 3806
RFS[16]: Identified database type as 'physical standby': Client is LGWR SYNC pid 2576
Primary database is in MAXIMUM AVAILABILITY mode
Standby controlfile consistent with primary
Standby controlfile consistent with primary
RFS[16]: No standby redo logfiles selected (reason:7)
Errors in file /home/oracle/app/oracle/diag/rdbms/stdby/stdby/trace/stdby_rfs_3806.trc:
ORA-16086: Redo data cannot be written to the standby redo log
Fri Apr 15 22:34:41 2011
RFS[17]: Assigned to RFS process 3810
RFS[17]: Identified database type as 'physical standby': Client is ARCH pid 2658 

For the present, I am ignoring the ORA-16086 "errors" -- because I haven't followed the "advice" to create Standby Redo Logs (why make things too complicated, eh ?).

Here's what I have on my Primary :
SQL> select dest_id, max(sequence#) from v$archived_log group by dest_id;

   DEST_ID MAX(SEQUENCE#)
---------- --------------
         1             32
         2             32

SQL>
SQL> select * from v$archive_gap;

no rows selected

SQL>   

That looks fine to me. Right ? No Gap.

I am also ignoring these errors on my Standby (heh ! this is a "play and learn" environment !)
Fatal NI connect error 12537, connecting to:
 (LOCAL=NO)

  VERSION INFORMATION:
        TNS for Linux: Version 11.2.0.1.0 - Production
        Oracle Bequeath NT Protocol Adapter for Linux: Version 11.2.0.1.0 - Production
        TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.1.0 - Production
  Time: 15-APR-2011 22:39:14
  Tracing not turned on.
  Tns error struct:
    ns main err code: 12537
    
TNS-12537: TNS:connection closed
    ns secondary err code: 12560
    nt main err code: 0
    nt secondary err code: 0
    nt OS err code: 0
opiodr aborting process unknown ospid (3905) as a result of ORA-609 



What is the point of this blog post ? ......... All in due course.

.
.
.
.
.
.
.



Next, I shutdown my primary :
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> 


While the primary shutdown is proceeding, it "flags" these errors :
Fri Apr 15 22:39:35 2011
ARC3: Archival disabled due to shutdown: 1014
Shutting down archive processes
Archiving is disabled
Archive process shutdown avoided: 0 active
Errors in file /home/oracle/app/oracle/diag/rdbms/prod/prod/trace/prod_arc3_2660.trc:
ORA-01014: ORACLE shutdown in progress
Errors in file /home/oracle/app/oracle/diag/rdbms/prod/prod/trace/prod_arc3_2660.trc:
ORA-01014: ORACLE shutdown in progress
Errors in file /home/oracle/app/oracle/diag/rdbms/prod/prod/trace/prod_arc3_2660.trc:
ORA-01014: ORACLE shutdown in progress
Errors in file /home/oracle/app/oracle/diag/rdbms/prod/prod/trace/prod_arc3_2660.trc:
ORA-01014: ORACLE shutdown in progress
Errors in file /home/oracle/app/oracle/diag/rdbms/prod/prod/trace/prod_arc3_2660.trc:
ORA-01014: ORACLE shutdown in progress
Errors in file /home/oracle/app/oracle/diag/rdbms/prod/prod/trace/prod_arc3_2660.trc:
ORA-01014: ORACLE shutdown in progress
Master background archival failure: 1014
ARCH shutting down
ARC3: Archival stopped 


I then re-start the primary and, in due course, see these messages in the primary alert.log :
Fri Apr 15 22:44:05 2011
ARC0 started with pid=21, OS id=4137 
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
LGWR: Primary database is in MAXIMUM AVAILABILITY mode
LGWR: Destination LOG_ARCHIVE_DEST_1 is not serviced by LGWR
Fri Apr 15 22:44:06 2011
ARC2 started with pid=20, OS id=4150 
Fri Apr 15 22:44:06 2011
ARC3 started with pid=23, OS id=4152 
Fri Apr 15 22:44:06 2011
ARC1 started with pid=22, OS id=4148 
Fri Apr 15 22:44:07 2011
NSS2 started with pid=24, OS id=4156 
ARC1: Archival started
ARC2: Archival started
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
ARC1: Becoming the heartbeat ARCH 
Fri Apr 15 22:44:36 2011
ORA-16198: LGWR received timedout error from KSR
Errors in file /home/oracle/app/oracle/diag/rdbms/prod/prod/trace/prod_lgwr_4066.trc:
ORA-16198: Timeout incurred on internal channel during remote archival
LGWR: Error 16198 verifying archivelog destination LOG_ARCHIVE_DEST_2
Destination LOG_ARCHIVE_DEST_2 is UNSYNCHRONIZED
LGWR: Continuing...
Thread 1 advanced to log sequence 36 (thread open)
Thread 1 opened at log sequence 36
  Current log# 3 seq# 36 mem# 0: /addtl/oracle/oradata/PROD/onlinelog/o1_mf_3_6t3jnv7s_.log
  Current log# 3 seq# 36 mem# 1: /addtl/oracle/flash_recovery_area/PROD/onlinelog/o1_mf_3_6t3jnxcg_.log
Successful open of redo thread 1 


I then :
SQL> alter system switch logfile;

System altered.

SQL>  

and see :
Fri Apr 15 22:45:38 2011
NSS2 started with pid=29, OS id=4225 
Fri Apr 15 22:46:07 2011
ORA-16198: LGWR received timedout error from KSR
Errors in file /home/oracle/app/oracle/diag/rdbms/prod/prod/trace/prod_lgwr_4066.trc:
ORA-16198: Timeout incurred on internal channel during remote archival
Error 16198 for archive log file 1 to 'stdby'
LGWR: Failed to archive log 1 thread 1 sequence 37 (16198)
Thread 1 advanced to log sequence 37 (LGWR switch)
  Current log# 1 seq# 37 mem# 0: /addtl/oracle/oradata/PROD/onlinelog/o1_mf_1_6t3jnm57_.log
  Current log# 1 seq# 37 mem# 1: /addtl/oracle/flash_recovery_area/PROD/onlinelog/o1_mf_1_6t3jnn4d_.log
Fri Apr 15 22:46:13 2011
Archived Log entry 56 added for thread 1 sequence 36 ID 0x988c490 dest 1: 


Soon, I will learn how to perform "maintenance" on the primary with the standby shutdown.

.
.

Returning to this thread ....
I perform recovery on the standby :
SQL> shutdown
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount
ORACLE instance started.

Total System Global Area  836976640 bytes
Fixed Size                  1339740 bytes
Variable Size             494931620 bytes
Database Buffers          335544320 bytes
Redo Buffers                5160960 bytes
Database mounted.
SQL> !pwd
/home/oracle/app/oracle/diag/rdbms/stdby/stdby/trace

SQL> recover automatic standby database;
ORA-00279: change 857509 generated at 04/15/2011 22:47:47 needed for thread 1
ORA-00289: suggestion :
/addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_15/o1_mf_1_38_%u_.arc
ORA-00280: change 857509 for thread 1 is in sequence #38
ORA-00278: log file
'/addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_15/o1_mf_1_38_%u_.ar
c' no longer needed for this recovery
ORA-00308: cannot open archived log
'/addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_15/o1_mf_1_38_%u_.ar
c'
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory
Additional information: 3


Specify log: {=suggested | filename | AUTO | CANCEL} 


and monitor it from the stdby alert.log :
Fri Apr 15 22:52:11 2011
ALTER DATABASE RECOVER  automatic standby database  
Media Recovery Start
Serial Media Recovery started
Managed Standby Recovery not using Real Time Apply
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_11/o1_mf_1_10_6t66l9o7_.arc
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_11/o1_mf_1_11_6t66kp40_.arc
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_11/o1_mf_1_12_6t66nrbs_.arc
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_11/o1_mf_1_13_6t66htwv_.arc
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_11/o1_mf_1_14_6t66pmn5_.arc
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_11/o1_mf_1_15_6t66odb9_.arc
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_11/o1_mf_1_16_6t66tj5m_.arc
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_11/o1_mf_1_17_6t66svwc_.arc
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_11/o1_mf_1_18_6t678fyd_.arc
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_11/o1_mf_1_19_6t67cbgg_.arc
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_11/o1_mf_1_20_6t67bq2d_.arc
Fri Apr 15 22:52:21 2011
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_11/o1_mf_1_21_6t67ftgj_.arc
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_11/o1_mf_1_22_6t687w1g_.arc
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_11/o1_mf_1_23_6t687w19_.arc
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_11/o1_mf_1_24_6t688qdf_.arc
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_11/o1_mf_1_25_6t689byn_.arc
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_15/o1_mf_1_26_6tjol0jr_.arc
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_15/o1_mf_1_27_6tjokddr_.arc
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_15/o1_mf_1_28_6tjonl8w_.arc
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_15/o1_mf_1_29_6tjoq9v5_.arc
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_15/o1_mf_1_30_6tjosydp_.arc
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_15/o1_mf_1_31_6tjowowv_.arc
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_15/o1_mf_1_32_6tjoyvg2_.arc
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_15/o1_mf_1_33_6tjp21o3_.arc
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_15/o1_mf_1_34_6tjpppjj_.arc
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_15/o1_mf_1_35_6tjpopc6_.arc
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_15/o1_mf_1_36_6tjpqch2_.arc
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_15/o1_mf_1_37_6tjprmmt_.arc
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_15/o1_mf_1_38_%u_.arc
Errors with log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_15/o1_mf_1_38_%u_.arc
ORA-279 signalled during: ALTER DATABASE RECOVER  automatic standby database  ... 


So far so good .....

.
.
.
Updates to this "voyage of discovery" to follow ....

.
.
UPDATE : Adding a Standby Redo Log on the Standby :
SQL> recover managed standby database disconnect from session;
Media recovery complete.
SQL> recover managed standby database cancel;
Media recovery complete.
SQL> select group#, status, bytes/1048576 from v$log;

    GROUP# STATUS           BYTES/1048576
---------- ---------------- -------------
         1 CLEARING_CURRENT            50
         3 CLEARING                    50
         2 CLEARING                    50

SQL> select group#, status, bytes/1048576 from v$standby_log;

no rows selected

SQL> alter database add standby logfile size 50M;

Database altered.

SQL> select group#, status, bytes/1048576 from v$standby_log;

    GROUP# STATUS     BYTES/1048576
---------- ---------- -------------
         4 UNASSIGNED            50

SQL> recover managed standby database disconnect from session;
Media recovery complete.
SQL> 


following which the messages are :
Sat Apr 16 13:16:16 2011
ALTER DATABASE RECOVER  managed standby database disconnect from session  
Attempt to start background Managed Standby Recovery process (stdby)
Sat Apr 16 13:16:16 2011
MRP0 started with pid=27, OS id=4392 
MRP0: Background Managed Standby Recovery process started (stdby)
Serial Media Recovery started
Managed Standby Recovery not using Real Time Apply
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_16/o1_mf_1_74_6tl9j689_.arc
Media Recovery Waiting for thread 1 sequence 75
Completed: ALTER DATABASE RECOVER  managed standby database disconnect from session  
Sat Apr 16 13:16:34 2011
ALTER DATABASE RECOVER  managed standby database cancel  
Sat Apr 16 13:16:35 2011
MRP0: Background Media Recovery cancelled with status 16037
Errors in file /home/oracle/app/oracle/diag/rdbms/stdby/stdby/trace/stdby_mrp0_4392.trc:
ORA-16037: user requested cancel of managed recovery operation
Recovery interrupted!
Errors in file /home/oracle/app/oracle/diag/rdbms/stdby/stdby/trace/stdby_mrp0_4392.trc:
ORA-16037: user requested cancel of managed recovery operation
MRP0: Background Media Recovery process shutdown (stdby)
Waiting for MRP0 pid 4392 to terminate
Managed Standby Recovery Canceled (stdby)
Completed: ALTER DATABASE RECOVER  managed standby database cancel  
Sat Apr 16 13:16:40 2011
RFS[94]: Assigned to RFS process 4394
RFS[94]: Identified database type as 'physical standby': Client is LGWR SYNC pid 3274
Primary database is in MAXIMUM AVAILABILITY mode
Standby controlfile consistent with primary
Standby controlfile consistent with primary
RFS[94]: No standby redo logfiles selected (reason:7)
Errors in file /home/oracle/app/oracle/diag/rdbms/stdby/stdby/trace/stdby_rfs_4394.trc:
ORA-16086: Redo data cannot be written to the standby redo log
Sat Apr 16 13:17:02 2011
RFS[95]: Assigned to RFS process 4415
RFS[95]: Identified database type as 'physical standby': Client is ARCH pid 3349
Sat Apr 16 13:17:22 2011
RFS[96]: Assigned to RFS process 4417
RFS[96]: Identified database type as 'physical standby': Client is ARCH pid 3363
RFS[96]: Opened log for thread 1 sequence 75 dbid 159971472 branch 748133139
Archived Log entry 69 added for thread 1 sequence 75 rlc 748133139 ID 0x988c490 dest 2:
Sat Apr 16 13:17:39 2011
alter database add standby logfile size 50M
Completed: alter database add standby logfile size 50M
Sat Apr 16 13:17:42 2011
RFS[97]: Assigned to RFS process 4421
RFS[97]: Identified database type as 'physical standby': Client is ARCH pid 3349
Sat Apr 16 13:18:03 2011
ALTER DATABASE RECOVER  managed standby database disconnect from session  
Attempt to start background Managed Standby Recovery process (stdby)
Sat Apr 16 13:18:03 2011
MRP0 started with pid=28, OS id=4433 
MRP0: Background Managed Standby Recovery process started (stdby)
Sat Apr 16 13:18:08 2011
RFS[98]: Assigned to RFS process 4431
RFS[98]: Identified database type as 'physical standby': Client is LGWR SYNC pid 3274
Primary database is in MAXIMUM AVAILABILITY mode
Standby controlfile consistent with primary
Standby controlfile consistent with primary
RFS[98]: Selected log 4 for thread 1 sequence 77 dbid 159971472 branch 748133139
Serial Media Recovery started
Managed Standby Recovery not using Real Time Apply
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_16/o1_mf_1_75_6tl9m2fl_.arc
Media Recovery Waiting for thread 1 sequence 76
Completed: ALTER DATABASE RECOVER  managed standby database disconnect from session  
Sat Apr 16 13:18:29 2011
RFS[99]: Assigned to RFS process 4435
RFS[99]: Identified database type as 'physical standby': Client is ARCH pid 3363
RFS[99]: No standby redo logfiles of size 81 blocks available
RFS[99]: Opened log for thread 1 sequence 76 dbid 159971472 branch 748133139
Archived Log entry 70 added for thread 1 sequence 76 rlc 748133139 ID 0x988c490 dest 2:
Sat Apr 16 13:18:34 2011
Media Recovery Log /addtl/oracle/flash_recovery_area/STDBY/archivelog/2011_04_16/o1_mf_1_76_6tl9o5bb_.arc
Media Recovery Waiting for thread 1 sequence 77 (in transit) 



So, now my Standby Database does have a Standby Redo Log.
Note : I was able to work without a Standby Redo Log but I had almost no transactions generating Redo.

.
.
.

2 comments:

Aks said...

Hi,

I am facing RFS process issue. Not able to start the RFS process. Please assist.

Thanks.

Hemant K Chitale said...

For such an error, you should approach Oracle Support.