There has been a
recent forums thread on what happens in terms of recovery when a database instance is restarted after a SHUTDOWN ABORT.
It is clear that "crash recovery" has to be completed before the Instance can be OPENed for use. Thus, users cannot login, queries cannot be started and DML / DDL cannot be executed until after the Roll-forward of Redo has been completed (as the OPEN is executed only after re-applying all Redo).
This is an example of "crash recovery" :
Sun Jan 09 00:50:23 2011
ALTER DATABASE OPEN
Beginning crash recovery of 1 threads
Started redo scan
Completed redo scan
read 80458 KB redo, 1721 data blocks need recovery
Started redo application at
Thread 1: logseq 490, block 52626
Recovery of Online Redo Log: Thread 1 Group 1 Seq 490 Reading mem 0
Mem# 0: /home/oracle/app/oracle/oradata/orcl/redo01.log
Recovery of Online Redo Log: Thread 1 Group 2 Seq 491 Reading mem 0
Mem# 0: /home/oracle/app/oracle/oradata/orcl/redo02.log
Recovery of Online Redo Log: Thread 1 Group 3 Seq 492 Reading mem 0
Mem# 0: /home/oracle/app/oracle/oradata/orcl/redo03.log
Completed redo application of 15.26MB
Completed crash recovery at
Thread 1: logseq 492, block 31015, scn 6082434
1721 data blocks read, 1721 data blocks written, 80458 redo k-bytes read
Sun Jan 09 00:50:27 2011
LGWR: STARTING ARCH PROCESSES
Sun Jan 09 00:50:27 2011
ARC0 started with pid=20, OS id=4173
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Thread 1 advanced to log sequence 493 (thread open)
Thread 1 opened at log sequence 493
Current log# 1 seq# 493 mem# 0: /home/oracle/app/oracle/oradata/orcl/redo01.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sun Jan 09 00:50:28 2011
SMON: enabling cache recovery
Sun Jan 09 00:50:28 2011
ARC2 started with pid=22, OS id=4177
Sun Jan 09 00:50:28 2011
ARC1 started with pid=21, OS id=4175
ARC1: Archival started
ARC2: Archival started
ARC2: Becoming the 'no FAL' ARCH
ARC2: Becoming the 'no SRL' ARCH
ARC1: Becoming the heartbeat ARCH
Sun Jan 09 00:50:29 2011
ARC3 started with pid=23, OS id=4179
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Archived Log entry 319 added for thread 1 sequence 492 ID 0x49469b3f dest 1:
Successfully onlined Undo Tablespace 2.
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is AL32UTF8
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
Sun Jan 09 00:50:34 2011
Starting background process QMNC
Sun Jan 09 00:50:34 2011
QMNC started with pid=25, OS id=4184
Completed: ALTER DATABASE OPEN
However, "crash recovery" does NOT complete "Transaction Recovery". Notice the line "SMON: enabling tx recovery" in the alert.log. This is the point when SMON starts beginning to
Rollback any transactions that had not completed or not committed (or had not rolled back) before the SHUTDOWN ABORT.
Let me re-iterate : "crash recovery" is only the application of redo from redo logs to datafiles -- which include undo tablespace files. Redo Logs do not necessarily only have committed transaction. Uncommitted transactions also get written to Redo (e.g. when DBWR writes a "dirty buffer" to disk, the Redo for it has to be written to the Redo Log files before the datafile is updated --- irrespective of whether the "dirty buffer" represents a committed or uncommited update).
Therefore, Roll-forward only brings the database to (almost) the same state it was when the SHUTDOWN ABORT was issued --- including updates that had not committed.
Since Oracle's job is to guarantee database consistency and transaction consistency, any uncommitted transactions must be rolled-back. It cannot allow users to see inconsistent data when the database is OPENed. (Imagine that the incomplete transaction had managed to update a DEBIT account and this had got written to disk but the CREDIT did not go to disk --- users querying the data after the restart cannot be allowed to see only the DEBIT , the transaction must be rolled-back, the DEBIT must be undone, because the CREDIT has not been done).
This process of Rollback is "Transaction Recovery" which you see as "tx recovery" in the alert.log. In earlier version (upto some V7 version, I can't recall which exactly), the Database Instance would not OPEN until Transaction Recovery was completed. When Oracle introduced deferred transaction recovery (deferred rollback), it allowed the database instance to OPEN but still guaranteed that uncommitted data would not be seen. A user process that accesses a modified block "sees" that the ITL entry points to a transaction and, from the corresponding Undo entry (identified from the specific slot in the specific Undo segment), also "sees" that the transaction was not committed. A "rollback" has to be effected. In this case, it is the user's Server Process that actually executes the rollback.
(NOTE : In the case where neither portion of the incomplete / uncommitted transaction got written to the database files, the user process does not even "see" that the data had been modified -- any modifications in memory were lost when the database instance was SHUTDOWN ABORT. So our case of a user's Server Process having to rollback a transaction that was issued (but not completed or committed) before the SHUTDOWN ABORT happens only if the dirty buffers had been written to disk.)
Here I show a very large transaction :
SQL> select sn.name, ss.value from v$statname sn, v$mystat ss
2 where sn.statistic#=ss.statistic#
3 and sn.name in ('redo size','undo change vector size')
4 /
NAME VALUE
---------------------------------------------------------------- ----------
redo size 784
undo change vector size 136
SQL> update store_list_new set country='SYS_UPD' where country='SYS';
11100960 rows updated.
SQL> select sn.name, ss.value from v$statname sn, v$mystat ss
2 where sn.statistic#=ss.statistic#
3 and sn.name in ('redo size','undo change vector size')
4 /
NAME VALUE
---------------------------------------------------------------- ----------
redo size 5220534820
undo change vector size 1778390016
SQL>
SQL>
that is executed and completed but not commited.
(for the purpose of this demo, I allow the UPDATE to complete, however, it doesn't matter if the UPDATE hadn't even completed when the SHUTDOWN ABORT was issued --- in either case, the Rollback is after the DATABASE OPEN).
A SHUTDOWN ABORT, followed by a STARTUP is issued from a different session :
[oracle@localhost ~]$ sqlplus '/ as sysdba'
SQL*Plus: Release 11.1.0.7.0 - Production on Sun Jan 9 00:48:24 2011
Copyright (c) 1982, 2008, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> shutdown abort;
ORACLE instance shut down.
SQL>
SQL> startup ;
ORACLE instance started.
Total System Global Area 313860096 bytes
Fixed Size 1336232 bytes
Variable Size 272632920 bytes
Database Buffers 33554432 bytes
Redo Buffers 6336512 bytes
Database mounted.
Database opened.
SQL>
with these messages from the alert.log :
Sun Jan 09 00:48:30 2011
Shutting down instance (abort)
License high water mark = 7
USER (ospid: 4076): terminating the instance
Instance terminated by USER, pid = 4076
Sun Jan 09 00:48:31 2011
Instance shutdown complete
Sun Jan 09 00:49:29 2011
Starting ORACLE instance (normal)
The "crash recovery" appears as :
Sun Jan 09 00:50:18 2011
ALTER DATABASE MOUNT
Successful mount of redo thread 1, with mount id 1267731018
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: ALTER DATABASE MOUNT
Sun Jan 09 00:50:23 2011
ALTER DATABASE OPEN
Beginning crash recovery of 1 threads
Started redo scan
Completed redo scan
read 80458 KB redo, 1721 data blocks need recovery
Started redo application at
Thread 1: logseq 490, block 52626
Recovery of Online Redo Log: Thread 1 Group 1 Seq 490 Reading mem 0
Mem# 0: /home/oracle/app/oracle/oradata/orcl/redo01.log
Recovery of Online Redo Log: Thread 1 Group 2 Seq 491 Reading mem 0
Mem# 0: /home/oracle/app/oracle/oradata/orcl/redo02.log
Recovery of Online Redo Log: Thread 1 Group 3 Seq 492 Reading mem 0
Mem# 0: /home/oracle/app/oracle/oradata/orcl/redo03.log
Completed redo application of 15.26MB
Completed crash recovery at
Thread 1: logseq 492, block 31015, scn 6082434
1721 data blocks read, 1721 data blocks written, 80458 redo k-bytes read
(the rest of the messages, upto the DATABASE OPEN have been presented earlier in this blog post).
But what happens when the user retries his UPDATE after the database instance is OPEN ?
oracle@localhost ~]$ sqlplus
SQL*Plus: Release 11.1.0.7.0 - Production on Sun Jan 9 00:51:12 2011
Copyright (c) 1982, 2008, Oracle. All rights reserved.
Enter user-name: hemant/hemant
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> update store_list_new set country='SYS_UPD' where country='SYS';
In the normal course, the entire Rollback would be done by SMON. However, what happens now is that this UPDATE statement, as it reads the table's datablocks, sees (from the ITL entries in the block header) the rows it is interested as having been updated, reads the undo slots for the transaction and identifies that the previous transaction had not committed. It has to effect a rollback of the previous transaction before it can update the same rows again !
When I trace this UPDATE run, I see a number of wait entries like :
=====================
PARSING IN CURSOR #5 len=63 dep=0 uid=184 oct=6 lid=184 tim=1294563155660560 hv=3915004330 ad='2cf70638' sqlid='ff3g2nznpnfda'
update store_list_new set country='SYS_UPD' where country='SYS'
END OF STMT
PARSE #5:c=486927,e=13830396,p=397,cr=10544,cu=1,mis=1,r=0,dep=0,og=1,plh=2037152,tim=1294563155660554
WAIT #5: nam='db file scattered read' ela= 28880 file#=15 block#=171843 blocks=5 obj#=85765 tim=1294563155690619
WAIT #5: nam='db file sequential read' ela= 874 file#=3 block#=145603 blocks=1 obj#=0 tim=1294563155691915
WAIT #5: nam='db file sequential read' ela= 437 file#=3 block#=145602 blocks=1 obj#=0 tim=1294563155692708
WAIT #5: nam='db file sequential read' ela= 29147 file#=3 block#=245 blocks=1 obj#=0 tim=1294563155722107
WAIT #5: nam='db file sequential read' ela= 15050 file#=3 block#=244 blocks=1 obj#=0 tim=1294563155739114
*** 2011-01-09 00:52:37.181
WAIT #5: nam='enq: TX - row lock contention' ela= 1384462 name|mode=1415053318 usn 16 slot="524303" sequence="3878" tim="1294563157180358"
WAIT #5: nam='db file sequential read' ela= 220 file#=3 block#=145603 blocks=1 obj#=0 tim=1294563157193428
WAIT #5: nam='db file sequential read' ela= 153 file#=3 block#=145602 blocks=1 obj#=0 tim=1294563157195194
WAIT #5: nam='db file sequential read' ela= 117 file#=3 block#=245 blocks=1 obj#=0 tim=1294563157196317
File_ID 3 is the datafile of my UNDOTBS1 Tablepace, that is the active UNDO_TABLESPACE of the Database Instance. The Server Process of the user session executing the UPDATE is reading from the Undo segments.
(Why don't I see read waits on the STORE_LIST_NEW table itself ? A large number of buffers of the table have already been read into the Buffer Cache by SMON and also when doing dynamic sampling to parse the UPDATE statement, although some blocks still have to be read from disk and do appear as waits later in the trace file).
Recall that SMON does execute a Rollback. That is why the Server Process occassionally encounter waits like these :
WAIT #5: nam='enq: TX - row lock contention'
In fact, while the UPDATE was running, when I checked for the session's waits, I found :
SQL> l
1 select blocking_session, event, count(*) from v$active_session_history where session_id=46
2 and (blocking_session is not null and event is not null)
3* group by blocking_session, event
SQL> /
BLOCKING_SESSION EVENT COUNT(*)
---------------- ---------------------------------------------------------------- ----------
34 latch: enqueue hash chains 1
11 log file switch completion 1
10 free buffer waits 1
13 enq: TX - row lock contention 1014
SQL>
SID 13 is the SMON session.
Thus, both SMON and the Server Process for the user issuing the UPDATE are attempting to Rollback entries for the target table.
If you want to monitor the Rollback being done by SMON you can see this :
SQL> select * from v$fast_start_transactions ;
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL PID CPUTIME PARENTUSN PARENTSLT
---------- ---------- ---------- ---------------- -------------- --------------- ---------- ---------- ---------- ----------
PARENTSEQ XID PXID RCVSERVERS
---------- ---------------- ---------------- ----------
8 15 3878 RECOVERING 18315 135898 24 8809 0 0
0 08000F00260F0000 0000000000000000 1
SQL>
While the Transaction is being Recovered,the STATE is "RECOVERING". UNDOBLOCKSTOTAL is the total number of Undo Blocks that have to be processed to complete the Rollback. UNDOBLOCKSDONE is the number of Undo Blocks that have been completed as of the time the V$FAST_START_TRANSACTIONS view is queried.
Later, as I query the Statistics for the UPDATE session I see :
data blocks consistent reads - undo records applied 2,968,576
This statistic shows how many undo records that the Server Process itself has applied.
I killed the UPDATE after it had run for a long time but did get a tkprof of the partial run :
SQL ID: ff3g2nznpnfda
Plan Hash: 2037152
update store_list_new set country='SYS_UPD'
where
country='SYS'
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.02 0.10 2 2 0 0
Execute 1 109.34 10149.31 85082 3235208 5549965 0
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 109.37 10149.42 85084 3235210 5549965 0
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 184
Rows Row Source Operation
------- ---------------------------------------------------
0 UPDATE STORE_LIST_NEW (cr=0 pr=0 pw=0 time=0 us)
2252371 TABLE ACCESS FULL STORE_LIST_NEW (cr=3212990 pr=54987 pw=0 time=756482560 us cost=34043 size=186285490 card=10957970)
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
Disk file operations I/O 1 0.00 0.00
db file sequential read 74565 1.00 311.19
db file scattered read 2781 0.50 11.11
enq: TX - row lock contention 1559 8656.42 9868.94
rdbms ipc reply 25 0.00 0.02
reliable message 21 0.00 0.03
log file switch completion 26 0.27 1.80
latch: redo allocation 5 0.02 0.03
latch: cache buffers lru chain 4 0.02 0.05
latch: enqueue hash chains 3 0.14 0.14
latch free 1 0.00 0.00
enq: TX - contention 3 0.47 1.11
latch: object queue header operation 2 0.00 0.00
latch: row cache objects 4 0.05 0.10
buffer busy waits 3 0.00 0.00
log file switch (checkpoint incomplete) 4 0.05 0.12
********************************************************************************
NOTE : *Ignore* the elapsed time and wait time. It, unfortunately, includes time spent when the database was "frozen" because the FRA, holding ArchiveLogs, was full.
You can see the very high count of "current gets" and the 'db file sequential read' waits, most of which (about 67thousand, see below) were on reading from the Undo tablespace. Also note the high number of 'enq: TX - row lock contention' waits. This was the only transaction in the database. That 'row lock contention' was with SMON doing a rollback against the same rows !
[oracle@localhost trace]$ cat orcl_ora_4234.trc |grep 'file#=3' |wc -l
67046
[oracle@localhost trace]$
Thus, as you can see, "crash recovery" does not mean that the entire database is consistent. It only means that the database is returned to the same state ("partially dirty", except for uncommitted updates that never even got written to disk) as it was when the SHUTDOWN ABORT was issued.
Complete consistency is achieved only when transactions that were incomplete / uncommitted are Rolled-back -- by SMON and/or by Server Process's for User sessions that need to read the same rows.
.
.
.