31 January, 2011

Gather Stats Concurrently

Finally ! Oracle has introduced a "CONCURRENT" preference in DBMS_STATS routines to allow a gather stats on multiple objects concurrently.
Thus a GATHER_SCHEMA_STATS call can be made to have Table Statistics gathered on multiple tables concurrently.
In earlier versions, the DEGREE (of parallelism) would only apply to large tables. Thus, in a schema of some thousands of tables, the GATHER_SCHEMA_STATS call would run through all the tables serially. Only at some large tables would it use Parallelism.

The introduction of the CONCURRENT preference allows the use of Parallel operators to work on multiple tables concurrently.

The Optimizer team has released a short write-up on this.
.
.
.

29 January, 2011

Synchronising Recovery of two databases

A recent forums question was about backup and recovery to be synchronised for two databases.

To reply, I used my knowledge from the "Distributed Databases" documentation in V7 and how we used to implement Advanced Replication.

When Oracle initiates a Distributed Transaction, it synchronises the SCNs for the databases that are part of the transaction. Essentially, a "Global SCN", which is the highest SCN amongst all those databases participating in the transaction, is selected and this is set in all the databases. Each database may subsequently follow it's own frequency of updating the SCN (i.e. based on the rate of transactions) until the next Distributed Transaction occurs, when the SCNs are again reset to the highest in the group.

Note : SCN synchronization occurs when
a. A connection occurs using a DBLink
b. A distributed SQL statement occurs (yes, even a SELECT !)
c. A distributed transaction commits

Even the V7 Administrator's Guide covered how important it was for Co-ordinated Recovery of Distributed Databases --- using SCNs.


Here's a short, simple demonstration of the synchronisation.
Lines in Italics are on database 'ORCL'.
Lines in BOLD are on database 'DUPDB'.

Initially, the SCN in 'DUPDB' was 6395626, far below the 6501471 in 'ORCL'.
However, as soon as I began distributed transactions, the SCNs were synchronized -- 'DUPDB' caught up with 'ORCL'.

Later when 'DUPDB's SCN had gone to a much higher 6501805 (because there were other "local" transactions in this database) and ran a distributed transaction to bring the SCN to 6502089, it was 'ORCL's turn to catch-up on the SCN


SQL> create database link DUPDB connect to hemant identified by dupdb using 'dupdb';

Database link created.

SQL>

SQL> create database link ORCL connect to hemant identified by orcl using 'ORCL';

Database link created.

SQL>

SQL> create table IN_ORCL (col_1 number, col_2 timestamp, col_3 varchar(30));

Table created.

SQL> select current_scn from v$database;

CURRENT_SCN
-----------
6501471

SQL>

SQL> create table IN_DUPDB (col_1 number, col_2 timestamp, col_3 varchar(30));

Table created.

SQL> select current_scn from v$database;

CURRENT_SCN
-----------
6395626

SQL>

SQL> insert into in_dupdb@dupdb values (1,systimestamp,'FROM ORCL');

1 row created.

SQL> select current_scn from v$database;

CURRENT_SCN
-----------
6501531

SQL>

SQL> select current_scn from v$database;

CURRENT_SCN
-----------
6501537

SQL>

SQL> select systimestamp, current_scn, name from v$database;

SYSTIMESTAMP
---------------------------------------------------------------------------
CURRENT_SCN NAME
----------- ---------
28-JAN-11 11.59.05.566584 PM +08:00
6501585 ORCL


SQL>

SQL> l
1* select systimestamp, current_scn, name from v$database
SQL> /

SYSTIMESTAMP
---------------------------------------------------------------------------
CURRENT_SCN NAME
----------- ---------
28-JAN-11 11.59.59.867294 PM +08:00
6501805 DUPDB


SQL> insert into in_orcl@orcl values (1,systimestamp,'FROM DUPDB');

1 row created.

SQL> select systimestamp, current_scn, name from v$database;

SYSTIMESTAMP
---------------------------------------------------------------------------
CURRENT_SCN NAME
----------- ---------
29-JAN-11 12.00.46.522963 AM +08:00
6502089 DUPDB


SQL>

SQL> select systimestamp, current_scn, name from v$database;

SYSTIMESTAMP
---------------------------------------------------------------------------
CURRENT_SCN NAME
----------- ---------
29-JAN-11 12.00.49.642064 AM +08:00
6502089 ORCL





Note : It does not matter if the database "initiating" a distributed transaction has a lower or a higher SCN than the other database. Oracle just selects the highest SCN (higher of the two in a 2-database transaction, higher of the 'n' in an n-database transaction) and applies it to all the databases.

.
.
.

28 January, 2011

Transaction Failure --- when is the error returned ?

There was a question in the orafaq forums
"What happens when UNDO space is full? Transaction takes more time to ERROR?"
where my response was to the effect that the user sees the error message only after the rollback is completed.
Thus,
a) if a transaction began at 1pm
b) but, at 3pm, found that it had run out of space in the Undo tablespace (e.g. the datafile is fully occupied and does not have autoextend enable) and other undo extents / segments cannot be expired and reused
c) the user/client does not see the ORA-30036 error at 3pm but only after the (automatic) rollback of the transaction is completed -- which could be at 3:45pm.

Therefore, the time you see the error is not really the exact time when the error occurred. You are presented the error message only after the rollback is completed and the database is returned to a consistent state.

Here's a simple demonstration. I run two "large" transactions (and have a relatively very small Undo tablespace) and check the timestamps of the transactions and messages :

00:17:43 SQL>
00:17:43 SQL> select count(*) from sales_lines;

COUNT(*)
----------
100000

00:17:43 SQL>
00:17:43 SQL> drop table sales_lines_multi purge;

Table dropped.

00:17:44 SQL> create table sales_lines_multi as select * from sales_lines where 1=2;

Table created.

00:17:44 SQL>
00:17:44 SQL> select systimestamp from dual;

SYSTIMESTAMP
---------------------------------------------------------------------------
28-JAN-11 12.17.44.385819 AM +08:00

00:17:44 SQL> insert into sales_lines_multi
00:17:44 2 select * from sales_lines
00:17:44 3 union all
00:17:44 4 select * from sales_lines
00:17:44 5 union all
00:17:44 6 select * from sales_lines
00:17:44 7 union all
00:17:44 8 select * from sales_lines
00:17:44 9 /

400000 rows created.

00:17:47 SQL> select systimestamp from dual;

SYSTIMESTAMP
---------------------------------------------------------------------------
28-JAN-11 12.17.47.414854 AM +08:00

00:17:47 SQL>
00:17:47 SQL> commit;

Commit complete.

00:17:47 SQL> exec dbms_lock.sleep(10);

PL/SQL procedure successfully completed.

00:17:57 SQL>
00:17:57 SQL> select systimestamp from dual;

SYSTIMESTAMP
---------------------------------------------------------------------------
28-JAN-11 12.17.57.500389 AM +08:00

00:17:57 SQL> delete sales_lines_multi;
delete sales_lines_multi
*
ERROR at line 1:
ORA-30036: unable to extend segment by 8 in undo tablespace 'UNDO'


00:18:04 SQL> select systimestamp from dual;

SYSTIMESTAMP
---------------------------------------------------------------------------
28-JAN-11 12.18.04.386362 AM +08:00

00:18:04 SQL>
00:18:04 SQL> exec dbms_lock.sleep(10);

PL/SQL procedure successfully completed.

00:18:14 SQL>
00:18:14 SQL> select systimestamp from dual;

SYSTIMESTAMP
---------------------------------------------------------------------------
28-JAN-11 12.18.14.412227 AM +08:00

00:18:14 SQL> rollback;

Rollback complete.

00:18:14 SQL> select systimestamp from dual;

SYSTIMESTAMP
---------------------------------------------------------------------------
28-JAN-11 12.18.14.437548 AM +08:00

00:18:14 SQL>
00:18:14 SQL> select count(*) from sales_lines_multi;

COUNT(*)
----------
400000

00:18:14 SQL>

Thus, I had an INSERT transaction that ran from 12.17.44.385819 to 12.17.47.414854 -- for approximately 3seconds. This transaction completed successfully.

Next, I had a DELETE transaction that ran from 12.17.57.500389 and reported an ORA-30036 slightly before 12.18.04.386362. Therefore, it seems to have run for about 7seconds before it failed. However, did the session actually fail on undo space at 12.18.04 ?

To monitor this, I wrote and ran a procedure :

create or replace procedure log_transaction_proc
as
runcnt number;
cursor c1 is select rownum from dual where rownum < addr="s.taddr" username =" 'HEMANT';"

This procedure wrote the the "log_transaction" table approximately every 5centi-seconds if it found an active transaction by the user 'HEMANT'.
UPDATE : I notice that there's a bug in the procedure which causes it to run in an infinite loop. However, that doesn't invalidate the findings that this blog post identifies (the procedure does write to the log_transaction table !).


These were some of the rows in the log_transaction table :

T_STAMP USED_UREC USED_UBLK
--------------------------------------------------------------------------- ---------- ----------
28-JAN-11 12.17.43.900921 AM 41 3
28-JAN-11 12.17.44.365687 AM 48 1
28-JAN-11 12.17.44.477116 AM 66 2
28-JAN-11 12.17.44.532674 AM 175 4
28-JAN-11 12.17.44.586143 AM 344 8
................
28-JAN-11 12.17.47.280307 AM 8102 164
28-JAN-11 12.17.47.335687 AM 8156 165
28-JAN-11 12.17.47.388669 AM 8390 169
28-JAN-11 12.17.57.593152 AM 2133 52
28-JAN-11 12.17.57.647491 AM 3915 94

Note how it found NO transaction between 12.17.47.388669 and 12.17.57.593152. Essentially, the first transaction ended some time after 12.17.47.388669 but within 5centiseconds after that !
This first transaction was the INSERT of 400,000 rows that actually required only slightly over 8,390 undo records and 169 undo blocks. (The Undo for an INSERT is actually very small).

Next, I saw these entries for the DELETE :

T_STAMP USED_UREC USED_UBLK
--------------------------------------------------------------------------- ---------- ----------
28-JAN-11 12.17.57.593152 AM 2133 52
28-JAN-11 12.17.57.647491 AM 3915 94
28-JAN-11 12.17.57.702068 AM 5897 142
28-JAN-11 12.17.57.756098 AM 8273 198
28-JAN-11 12.17.57.810300 AM 10216 245
................
28-JAN-11 12.18.01.841294 AM 119177 2839
28-JAN-11 12.18.01.894567 AM 121401 2892
28-JAN-11 12.18.01.947189 AM 121523 2895
28-JAN-11 12.18.02.000468 AM 118341 2819
28-JAN-11 12.18.02.053156 AM 115310 2747
................
28-JAN-11 12.18.04.086817 AM 14787 353
28-JAN-11 12.18.04.143813 AM 11674 279
28-JAN-11 12.18.04.200892 AM 8378 201
28-JAN-11 12.18.04.259882 AM 5457 131
28-JAN-11 12.18.04.319280 AM 2189 53


The transaction kept growing only upto sometime within 5centiseconds after 12.18.01.947189 when it was 121,523 undo rows. (We cannot know how many table rows did get deleted at this point). However, the transaction started shrinking by 12.18.02.000468. Why did it start shrinking ? Because it had failed ! (here we discover later that the failure was ORA-30036). The transaction kept shrinking --- i.e. it kept undergoing the "automatic" rollback until sometime after 12.18.04.319280. We know from the SQL execution log that the ORA-30036 error was returned to the client just before 12.18.04.386362.

Therefore, the transaction actually went out of Undo space between 12.18.01.947189 and 12.18.02.000468. The user/client/application saw the error message only slightly before 12.18.04.386362 --- i.e. more than 2 seconds later. This difference was the time it took Oracle to actually rollback the failed transaction.

I have seen cases where the time difference between when a transaction actually ran out of undo space and when the error was reported to the user was 30minutes in some cases and more than an hour in more extreme cases.

Note : The ORA-30036 error does not get reported in the alert.log file.

.
.
.

15 January, 2011

GLOBAL TEMPORARY TABLEs and GATHER_TABLE_STATS


UPDATE NOTE :  These tests, published in January 2011 are for version 11.2.0.1   There are differences in 12.1 and above, so do not rely on these for 12.1 behaviour.

If you use a Global Temporary Table in your application code, watch where and how you run GATHER_TABLE_STATS on it.
(I prefer to NOT have statistics on a GTT and to delete statistics if present ; so that Oracle can do Dynamic Sampling against the table when using it in a query).

If your GTT has been defined as ON COMMIT DELETE ROWS, the GATHER_TABLE_STATS call will result in rows being deleted. This is because the GATHER_TABLE_STATS issues an implicit commit.
If your GTT has been defined as ON COMMIT PRESERVE ROWS, the GATHER_TABLE_STATS will not delete rows in the table.

Here is a demonstration with comments :

SQL> REM Demo impact of Gather Stats on GTTs
SQL>
SQL> drop table MY_GTT_DELETE ;
drop table MY_GTT_DELETE
*
ERROR at line 1:
ORA-00942: table or view does not exist


SQL> create global temporary table MY_GTT_DELETE (object_id number, object_name varchar2(30)) on commit DELETE rows;

Table created.

SQL> drop table MY_GTT_PRESERVE;
drop table MY_GTT_PRESERVE
*
ERROR at line 1:
ORA-00942: table or view does not exist


SQL> create global temporary table MY_GTT_PRESERVE (object_id number, object_name varchar2(30)) on commit PRESERVE rows;

Table created.

SQL> drop table MY_GTT_PARALLEL;
drop table MY_GTT_PARALLEL
*
ERROR at line 1:
ORA-00942: table or view does not exist


SQL> create global temporary table MY_GTT_PARALLEL (object_id number, object_name varchar2(30)) on commit PRESERVE rows;

Table created.

SQL>
SQL>
SQL> REM ###############################################################################################
SQL> REM -- first : a table that is defined as ON COMMIT DELETE ROWS
SQL> insert into MY_GTT_DELETE select object_id, object_name from dba_objects where object_id is not null;

76701 rows created.

SQL> -- verify that rows are present
SQL> select count(*) from MY_GTT_DELETE;

COUNT(*)
----------
76701

SQL> -- now gather stats
SQL> exec dbms_stats.gather_table_stats('','MY_GTT_DELETE');

PL/SQL procedure successfully completed.

SQL> -- look for the rows now
SQL> select count(*) from MY_GTT_DELETE;

COUNT(*)
----------
0

SQL> select num_rows from user_tables where table_name = 'MY_GTT_DELETE';

NUM_ROWS
----------
0

SQL> --- The table has no rows. Apparently the gather_table_stats issued a COMMIT
SQL> ---- This also means that any other DML that I had executed in my current session also got COMMITted !!
SQL>
SQL>
SQL> REM ###############################################################################################
SQL> REM -- second : a table that is defined as ON COMMIT PRESERVE ROWS
SQL> insert into MY_GTT_PRESERVE select object_id, object_name from dba_objects where object_id is not null;

76701 rows created.

SQL> -- verify that rows are present
SQL> select count(*) from MY_GTT_PRESERVE;

COUNT(*)
----------
76701

SQL> -- now gather stats
SQL> exec dbms_stats.gather_table_stats('','MY_GTT_PRESERVE');

PL/SQL procedure successfully completed.

SQL> -- look for the rows now
SQL> select count(*) from MY_GTT_PRESERVE;

COUNT(*)
----------
76701

SQL> select num_rows from user_tables where table_name = 'MY_GTT_PRESERVE';

NUM_ROWS
----------
76701

SQL> --- The table still has rows
SQL>
SQL> REM ###############################################################################################
SQL> REM Check to see if the rows persist across sessions
SQL> REM ---- they shouldn't !
SQL> disconnect
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> connect hemant/hemant
Connected.
SQL> select count(*) from MY_GTT_PRESERVE;

COUNT(*)
----------
0

SQL> select num_rows from user_tables where table_name = 'MY_GTT_PRESERVE';

NUM_ROWS
----------
76701

SQL> --- the table has no rows but the statistics reflect the earlier row count
SQL> exec dbms_stats.gather_table_stats('','MY_GTT_PRESERVE');

PL/SQL procedure successfully completed.

SQL> select num_rows from user_tables where table_name = 'MY_GTT_PRESERVE';

NUM_ROWS
----------
0

SQL>
SQL>
SQL> REM ###############################################################################################
SQL> REM -- last : a table where I use DEGREE in the GATHER_STATS
SQL> insert into MY_GTT_PARALLEL select object_id, object_name from dba_objects where object_id is not null;

76701 rows created.

SQL> -- verify that rows are present
SQL> select count(*) from MY_GTT_PARALLEL;

COUNT(*)
----------
76701

SQL> -- now gather stats
SQL> exec dbms_stats.gather_table_stats('','MY_GTT_PARALLEL',degree=>4,estimate_percent=>100);

PL/SQL procedure successfully completed.

SQL> -- look for the rows now
SQL> select count(*) from MY_GTT_PARALLEL;

COUNT(*)
----------
76701

SQL> --- The table still has rows
SQL> select num_rows from user_tables where table_name = 'MY_GTT_PARALLEL';

NUM_ROWS
----------
76701

SQL>
SQL> REM ###############################################################################################
SQL> REM Can anyone tell my why the 3rd DROP TABLE fails here ?
SQL> REM -- note that I have not disconnected and reconnected
SQL> DROP TABLE MY_GTT_DELETE;

Table dropped.

SQL> DROP TABLE MY_GTT_PRESERVE;

Table dropped.

SQL> DROP TABLE MY_GTT_PARALLEL;
DROP TABLE MY_GTT_PARALLEL
*
ERROR at line 1:
ORA-14452: attempt to create, alter or drop an index on temporary table already
in use


SQL>


Thus, the GATHER_TABLE_STATS call on MY_GTT_DELETE actually deleted rows.
.
.
.

09 January, 2011

Rollback of Transaction(s) after SHUTDOWN ABORT

There has been a recent forums thread on what happens in terms of recovery when a database instance is restarted after a SHUTDOWN ABORT.

It is clear that "crash recovery" has to be completed before the Instance can be OPENed for use. Thus, users cannot login, queries cannot be started and DML / DDL cannot be executed until after the Roll-forward of Redo has been completed (as the OPEN is executed only after re-applying all Redo).

This is an example of "crash recovery" :

Sun Jan 09 00:50:23 2011
ALTER DATABASE OPEN
Beginning crash recovery of 1 threads
Started redo scan
Completed redo scan
read 80458 KB redo, 1721 data blocks need recovery
Started redo application at
Thread 1: logseq 490, block 52626
Recovery of Online Redo Log: Thread 1 Group 1 Seq 490 Reading mem 0
Mem# 0: /home/oracle/app/oracle/oradata/orcl/redo01.log
Recovery of Online Redo Log: Thread 1 Group 2 Seq 491 Reading mem 0
Mem# 0: /home/oracle/app/oracle/oradata/orcl/redo02.log
Recovery of Online Redo Log: Thread 1 Group 3 Seq 492 Reading mem 0
Mem# 0: /home/oracle/app/oracle/oradata/orcl/redo03.log
Completed redo application of 15.26MB
Completed crash recovery at
Thread 1: logseq 492, block 31015, scn 6082434
1721 data blocks read, 1721 data blocks written, 80458 redo k-bytes read
Sun Jan 09 00:50:27 2011
LGWR: STARTING ARCH PROCESSES
Sun Jan 09 00:50:27 2011
ARC0 started with pid=20, OS id=4173
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Thread 1 advanced to log sequence 493 (thread open)
Thread 1 opened at log sequence 493
Current log# 1 seq# 493 mem# 0: /home/oracle/app/oracle/oradata/orcl/redo01.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sun Jan 09 00:50:28 2011
SMON: enabling cache recovery
Sun Jan 09 00:50:28 2011
ARC2 started with pid=22, OS id=4177
Sun Jan 09 00:50:28 2011
ARC1 started with pid=21, OS id=4175
ARC1: Archival started
ARC2: Archival started
ARC2: Becoming the 'no FAL' ARCH
ARC2: Becoming the 'no SRL' ARCH
ARC1: Becoming the heartbeat ARCH
Sun Jan 09 00:50:29 2011
ARC3 started with pid=23, OS id=4179
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Archived Log entry 319 added for thread 1 sequence 492 ID 0x49469b3f dest 1:
Successfully onlined Undo Tablespace 2.
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is AL32UTF8
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
Sun Jan 09 00:50:34 2011
Starting background process QMNC
Sun Jan 09 00:50:34 2011
QMNC started with pid=25, OS id=4184
Completed: ALTER DATABASE OPEN


However, "crash recovery" does NOT complete "Transaction Recovery". Notice the line "SMON: enabling tx recovery" in the alert.log. This is the point when SMON starts beginning to Rollback any transactions that had not completed or not committed (or had not rolled back) before the SHUTDOWN ABORT.

Let me re-iterate : "crash recovery" is only the application of redo from redo logs to datafiles -- which include undo tablespace files. Redo Logs do not necessarily only have committed transaction. Uncommitted transactions also get written to Redo (e.g. when DBWR writes a "dirty buffer" to disk, the Redo for it has to be written to the Redo Log files before the datafile is updated --- irrespective of whether the "dirty buffer" represents a committed or uncommited update).
Therefore, Roll-forward only brings the database to (almost) the same state it was when the SHUTDOWN ABORT was issued --- including updates that had not committed.

Since Oracle's job is to guarantee database consistency and transaction consistency, any uncommitted transactions must be rolled-back. It cannot allow users to see inconsistent data when the database is OPENed. (Imagine that the incomplete transaction had managed to update a DEBIT account and this had got written to disk but the CREDIT did not go to disk --- users querying the data after the restart cannot be allowed to see only the DEBIT , the transaction must be rolled-back, the DEBIT must be undone, because the CREDIT has not been done).
This process of Rollback is "Transaction Recovery" which you see as "tx recovery" in the alert.log. In earlier version (upto some V7 version, I can't recall which exactly), the Database Instance would not OPEN until Transaction Recovery was completed. When Oracle introduced deferred transaction recovery (deferred rollback), it allowed the database instance to OPEN but still guaranteed that uncommitted data would not be seen. A user process that accesses a modified block "sees" that the ITL entry points to a transaction and, from the corresponding Undo entry (identified from the specific slot in the specific Undo segment), also "sees" that the transaction was not committed. A "rollback" has to be effected. In this case, it is the user's Server Process that actually executes the rollback.

(NOTE : In the case where neither portion of the incomplete / uncommitted transaction got written to the database files, the user process does not even "see" that the data had been modified -- any modifications in memory were lost when the database instance was SHUTDOWN ABORT. So our case of a user's Server Process having to rollback a transaction that was issued (but not completed or committed) before the SHUTDOWN ABORT happens only if the dirty buffers had been written to disk.)

Here I show a very large transaction :

SQL> select sn.name, ss.value from v$statname sn, v$mystat ss
2 where sn.statistic#=ss.statistic#
3 and sn.name in ('redo size','undo change vector size')
4 /

NAME VALUE
---------------------------------------------------------------- ----------
redo size 784
undo change vector size 136

SQL> update store_list_new set country='SYS_UPD' where country='SYS';

11100960 rows updated.

SQL> select sn.name, ss.value from v$statname sn, v$mystat ss
2 where sn.statistic#=ss.statistic#
3 and sn.name in ('redo size','undo change vector size')
4 /

NAME VALUE
---------------------------------------------------------------- ----------
redo size 5220534820
undo change vector size 1778390016

SQL>
SQL>

that is executed and completed but not commited.
(for the purpose of this demo, I allow the UPDATE to complete, however, it doesn't matter if the UPDATE hadn't even completed when the SHUTDOWN ABORT was issued --- in either case, the Rollback is after the DATABASE OPEN).
A SHUTDOWN ABORT, followed by a STARTUP is issued from a different session :

[oracle@localhost ~]$ sqlplus '/ as sysdba'

SQL*Plus: Release 11.1.0.7.0 - Production on Sun Jan 9 00:48:24 2011

Copyright (c) 1982, 2008, Oracle. All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> shutdown abort;
ORACLE instance shut down.
SQL>
SQL> startup ;
ORACLE instance started.

Total System Global Area 313860096 bytes
Fixed Size 1336232 bytes
Variable Size 272632920 bytes
Database Buffers 33554432 bytes
Redo Buffers 6336512 bytes
Database mounted.
Database opened.
SQL>

with these messages from the alert.log :

Sun Jan 09 00:48:30 2011
Shutting down instance (abort)
License high water mark = 7
USER (ospid: 4076): terminating the instance
Instance terminated by USER, pid = 4076
Sun Jan 09 00:48:31 2011
Instance shutdown complete
Sun Jan 09 00:49:29 2011
Starting ORACLE instance (normal)


The "crash recovery" appears as :

Sun Jan 09 00:50:18 2011
ALTER DATABASE MOUNT
Successful mount of redo thread 1, with mount id 1267731018
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: ALTER DATABASE MOUNT
Sun Jan 09 00:50:23 2011
ALTER DATABASE OPEN
Beginning crash recovery of 1 threads
Started redo scan
Completed redo scan
read 80458 KB redo, 1721 data blocks need recovery
Started redo application at
Thread 1: logseq 490, block 52626
Recovery of Online Redo Log: Thread 1 Group 1 Seq 490 Reading mem 0
Mem# 0: /home/oracle/app/oracle/oradata/orcl/redo01.log
Recovery of Online Redo Log: Thread 1 Group 2 Seq 491 Reading mem 0
Mem# 0: /home/oracle/app/oracle/oradata/orcl/redo02.log
Recovery of Online Redo Log: Thread 1 Group 3 Seq 492 Reading mem 0
Mem# 0: /home/oracle/app/oracle/oradata/orcl/redo03.log
Completed redo application of 15.26MB
Completed crash recovery at
Thread 1: logseq 492, block 31015, scn 6082434
1721 data blocks read, 1721 data blocks written, 80458 redo k-bytes read

(the rest of the messages, upto the DATABASE OPEN have been presented earlier in this blog post).

But what happens when the user retries his UPDATE after the database instance is OPEN ?

oracle@localhost ~]$ sqlplus

SQL*Plus: Release 11.1.0.7.0 - Production on Sun Jan 9 00:51:12 2011

Copyright (c) 1982, 2008, Oracle. All rights reserved.

Enter user-name: hemant/hemant

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> update store_list_new set country='SYS_UPD' where country='SYS';


In the normal course, the entire Rollback would be done by SMON. However, what happens now is that this UPDATE statement, as it reads the table's datablocks, sees (from the ITL entries in the block header) the rows it is interested as having been updated, reads the undo slots for the transaction and identifies that the previous transaction had not committed. It has to effect a rollback of the previous transaction before it can update the same rows again !

When I trace this UPDATE run, I see a number of wait entries like :

=====================
PARSING IN CURSOR #5 len=63 dep=0 uid=184 oct=6 lid=184 tim=1294563155660560 hv=3915004330 ad='2cf70638' sqlid='ff3g2nznpnfda'
update store_list_new set country='SYS_UPD' where country='SYS'
END OF STMT
PARSE #5:c=486927,e=13830396,p=397,cr=10544,cu=1,mis=1,r=0,dep=0,og=1,plh=2037152,tim=1294563155660554
WAIT #5: nam='db file scattered read' ela= 28880 file#=15 block#=171843 blocks=5 obj#=85765 tim=1294563155690619
WAIT #5: nam='db file sequential read' ela= 874 file#=3 block#=145603 blocks=1 obj#=0 tim=1294563155691915
WAIT #5: nam='db file sequential read' ela= 437 file#=3 block#=145602 blocks=1 obj#=0 tim=1294563155692708
WAIT #5: nam='db file sequential read' ela= 29147 file#=3 block#=245 blocks=1 obj#=0 tim=1294563155722107
WAIT #5: nam='db file sequential read' ela= 15050 file#=3 block#=244 blocks=1 obj#=0 tim=1294563155739114

*** 2011-01-09 00:52:37.181
WAIT #5: nam='enq: TX - row lock contention' ela= 1384462 name|mode=1415053318 usn 16 slot="524303" sequence="3878" tim="1294563157180358"
WAIT #5: nam='db file sequential read' ela= 220 file#=3 block#=145603 blocks=1 obj#=0 tim=1294563157193428
WAIT #5: nam='db file sequential read' ela= 153 file#=3 block#=145602 blocks=1 obj#=0 tim=1294563157195194
WAIT #5: nam='db file sequential read' ela= 117 file#=3 block#=245 blocks=1 obj#=0 tim=1294563157196317

File_ID 3 is the datafile of my UNDOTBS1 Tablepace, that is the active UNDO_TABLESPACE of the Database Instance. The Server Process of the user session executing the UPDATE is reading from the Undo segments.
(Why don't I see read waits on the STORE_LIST_NEW table itself ? A large number of buffers of the table have already been read into the Buffer Cache by SMON and also when doing dynamic sampling to parse the UPDATE statement, although some blocks still have to be read from disk and do appear as waits later in the trace file).

Recall that SMON does execute a Rollback. That is why the Server Process occassionally encounter waits like these :

WAIT #5: nam='enq: TX - row lock contention'


In fact, while the UPDATE was running, when I checked for the session's waits, I found :

SQL> l
1 select blocking_session, event, count(*) from v$active_session_history where session_id=46
2 and (blocking_session is not null and event is not null)
3* group by blocking_session, event
SQL> /

BLOCKING_SESSION EVENT COUNT(*)
---------------- ---------------------------------------------------------------- ----------
34 latch: enqueue hash chains 1
11 log file switch completion 1
10 free buffer waits 1
13 enq: TX - row lock contention 1014

SQL>

SID 13 is the SMON session.

Thus, both SMON and the Server Process for the user issuing the UPDATE are attempting to Rollback entries for the target table.

If you want to monitor the Rollback being done by SMON you can see this :

SQL> select * from v$fast_start_transactions ;

USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL PID CPUTIME PARENTUSN PARENTSLT
---------- ---------- ---------- ---------------- -------------- --------------- ---------- ---------- ---------- ----------
PARENTSEQ XID PXID RCVSERVERS
---------- ---------------- ---------------- ----------
8 15 3878 RECOVERING 18315 135898 24 8809 0 0
0 08000F00260F0000 0000000000000000 1


SQL>

While the Transaction is being Recovered,the STATE is "RECOVERING". UNDOBLOCKSTOTAL is the total number of Undo Blocks that have to be processed to complete the Rollback. UNDOBLOCKSDONE is the number of Undo Blocks that have been completed as of the time the V$FAST_START_TRANSACTIONS view is queried.

Later, as I query the Statistics for the UPDATE session I see :

data blocks consistent reads - undo records applied 2,968,576

This statistic shows how many undo records that the Server Process itself has applied.

I killed the UPDATE after it had run for a long time but did get a tkprof of the partial run :

SQL ID: ff3g2nznpnfda
Plan Hash: 2037152
update store_list_new set country='SYS_UPD'
where
country='SYS'


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.02 0.10 2 2 0 0
Execute 1 109.34 10149.31 85082 3235208 5549965 0
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 109.37 10149.42 85084 3235210 5549965 0

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 184

Rows Row Source Operation
------- ---------------------------------------------------
0 UPDATE STORE_LIST_NEW (cr=0 pr=0 pw=0 time=0 us)
2252371 TABLE ACCESS FULL STORE_LIST_NEW (cr=3212990 pr=54987 pw=0 time=756482560 us cost=34043 size=186285490 card=10957970)


Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
Disk file operations I/O 1 0.00 0.00
db file sequential read 74565 1.00 311.19
db file scattered read 2781 0.50 11.11
enq: TX - row lock contention 1559 8656.42 9868.94
rdbms ipc reply 25 0.00 0.02
reliable message 21 0.00 0.03
log file switch completion 26 0.27 1.80
latch: redo allocation 5 0.02 0.03
latch: cache buffers lru chain 4 0.02 0.05
latch: enqueue hash chains 3 0.14 0.14
latch free 1 0.00 0.00
enq: TX - contention 3 0.47 1.11
latch: object queue header operation 2 0.00 0.00
latch: row cache objects 4 0.05 0.10
buffer busy waits 3 0.00 0.00
log file switch (checkpoint incomplete) 4 0.05 0.12
********************************************************************************

NOTE : *Ignore* the elapsed time and wait time. It, unfortunately, includes time spent when the database was "frozen" because the FRA, holding ArchiveLogs, was full.
You can see the very high count of "current gets" and the 'db file sequential read' waits, most of which (about 67thousand, see below) were on reading from the Undo tablespace. Also note the high number of 'enq: TX - row lock contention' waits. This was the only transaction in the database. That 'row lock contention' was with SMON doing a rollback against the same rows !

[oracle@localhost trace]$ cat orcl_ora_4234.trc |grep 'file#=3' |wc -l
67046
[oracle@localhost trace]$




Thus, as you can see, "crash recovery" does not mean that the entire database is consistent. It only means that the database is returned to the same state ("partially dirty", except for uncommitted updates that never even got written to disk) as it was when the SHUTDOWN ABORT was issued.
Complete consistency is achieved only when transactions that were incomplete / uncommitted are Rolled-back -- by SMON and/or by Server Process's for User sessions that need to read the same rows.

.
.
.

Latches and Enqueues

This is my second "published" Diagnostics Note : Latches and Enqueues.
.
.
.

08 January, 2011

Incomplete Recovery

I intend to publish some diagnostic notes in PDF/PPS/DOC and other formats.


.
.
.



02 January, 2011

Most Popular Posts - Dec 10

Blogger's "Stats" feature shows that the 3 most popular posts in the past 30 days have been :

3. AUTOEXTEND ON Next Size : 197 pageviews

.
.
.