27 September, 2009

SQLs in Functions : Performance Impact

Taking the same case as in my previous demo, "SQLs in Functions : Each Execution is Independent", I now demonstrate how calling a Function (which itself executes SQL) for each row of an SQL statement can be very detrimental to performance.

Recall from the previous demo that the INSERT statement had inserted 50,639 rows into the target table. This means that the Function was called 50,639 times. Could I have avoided executing the Function 50,639 times ? Yes : the INSERT statement could have used a Join instead of the Function.

********************************************************************************

These are the results of using the Function :


SQL> set timing on
SQL> insert into obj_names_and_descr
2 select owner, object_name, object_type, lookup_obj_descr (object_type)
3 from dba_objects
4 order by owner
5 /

50639 rows created.

Elapsed: 00:00:16.23
SQL>

insert into obj_names_and_descr
select owner, object_name, object_type, lookup_obj_descr (object_type)
from dba_objects
order by owner

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.03 0.05 0 11 0 0
Execute 1 11.24 11.73 1900 6164 4868 50639
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 11.27 11.79 1900 6175 4868 50639

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 64

Rows Row Source Operation
------- ---------------------------------------------------
50639 SORT ORDER BY (cr=359889 pr=0 pw=0 time=15694827 us)
50639 VIEW DBA_OBJECTS (cr=5385 pr=0 pw=0 time=1327397 us)
50639 UNION-ALL (cr=5385 pr=0 pw=0 time=1124834 us)
50638 FILTER (cr=5378 pr=0 pw=0 time=669040 us)
51827 HASH JOIN (cr=631 pr=0 pw=0 time=839952 us)
70 TABLE ACCESS FULL USER$ (cr=6 pr=0 pw=0 time=858 us)
51827 TABLE ACCESS FULL OBJ$ (cr=625 pr=0 pw=0 time=312171 us)
1801 TABLE ACCESS BY INDEX ROWID IND$ (cr=4747 pr=0 pw=0 time=116996 us)
2368 INDEX UNIQUE SCAN I_IND1 (cr=2370 pr=0 pw=0 time=54337 us)(object id 39)
1 NESTED LOOPS (cr=3 pr=0 pw=0 time=101 us)
1 INDEX FULL SCAN I_LINK1 (cr=1 pr=0 pw=0 time=39 us)(object id 107)
1 TABLE ACCESS CLUSTER USER$ (cr=2 pr=0 pw=0 time=45 us)
1 INDEX UNIQUE SCAN I_USER# (cr=1 pr=0 pw=0 time=17 us)(object id 11)

********************************************************************************

SELECT OBJ_DESCR
FROM
MY_OBJ_TYPES_LOOKUP WHERE OBJ_TYPE = :B1


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 50639 1.54 1.42 0 1 0 0
Fetch 50639 2.77 2.60 0 354473 0 50639
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 101279 4.32 4.02 0 354474 0 50639

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 64 (recursive depth: 1)
********************************************************************************


Notice how the query on MY_OBJ_TYPES_LOOKUP was executed 50,639 times !
Also, the CPU time for the "parent" INSERT statement itself is very high ! Each call to the Function required a context switch from the SQL engine into PLSQL, and the SQL inside the Function required a switch back into the SQL engine. Oracle kept switching between SQL and PLSQL for every one of the 50,639 rows.

The INSERT took 16 seconds to run.
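
A quick back-of-envelope calculation on the trace numbers shows why "cheap" per-row calls add up. The figures are copied from the tkprof output above; Python is used here purely as a calculator :

```python
# Back-of-envelope arithmetic from the tkprof output above.
# All figures are CPU seconds copied from the traces.

calls = 50639               # executions of the lookup SQL inside the function
lookup_cpu = 4.32           # total CPU of the recursive SELECT on MY_OBJ_TYPES_LOOKUP
function_insert_cpu = 11.27 # CPU of the INSERT that calls the function per row
join_insert_cpu = 0.59      # CPU of the equivalent join-based INSERT

# CPU cost of a single lookup execution, in milliseconds
per_lookup_ms = lookup_cpu / calls * 1000

# Extra CPU paid for doing the lookup row by row instead of as a join
overhead = function_insert_cpu - join_insert_cpu

print(f"{per_lookup_ms:.3f} ms per lookup")   # ~0.085 ms : cheap individually
print(f"{overhead:.2f} s total overhead")     # ~10.68 s : expensive in aggregate
```

Each individual lookup costs well under a tenth of a millisecond, yet repeated 50,639 times (plus the SQL-to-PLSQL context switches) it accounts for almost all of the difference between the two INSERTs.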


********************************************************************************

These are the results of using a Join to perform the lookup :


SQL> set timing on
SQL> insert into obj_names_and_descr
2 select owner, object_name, object_type, obj_descr
3 from dba_objects, my_obj_types_lookup
4 where object_type = obj_type
5 order by owner
6 /

50639 rows created.

Elapsed: 00:00:00.64
SQL>

insert into obj_names_and_descr
select owner, object_name, object_type, obj_descr
from dba_objects, my_obj_types_lookup
where object_type = obj_type
order by owner

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 1 0 0
Execute 1 0.58 0.59 1900 6192 4873 50639
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 0.59 0.60 1900 6193 4873 50639

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 64

Rows Row Source Operation
------- ---------------------------------------------------
50639 SORT ORDER BY (cr=5392 pr=0 pw=0 time=521559 us)
50639 HASH JOIN (cr=5392 pr=0 pw=0 time=1381766 us)
42 TABLE ACCESS FULL MY_OBJ_TYPES_LOOKUP (cr=7 pr=0 pw=0 time=175 us)
50639 VIEW DBA_OBJECTS (cr=5385 pr=0 pw=0 time=1021851 us)
50639 UNION-ALL (cr=5385 pr=0 pw=0 time=819289 us)
50638 FILTER (cr=5378 pr=0 pw=0 time=363509 us)
51827 HASH JOIN (cr=631 pr=0 pw=0 time=579119 us)
70 TABLE ACCESS FULL USER$ (cr=6 pr=0 pw=0 time=534 us)
51827 TABLE ACCESS FULL OBJ$ (cr=625 pr=0 pw=0 time=156502 us)
1801 TABLE ACCESS BY INDEX ROWID IND$ (cr=4747 pr=0 pw=0 time=63373 us)
2368 INDEX UNIQUE SCAN I_IND1 (cr=2370 pr=0 pw=0 time=26582 us)(object id 39)
1 NESTED LOOPS (cr=3 pr=0 pw=0 time=86 us)
1 INDEX FULL SCAN I_LINK1 (cr=1 pr=0 pw=0 time=31 us)(object id 107)
1 TABLE ACCESS CLUSTER USER$ (cr=2 pr=0 pw=0 time=39 us)
1 INDEX UNIQUE SCAN I_USER# (cr=1 pr=0 pw=0 time=12 us)(object id 11)

********************************************************************************


The Join-based INSERT took less than a second to run.
It was executed as a single SQL call. Much faster.

The Function makes sense for single-row lookups (e.g. when called from a user screen). It should NOT be used for a batch operation which attempts the lookup more than a few times -- in this case, 50,639 times ! If possible, replace PLSQL calls with straightforward SQL.
(That is Tom Kyte's mantra as well : "If it is possible to do it in SQL, do it in SQL !")

.
.
.

20 September, 2009

SQLs in Functions : Each Execution is Independent

Functions (Stored PLSQL code that returns a single value per call) are good for calculations, validation against business rules etc. But when you use them for lookups, you must watch out for underlying data being changed !

Thus, if you have a statement like "SELECT col_1, my_function(col_2) FROM ...", the function "my_function" is executed for every row that is returned by the query. Oracle guarantees Read Consistency at the Statement Level. However, each execution of the Function is a separate SQL statement executed against the database. If underlying data has been changed by a transaction in a different database session, the Function may start returning different values mid-way through the result set.
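
The mechanics can be sketched with a toy simulation -- plain Python standing in for the two sessions. This illustrates the *behaviour* only, not Oracle's internals :

```python
# A toy model (NOT Oracle internals) of why a lookup function can return
# mixed results inside one read-consistent statement.

# The driving query gets a consistent snapshot of its rows at statement start.
driving_rows = ["TABLE"] * 4            # snapshot : four rows of obj_type 'TABLE'

# The lookup table is shared, *current* state : each function call re-reads it.
lookup = {"TABLE": "Is an TABLE"}

results = []
for i, obj_type in enumerate(driving_rows):
    # Another session commits an update midway through our scan...
    if i == 2:
        lookup["TABLE"] = "OOPS !"
    # ...and every later function call sees the newly committed value.
    results.append(lookup[obj_type])

print(results)
# ['Is an TABLE', 'Is an TABLE', 'OOPS !', 'OOPS !']
```

The snapshot of the driving rows never changes, but the per-row lookups read the current state of the lookup table, so the same input yields different outputs within one statement.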


Here's a demonstration of how a Function being used for a lookup returns different values for the same lookup, within a single SQL statement.


SQL> drop table my_obj_types_lookup purge;

Table dropped.

SQL> drop table obj_names_and_descr purge;

Table dropped.

SQL>
SQL> -- create the Lookups table
SQL> create table my_obj_types_lookup
2 (obj_type varchar2(30) not null, obj_descr varchar2(50) not null)
3 /

Table created.

SQL> insert into my_obj_types_lookup select distinct object_type, 'Is an ' || object_type from dba_objects;

42 rows created.

SQL> commit;

Commit complete.

SQL>
SQL> -- REM Create a function that does Lookups
SQL> create or replace function lookup_obj_descr (obj_type_in varchar2)
2 return varchar2
3 as
4 obj_type_ret varchar2(50);
5 begin
6 select obj_descr into obj_type_ret from my_obj_types_lookup where obj_type = obj_type_in;
7 return obj_type_ret;
8 exception when others then return 'Unknown object type';
9 end;
10 /

Function created.

SQL>
SQL> -- REM create the target table
SQL> create table obj_names_and_descr
2 (obj_owner varchar2(30) not null, obj_name varchar2(128) not null, obj_type varchar2(30) not null, obj_descr varchar2(60) not null);

Table created.

SQL>
SQL> pause PRESS ENTER TO PROCEED
PRESS ENTER TO PROCEED

SQL> -- Now we populate OBJ_NAMES_AND_DESCR using our Lookup Function !
SQL> insert into obj_names_and_descr
2 select owner, object_name, object_type, lookup_obj_descr (object_type)
3 from dba_objects
4 order by owner
5 /

50639 rows created.

SQL>
SQL> commit;

Commit complete.

SQL>
SQL> -- REM Let's verify the count of different types of objects
SQL> select obj_type, obj_descr, count(*)
2 from obj_names_and_descr
3 where obj_type like 'TAB%'
4 group by obj_type, obj_descr
5 order by 1,2
6 /

OBJ_TYPE OBJ_DESCR COUNT(*)
------------------------------ ------------------------------------------------------------ ----------
TABLE Is an TABLE 897
TABLE OOPS ! 759
TABLE PARTITION Is an TABLE PARTITION 128

SQL>
SQL>
SQL> -- REM REM REM #!#!
SQL> -- REM Hey ! Where did the "OOPS !" rows come from ? My INSERT INTO ... SELECT should have given me Read Consistency !
SQL> -- REM Therefore, every TABLE row should have the same OBJ_DESCR !
SQL>


Surely, the "insert into obj_names_and_descr" should have "seen" a Read Consistent image of the data ? Yes : the image from the dba_objects view would have been consistent. However, the lookup function "lookup_obj_descr" was executed once against "my_obj_types_lookup" for *each* row from the query against dba_objects, and each execution of the function was independent of the others. It so happens that the underlying table "my_obj_types_lookup" was updated by someone else running :

SQL> update my_obj_types_lookup set obj_descr = 'OOPS !' where obj_type = 'TABLE';

1 row updated.

SQL> commit;

Commit complete.

SQL>

mid-way through the INSERT ... SELECT operation.

Therefore, although the rows returned from dba_objects before the other session committed its update saw the obj_descr "Is an TABLE" for obj_type = 'TABLE', the rows still being fetched from dba_objects after the commit saw the new value 'OOPS !' as the description.

Thus, the SELECT portion of the INSERT actually returned different values for obj_descr when using the Function !

.
.
.

12 September, 2009

RMAN can identify and catalog / use ArchiveLogs automagically

When doing RESTORE DATABASE and RECOVER DATABASE using RMAN, you may find that RMAN automagically

a) CATALOGs files (BackupPieces and even ArchiveLogs) in the FRA, if you are using an FRA
b) Reads ArchiveLogs from the destination identified by log_archive_dest_1 if you use this instead of an FRA

Knowledge of this can also be used when you are restoring to another server where you haven't created an FRA. Copy/restore your ArchiveLogs (using non-RMAN methods) to any alternate directory, designate that directory as log_archive_dest_1 in the parameter file of the database you are restoring and recovering on the new server, and let RMAN identify the files.

I had just posted an example of this in this forums thread.

.
.
.

07 September, 2009

Table and Partition Statistics

When gathering statistics on Partitioned Tables, it is important to know the difference between Table Level and Partition Level (and, if applicable, SubPartition Level) statistics.

The Optimizer will use Partition Level statistics if the query predicates are on the Partition Keys. However, if the query predicates are not on these columns, or if predicates on the Partition Key do not allow pruning to a *single partition* (see the update of 08-Sep below), the Optimizer relies on Table Level Statistics.

ANALYZE would aggregate Table Level statistics from Partition Level statistics. DBMS_STATS allows gathering statistics at either level, or at both.

Here is an example showing how the Optimizer can badly estimate cardinality on a non-partition key column in the absence of Table Level statistics.

I have used the demo SH.SALES table which is Partitioned by TIME_ID. However, CUST_ID is a very relevant *dimension* column.


SQL> desc sh.sales
Name Null? Type
------------------------------------------------------------------------ -------- -------------------------------------------------
PROD_ID NOT NULL NUMBER
CUST_ID NOT NULL NUMBER
TIME_ID NOT NULL DATE
CHANNEL_ID NOT NULL NUMBER
PROMO_ID NOT NULL NUMBER
QUANTITY_SOLD NOT NULL NUMBER(10,2)
AMOUNT_SOLD NOT NULL NUMBER(10,2)

SQL>
SQL> REM The SALES table is Range Partitioned by TIME_ID
SQL> REM CUST_ID is a Dimension
SQL>
SQL> -- Current Statistics
SQL> --- NOTE !! Warning : For simplicity of reading I have converted NULL statistics to 0 (zero) for NUM_ROWS
SQL> -- However, in reality there is a big difference between NULL NUM_ROWS (meaning no statistics) and 0 NUM_ROWS (really meaning zero rows) !
SQL> exec dbms_stats.delete_table_stats('SH','SALES');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.delete_index_stats('SH','SALES_CHANNEL_BIX');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.delete_index_stats('SH','SALES_PROMO_BIX');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.delete_index_stats('SH','SALES_PROD_BIX');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.delete_index_stats('SH','SALES_CUST_BIX');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.delete_index_stats('SH','SALES_TIME_BIX');

PL/SQL procedure successfully completed.

SQL> select table_name, nvl(num_rows,0) from dba_tables where owner = 'SH' and table_name = 'SALES';

TABLE_NAME NVL(NUM_ROWS,0)
------------------------------ ---------------
SALES 0

SQL> select partition_name, nvl(num_rows,0) from dba_tab_partitions where table_owner = 'SH' and table_name = 'SALES' order by partition_position;

PARTITION_NAME NVL(NUM_ROWS,0)
------------------------------ ---------------
SALES_1995 0
SALES_1996 0
SALES_H1_1997 0
SALES_H2_1997 0
SALES_Q1_1998 0
SALES_Q2_1998 0
SALES_Q3_1998 0
SALES_Q4_1998 0
SALES_Q1_1999 0
SALES_Q2_1999 0
SALES_Q3_1999 0
SALES_Q4_1999 0
SALES_Q1_2000 0
SALES_Q2_2000 0
SALES_Q3_2000 0
SALES_Q4_2000 0
SALES_Q1_2001 0
SALES_Q2_2001 0
SALES_Q3_2001 0
SALES_Q4_2001 0
SALES_Q1_2002 0
SALES_Q2_2002 0
SALES_Q3_2002 0
SALES_Q4_2002 0
SALES_Q1_2003 0
SALES_Q2_2003 0
SALES_Q3_2003 0
SALES_Q4_2003 0

28 rows selected.

SQL>
SQL> REM REM ############################################################
SQL> -- Delete and re-gather statistics
SQL> -- statistics at partition level
SQL> exec dbms_stats.delete_table_stats('SH','SALES');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.delete_index_stats('SH','SALES_CHANNEL_BIX');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.delete_index_stats('SH','SALES_PROMO_BIX');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.delete_index_stats('SH','SALES_PROD_BIX');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.delete_index_stats('SH','SALES_CUST_BIX');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.delete_index_stats('SH','SALES_TIME_BIX');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats('SH','SALES',estimate_percent=>100,granularity=>'PARTITION',cascade=>TRUE,no_invalidate=>FALSE);

PL/SQL procedure successfully completed.

SQL> select table_name, num_rows from dba_tables where owner = 'SH' and table_name = 'SALES';

TABLE_NAME NUM_ROWS
------------------------------ ----------
SALES 918843

SQL> select num_distinct, num_buckets, nvl(sample_size,0), histogram from dba_tab_col_statistics
2 where owner = 'SH' and table_name = 'SALES' and column_name = 'CUST_ID';

NUM_DISTINCT NUM_BUCKETS NVL(SAMPLE_SIZE,0) HISTOGRAM
------------ ----------- ------------------ ---------------
3203 1 0 NONE

SQL>
SQL>
SQL> explain plan for
2 select cust_id,time_id,quantity_sold
3 from sh.sales where time_id between to_date('01-JAN-2001','DD-MON-YYYY') and to_date('31-JUL-2001','DD-MON-YYYY');

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 279964487

--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 134K| 2094K| 104 (4)| 00:00:02 | | |
| 1 | PARTITION RANGE ITERATOR| | 134K| 2094K| 104 (4)| 00:00:02 | 17 | 19 |
|* 2 | TABLE ACCESS FULL | SALES | 134K| 2094K| 104 (4)| 00:00:02 | 17 | 19 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("TIME_ID"<=TO_DATE(' 2001-07-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

14 rows selected.

SQL>
SQL> select count(*) from sh.sales where time_id between to_date('01-JAN-2001','DD-MON-YYYY') and to_date('31-JUL-2001','DD-MON-YYYY');

COUNT(*)
----------
145760

SQL>
SQL> explain plan for
2 select cust_id,time_id,quantity_sold
3 from sh.sales where cust_id in (10,100,1001);

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2821443835

----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 861 | 13776 | 226 (0)| 00:00:03 | | |
| 1 | PARTITION RANGE ALL | | 861 | 13776 | 226 (0)| 00:00:03 | 1 | 28 |
| 2 | INLIST ITERATOR | | | | | | | |
| 3 | TABLE ACCESS BY LOCAL INDEX ROWID| SALES | 861 | 13776 | 226 (0)| 00:00:03 | 1 | 28 |
| 4 | BITMAP CONVERSION TO ROWIDS | | | | | | | |
|* 5 | BITMAP INDEX SINGLE VALUE | SALES_CUST_BIX | | | | | 1 | 28 |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

5 - access("CUST_ID"=10 OR "CUST_ID"=100 OR "CUST_ID"=1001)

17 rows selected.

SQL>
SQL> select count(*) from sh.sales where cust_id in (10,100,1001);

COUNT(*)
----------
252

SQL>
SQL>
SQL>
SQL> REM REM ############################################################
SQL>
SQL> -- Delete and re-gather statistics
SQL> -- statistics at table and partition level
SQL> exec dbms_stats.delete_table_stats('SH','SALES');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.delete_index_stats('SH','SALES_CHANNEL_BIX');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.delete_index_stats('SH','SALES_PROMO_BIX');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.delete_index_stats('SH','SALES_PROD_BIX');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.delete_index_stats('SH','SALES_CUST_BIX');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.delete_index_stats('SH','SALES_TIME_BIX');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats('SH','SALES',estimate_percent=>100,granularity=>'ALL',cascade=>TRUE,no_invalidate=>FALSE);

PL/SQL procedure successfully completed.

SQL> select table_name, num_rows from dba_tables where owner = 'SH' and table_name = 'SALES';

TABLE_NAME NUM_ROWS
------------------------------ ----------
SALES 918843

SQL> select num_distinct, num_buckets, nvl(sample_size,0), histogram from dba_tab_col_statistics
2 where owner = 'SH' and table_name = 'SALES' and column_name = 'CUST_ID';

NUM_DISTINCT NUM_BUCKETS NVL(SAMPLE_SIZE,0) HISTOGRAM
------------ ----------- ------------------ ---------------
7059 1 918843 NONE

SQL>
SQL> explain plan for
2 select cust_id,time_id,quantity_sold
3 from sh.sales where time_id between to_date('01-JAN-2001','DD-MON-YYYY') and to_date('31-JUL-2001','DD-MON-YYYY');

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 279964487

--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 134K| 2094K| 104 (4)| 00:00:02 | | |
| 1 | PARTITION RANGE ITERATOR| | 134K| 2094K| 104 (4)| 00:00:02 | 17 | 19 |
|* 2 | TABLE ACCESS FULL | SALES | 134K| 2094K| 104 (4)| 00:00:02 | 17 | 19 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("TIME_ID"<=TO_DATE(' 2001-07-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

14 rows selected.

SQL>
SQL> select count(*) from sh.sales where time_id between to_date('01-JAN-2001','DD-MON-YYYY') and to_date('31-JUL-2001','DD-MON-YYYY');

COUNT(*)
----------
145760

SQL>
SQL> explain plan for
2 select cust_id,time_id,quantity_sold
3 from sh.sales where cust_id in (10,100,1001);

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2821443835

----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 390 | 6240 | 158 (0)| 00:00:02 | | |
| 1 | PARTITION RANGE ALL | | 390 | 6240 | 158 (0)| 00:00:02 | 1 | 28 |
| 2 | INLIST ITERATOR | | | | | | | |
| 3 | TABLE ACCESS BY LOCAL INDEX ROWID| SALES | 390 | 6240 | 158 (0)| 00:00:02 | 1 | 28 |
| 4 | BITMAP CONVERSION TO ROWIDS | | | | | | | |
|* 5 | BITMAP INDEX SINGLE VALUE | SALES_CUST_BIX | | | | | 1 | 28 |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

5 - access("CUST_ID"=10 OR "CUST_ID"=100 OR "CUST_ID"=1001)

17 rows selected.

SQL>
SQL> select count(*) from sh.sales where cust_id in (10,100,1001);

COUNT(*)
----------
252

SQL>



Let's compare the two runs.

The first run of queries is with granularity=>'PARTITION'.
In this case, for CUST_ID, Oracle estimates 3,203 distinct values and no Histogram. It did not actually gather these statistics from any sampling of CUST_IDs -- note the sample size of 0 !
For the query on CUST_ID IN (10,100,1001), Oracle estimates a Cardinality of 861 rows (while the actual count is 252 rows).

The second run of queries is with granularity=>'ALL'.
Here, for CUST_ID, Oracle estimates 7,059 distinct values, based on a sample of 918,843 rows. It has not yet created a Histogram.
For the query on CUST_ID IN (10,100,1001), Oracle estimates a Cardinality of 390 rows, which is closer to the actual count of 252 rows.
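
Both estimates can be reproduced from the standard no-histogram selectivity arithmetic (cardinality for an IN-list of k values is roughly num_rows * k / num_distinct). This is a sketch of the arithmetic, not the optimizer's exact code path :

```python
# Reproducing the Rows column of the two explain plans from the column
# statistics shown above, using the no-histogram IN-list formula :
#   estimated cardinality ~= num_rows * (values in IN-list) / num_distinct

num_rows = 918843          # table NUM_ROWS (the same in both runs)
in_list = 3                # CUST_ID IN (10,100,1001)

# Run 1 : NDV of 3,203 from granularity=>'PARTITION' (no table-level sampling)
est_partition = round(num_rows * in_list / 3203)

# Run 2 : NDV of 7,059 from granularity=>'ALL' (full table-level sample)
est_all = round(num_rows * in_list / 7059)

print(est_partition)   # 861 : matches the first plan
print(est_all)         # 390 : matches the second plan
```

The better NDV from the table-level gather is what pulls the estimate from 861 down to 390, closer to the true count of 252.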

Thus, although gathering statistics only at the Partition Level may make sense where your query predicates are always on the Partition Key and Partition Pruning can occur, Table Level statistics are necessary for queries where Partition Pruning cannot be done.

.
.

And I have not even started improving the quality of the column statistics with the correct FOR COLUMNS SIZE specification for METHOD_OPT !
.
.
UPDATE : 08-Sep : Randolf Geist has pointed out : "I think it's important to point out that partition or subpartition level statistics only get used if the optimizer is able to prune to a single partition/subpartition. Otherwise the next level statistics (partition/global) will get used."
This also ties in with MetaLink Note#166215.1

So if I rerun the test where the query will hit a *single* partition only, I get :

SQL> explain plan for
2 select cust_id,time_id,quantity_sold
3 from sh.sales where time_id between to_date('01-JAN-2001','DD-MON-YYYY') and to_date('28-FEB-2001','DD-MON-YYYY');

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 4096232376

------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 39925 | 623K| 35 (3)| 00:00:01 | | |
| 1 | PARTITION RANGE SINGLE| | 39925 | 623K| 35 (3)| 00:00:01 | 17 | 17 |
|* 2 | TABLE ACCESS FULL | SALES | 39925 | 623K| 35 (3)| 00:00:01 | 17 | 17 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("TIME_ID"<=TO_DATE(' 2001-02-28 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

14 rows selected.

SQL>
SQL> select count(*) from sh.sales where time_id between to_date('01-JAN-2001','DD-MON-YYYY') and to_date('28-FEB-2001','DD-MON-YYYY');

COUNT(*)
----------
39924

SQL>

Thus, when the query is against a single partition -- such that Oracle can apply Partition Pruning -- the statistics of that specific partition are used, and the cardinality estimate is much more accurate : 39,925 against the actual count of 39,924.


.
.
.

02 September, 2009

I am an Oracle ACE, officially

I am very grateful to Randolf Geist for having nominated me as an Oracle ACE. He even went further and followed up on the nomination and secured me the Award. Thank you !

I also thank Lillian Buziak of the Oracle ACE Program Office for informing me of the award and putting up my profile.

The Oracle ACE program recognises proficiency in Oracle technology as well as a willingness to share knowledge and experiences with the Oracle community.

My Oracle ACE profile is available via the Oracle ACE page.
.
.
.
Update 06-Sep : THANK YOU, everyone who has sent a congratulatory message, either on this blog or by email.
I will continue to be active on this blog and oracle forums.
.
.
.