10 June, 2009

Why EXPLAIN PLAN should not be used with Bind Variables

Oracle's EXPLAIN PLAN feature is very (far too) frequently used. It is available in three ways :
1. With "SET AUTOTRACE" in SQLPlus
2. With the command-line option "EXPLAIN=username/password" when running tkprof against a Trace File
3. Explicitly as "EXPLAIN PLAN FOR ...."

However, not enough people know (and, frankly I didn't till a few years ago !) that Explain Plan ignores Bind Variables. You can pass an SQL statement that uses Binds to an EXPLAIN but it the command will ignore the Bind -- it effectively ignores Data Skew. Since Bind Variable Peeking was introduced in 9i, *runtime* SQL does *not* ignore Bind Variables (that there are other issues with Bind Variable Peeking is, so to say, being minimalist ; but let's put aside those issues for the time being).
What this means is that while the Runtime Execution Plan may be plan 'A', EXPLAIN PLAN may "incorrectly" show you plan 'B'.

Here is a simple demonstration of the difference :

I first setup my test data :

SQL> create table demo_explain_bind
2 (owner varchar2(30) not null , object_name varchar2(30) not null , created date)
3 /
SQL>
SQL> alter table demo_explain_bind nologging;
SQL>
SQL> insert /*+ APPEND */ into demo_explain_bind
2 select owner, object_name, created from dba_objects where object_id is not null
3 union all
4 select owner, object_name, created from dba_objects where object_id is not null
5 union all
6 select owner, object_name, created from dba_objects where object_id is not null
7 union all
8 select owner, object_name, created from dba_objects where object_id is not null
9 /
SQL>
SQL> create index demo_explain_bind_ndx on demo_explain_bind (owner) nologging;
SQL>
SQL> exec dbms_stats.gather_table_stats('','DEMO_EXPLAIN_BIND',-
> estimate_percent=>100,method_opt=>'FOR ALL COLUMNS SIZE 250',cascade=>TRUE,no_invalidate=>FALSE);
SQL>
SQL> select /*+ FULL (d) */ count(*) from demo_explain_bind d;

COUNT(*)
----------
202492
SQL> select count(*) from demo_explain_bind where owner = 'SYS';

COUNT(*)
----------
91752
SQL> select count(*) from demo_explain_bind where owner = 'HEMANT';

COUNT(*)
----------
72
SQL> select count(distinct(owner)) from demo_explain_bind;

COUNT(DISTINCT(OWNER))
----------------------
26
SQL> select count(*)/count(distinct(owner)) Avg_per_Owner from demo_explain_bind;

AVG_PER_OWNER
-------------
7788.15385
SQL> select distinct(owner) from demo_explain_bind order by 1;

OWNER
------------------------------
BI
CTXSYS
DBSNMP
DMSYS
EXFSYS
HEMANT
HR
IX
MDSYS
OE
OLAPSYS
ORACLE_OCM
ORDPLUGINS
ORDSYS
OUTLN
PM
PUBLIC
SCOTT
SH
SI_INFORMTN_SCHEMA
SYS
SYSMAN
SYSTEM
TSMSYS
WMSYS
XDB

SQL> select num_rows, sample_size, blocks from user_tables where table_name = 'DEMO_EXPLAIN_BIND';

NUM_ROWS SAMPLE_SIZE BLOCKS
---------- ----------- ----------
202492 202492 1227
SQL> select num_rows, sample_size, blevel, leaf_blocks from user_indexes where index_name = 'DEMO_EXPLAIN_BIND_NDX';

NUM_ROWS SAMPLE_SIZE BLEVEL LEAF_BLOCKS
---------- ----------- ---------- -----------
202492 202492 2 468
SQL>


Thus, I have table with 202,492 rows. This contains a list of 26 distinct "owner"s, returning an arithmetic average of 7,788 rows per "owner". However, the data is really skewed, with "SYS" accounting for 91,752 rows and "HEMANT" accounting for 72 rows.

Let's start comparing Runtime with Explain Plan :

The first Test Set "A" is with Literals instead of Bind Variables :


SQL> REM ######################################################################
SQL> REM ######################################################################
SQL> REM REM REM REM : Test Set A
SQL> REM ############# Run with Histograms
SQL> REM ######################################################
SQL> REM First, let's compare Actual Blocks and Rows statistics and Explain with Literals
SQL>
SQL> -- First we use Autotrace to get both Actual Blocks and Rows statistics and Explain
SQL> set autotrace traceonly explain statistics
SQL> select object_name from demo_explain_bind where owner='SYS';

Execution Plan
----------------------------------------------------------
Plan hash value: 249876175

---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 91752 | 2777K| 337 (1)| 00:00:05 |
|* 1 | TABLE ACCESS FULL| DEMO_EXPLAIN_BIND | 91752 | 2777K| 337 (1)| 00:00:05 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("OWNER"='SYS')


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
1223 consistent gets
0 physical reads
0 redo size
2333602 bytes sent via SQL*Net to client
690 bytes received via SQL*Net from client
20 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
91752 rows processed

SQL> select object_name from demo_explain_bind where owner='HEMANT';

Execution Plan
----------------------------------------------------------
Plan hash value: 2077489054

-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 72 | 2232 | 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| DEMO_EXPLAIN_BIND | 72 | 2232 | 4 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | DEMO_EXPLAIN_BIND_NDX | 72 | | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("OWNER"='HEMANT')


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
14 consistent gets
0 physical reads
0 redo size
1901 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
72 rows processed

SQL>
For 'SYS', EXPLAIN PLAN expects 91,752 rows. This matches the real count (which we see as "rows processed"). Again, for "HEMANT", the expected and actual cardinalities are the same -- 72.

Now, let's run those queries using Bind Variables :


SQL> REM ######################################################
SQL> REM Now, let's compare Actual Blocks and Rows statistics and Explain with a Bind Value
SQL>
SQL> variable b_owner_1 varchar2(30);
SQL> variable b_owner_2 varchar2(30);
SQL>
SQL> exec :b_owner_1 := 'SYS';
SQL> exec :b_owner_2 := 'HEMANT';
SQL>
SQL> -- First we use Autotrace to get both Actual Blocks and Rows statistics and Explain
SQL> set autotrace traceonly explain statistics
SQL> select object_name from demo_explain_bind where owner=:b_owner_1;

Execution Plan
----------------------------------------------------------
Plan hash value: 2077489054

-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 7788 | 235K| 117 (0)| 00:00:02 |
| 1 | TABLE ACCESS BY INDEX ROWID| DEMO_EXPLAIN_BIND | 7788 | 235K| 117 (0)| 00:00:02 |
|* 2 | INDEX RANGE SCAN | DEMO_EXPLAIN_BIND_NDX | 7788 | | 20 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("OWNER"=:B_OWNER_1)


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
1223 consistent gets
0 physical reads
0 redo size
2333602 bytes sent via SQL*Net to client
690 bytes received via SQL*Net from client
20 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
91752 rows processed

SQL> select object_name from demo_explain_bind where owner=:b_owner_2;

Execution Plan
----------------------------------------------------------
Plan hash value: 2077489054

-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 7788 | 235K| 117 (0)| 00:00:02 |
| 1 | TABLE ACCESS BY INDEX ROWID| DEMO_EXPLAIN_BIND | 7788 | 235K| 117 (0)| 00:00:02 |
|* 2 | INDEX RANGE SCAN | DEMO_EXPLAIN_BIND_NDX | 7788 | | 20 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("OWNER"=:B_OWNER_2)


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
14 consistent gets
0 physical reads
0 redo size
1901 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
72 rows processed

SQL>

Do you see the significant difference ? For 'SYS', the EXPLAIN PLAN expected cardinality was 7.788 rows. AutoTrace tells me that the query for :b_owner_1 being 'SYS' was executed as an Index Range Scan because only 7,788 rows were expected. However, the *real* runtime execution was a FullTableScan and *NOT* an Index Range Scan. How do I know so ? Firstly, the 'consistent gets' count of 1,223 blocks indicates a FullTableScan. Secondly, I ran that again with tracing enabled. A tkprof (without EXPLAIN) shows the Row Source Operations to be :

Rows Row Source Operation
------- ---------------------------------------------------
91752 TABLE ACCESS FULL DEMO_EXPLAIN_BIND (cr=1223 pr=0 pw=0 time=275422 us)


Therefore, it is clear that what AUTOTRACE shows me as Oracle using an Index Range Scan for the query "select object_name from demo_explain_bind where owner=:b_owner_1;" is a lie. AutoTrace shows me an EXPLAIN PLAN, and not the real Execution Plan. RowSourceOperations on the trace file show me the real Execution Plan.
For the second query, using a diffrent bind variable (:b_owner_2) for 'HEMANT', AutoTrace shows me exactly the same plan ! Why ? Because EXPLAIN PLAN is "blind to binds". The actual plan for this was an IndexRangeScan but had to do only 14 consistent gets to fetch the 72 rows.

How did the EXPLAIN PLAN portion of SET AUTOTRACE come up with the figure of 7,788 rows forLink both :b_owner_1 and :b_owner_2 ? (hint : I've shown that figure earlier). EXPLAIN PLAN merely assumed equal distribution of rows across the 26 "owner"s !
.
This was Test Case "A". In my testing I also included EXPLAIN PLAN calls to confirm that SET AUTOTRACE was showing me EXPLAIN PLAN output. I've excluded those additional calls from the presentation above.
.
I have other Test Cases "B", "C" and "D". Where I go without Histograms, with Dynamic Sampling and without Dynamic Sampling.
.
I may present those other Test Cases over the next few days.
.
Happy EXPLAINing !
.
.
.

11 comments:

Jacco H. Landlust said...

Did you also try the tests with "explain plan for select...." ?

Timur Akhmadeev said...

>Explain Plan ignores Bind Variables
Which is stated in docs :)

Jaggi said...

Hi Hemant,
Can you help me how to migrate sql server to oracle database. Even i dont know how to start. Please help me to do this.

Hemant K Chitale said...

Jacco,
Yes, whether you use "EXPLAIN PLAN FOR select ..." OR you use tkprof explain= OR you use set autotrace .. explain, Oracle does the same thing. They are all the same EXPLAIN PLAN call.

In 10g, "SET AUTOTRACE .. EXPLAIN" implicitly does the "select * from table(dbms_xplan.display)" which you would run if you did "EXPLAIN PLAN FOR select ..."

Hemant

Hemant K Chitale said...

Timur,

Yes, those mere 5 sentences at 19.1.4 of the Performance Tuning Guide are such a "wet blanket" !

Hemant K Chitale said...

Jaggi,

I haven't worked with SQLServer or done a migration to Oracle.
However, Oracle's SQL Developer does have some free migration tools.
See http://www.oracle.com/technology/tech/migration//workbench/index_sqldev_omwb.html

Anonymous said...

It is always interesting to see playing with oracle. Thank you for posting this type of topic.
Arju

Ganesh said...

Hemant,

This is a nice post. In such circumstances, would it be useful to have histograms collected on the skewed columns, so that explain plan has enough information to come up with a more accurate plan and closer to the actual execution plan?

-Ganesh

Hemant K Chitale said...

Ganesh,
No, that won't help. Irrespective of whether a histogram exists or not, EXPLAIN PLAN ignores the binds.
However, if the data is skewed and the application uses binds you run into another issue -- caused by bind variable peeking at runtime *- only at the first execution and not at subsequent executions -* in 9i and 10g.

Hemant

Unknown said...

We are seeing the same issue. A Query which should use Unique Index scan is doing full table scan when we use Bind Variables. I am in Oracle 11g. What is the workaround for it? How do we make Oracle use good execution path? Please help!

Hemant K Chitale said...

Sujit,

If you are really sure that the Unique Index Scan is the correct method to be used, you should use SQL Plan Management to "fix" the execution plan.

However, a Unique Index Scan is to be used generally when accessing a single row. Are you sure that this is the method you want ?

Hemant