26 September, 2010

Data Skew changing over time --- and the Cardinality Estimate as well !

What happens if the nature of "data skew" changes over time ? For example, in the first few years of operations, you manufacture and sell 4 types of widgets ("A", "B", "C" and "D") where the volume of sales is approximately equally distributed amongst the 4 (Sales may be increasing over the years, but the distribution -- i.e. proportions -- remains the same). After say, 5 years, you add new widgets ("X", "Y" and "Z") to your product line. You notice that sales for "A" and "B" are now actually declining while sales for "Z" over the next 3 years are growing much faster than "C", "D", "X" and "Y".
What you have is changing data skew.
How does the optimizer estimate cardinality ?

Column Statistics used for a Histogram are at the *Table* level. A Histogram of the frequency of occurrences of these widgets in your sales would be computed merely as the number of occurrences across the table -- without accounting for dates. At the end of year 4, Oracle may say that "A" accounted for 25% of sales. At the end of year 8, "A" may now account for only 5% of total sales across all 8 years. And that is why the histogram would say. However, the histogram would not now (at the end of 8 years) say that "A" still accounts for 25% of sales in the first 4 years ! The information content is now "normalised" -- the metadata loses some vital information. (Of course, actually querying the table still shows you that "A" accounts for 25% of sales in the first 4 years, but the Optimizer's statistics can no longer reflect this knowledge). The Optimizer's cardinality estimates after 8 years are actually "weaker" because it has "normalised" (or "averaged out") the skew.


I've takan the "SH" schema's SALES table (this is available as one of the EXAMPLE schemas) from an Oracle 10.2 installation to demonstrate this. I take the example of 3.5inch diskettes whose sales decline. To simplify the case, I count the number of Sales Orders, not the quantities across all orders (for example, we could assume that each Sales Order is for exactly 1000 diskettes).

First, I start with SALES information for the years 1998 to 2002 :
SQL> exec dbms_stats.gather_table_stats('','SALES',estimate_percent=>100,granularity=>'ALL',degree=>4,-
> method_opt=>'FOR ALL COLUMNS SIZE 250',cascade=>TRUE,no_invalidate=>FALSE);

PL/SQL procedure successfully completed.

SQL>
SQL> -- What is PROD_ID 31 ?
SQL> select prod_id, prod_name
  2  from products
  3  where prod_id=31
  4  /

   PROD_ID PROD_NAME
---------- --------------------------------------------------
        31 1.44MB External 3.5" Diskette

SQL>
SQL> -- What are the range of dates in the Sales (history) table ?
SQL> select min(time_id), max(time_id)
  2  from times
  3  /

MIN(TIME_ MAX(TIME_
--------- ---------
01-JAN-98 31-DEC-02

SQL>
SQL> -- What are the total number of Sales records by year ?
SQL> select /*+ FULL(s) PARALLEL (s 4) */ to_char(time_id,'YYYY'), count(*)
  2  from sales s
  3  group by to_char(time_id,'YYYY')
  4  order by 1
  5  /

TO_C   COUNT(*)
---- ----------
1998     178834
1999     247945
2000     232646
2001     259418

SQL>
SQL> -- How many sales exist for the range 01-Jan-1998 to 31-Dec-2001 ?
SQL> select count(*)
  2  from sales
  3  where time_id > to_date('31-DEC-1997','DD-MON-YYYY') and time_id < to_date('01-JAN-2002','DD-MON-YYYY')
  4  /

  COUNT(*)
----------
    918843

SQL>
SQL> -- How many sales of 3.5" diskettes were made in the range 01-Jan-1998 to 31-Dec-2001 ?
SQL> select prod_id, to_char(time_id,'YYYY'), count(*)
  2  from sales
  3  where
  4  prod_id=31
  5  and time_id > to_date('31-DEC-1997','DD-MON-YYYY') and time_id < to_date('01-JAN-2002','DD-MON-YYYY')
  6  group by prod_id,to_char(time_id,'YYYY')
  7  order by 1,2
  8  /

   PROD_ID TO_C   COUNT(*)
---------- ---- ----------
        31 1998       6602
        31 1999       6586
        31 2000       7568
        31 2001       2352

SQL>
SQL> -- How many sales 3.5" diskettes were made in  2002 ?
SQL> select prod_id, to_char(time_id,'YYYY'), count(*)
  2  from sales
  3  where
  4  prod_id=31
  5  and time_id > to_date('31-DEC-2001','DD-MON-YYYY') and time_id < to_date('01-JAN-2003','DD-MON-YYYY')
  6  group by prod_id,to_char(time_id,'YYYY')
  7  order by 1,2
  8  /

no rows selected

SQL> 

We know that gross Sales (for all products) have (generally) been increasing over the years -- from 179thousand orders in 1998 to 259thousand in 2001 (with a slight dip in the year 2000).
However, sales of 3.5" diskettes declined in 2001.
Sales for 2002 are not recorded.

We look at the Optimizer's cardinality estimates based on available statistics :
SQL> -- Verify the Cardinality Estimates by year
SQL> explain plan for
  2  select prod_id
  3  from sales
  4  where prod_id=31
  5  and time_id > to_date('31-DEC-1997','DD-MON-YYYY') and time_id < to_date('01-JAN-1999','DD-MON-YYYY')
  6  /

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2009851720

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name             | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                  |  4478 | 53736 |    22  (10)| 00:00:01 |       |       |
|*  1 |  VIEW                          | index$_join$_001 |  4478 | 53736 |    22  (10)| 00:00:01 |       |       |
|*  2 |   HASH JOIN                    |                  |       |       |            |          |       |       |
|   3 |    PARTITION RANGE ITERATOR    |                  |  4478 | 53736 |     6   (0)| 00:00:01 |     4 |     8 |
|   4 |     BITMAP CONVERSION TO ROWIDS|                  |  4478 | 53736 |     6   (0)| 00:00:01 |       |       |
|*  5 |      BITMAP INDEX SINGLE VALUE | SALES_PROD_BIX   |       |       |            |          |     4 |     8 |
|   6 |    PARTITION RANGE ITERATOR    |                  |  4478 | 53736 |    15   (7)| 00:00:01 |     4 |     8 |
|   7 |     BITMAP CONVERSION TO ROWIDS|                  |  4478 | 53736 |    15   (7)| 00:00:01 |       |       |
|*  8 |      BITMAP INDEX RANGE SCAN   | SALES_TIME_BIX   |       |       |            |          |     4 |     8 |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("PROD_ID"=31 AND "TIME_ID"              "TIME_ID">TO_DATE(' 1997-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
   2 - access(ROWID=ROWID)
   5 - access("PROD_ID"=31)
   8 - access("TIME_ID">TO_DATE(' 1997-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "TIME_ID"              1999-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

25 rows selected.

SQL>
SQL>
SQL> explain plan for
  2  select prod_id
  3  from sales
  4  where prod_id=31
  5  and time_id > to_date('31-DEC-1998','DD-MON-YYYY') and time_id < to_date('01-JAN-2000','DD-MON-YYYY')
  6  /

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2009851720

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name             | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                  |  6275 | 75300 |    27   (8)| 00:00:01 |       |       |
|*  1 |  VIEW                          | index$_join$_001 |  6275 | 75300 |    27   (8)| 00:00:01 |       |       |
|*  2 |   HASH JOIN                    |                  |       |       |            |          |       |       |
|   3 |    PARTITION RANGE ITERATOR    |                  |  6275 | 75300 |     6   (0)| 00:00:01 |     8 |    12 |
|   4 |     BITMAP CONVERSION TO ROWIDS|                  |  6275 | 75300 |     6   (0)| 00:00:01 |       |       |
|*  5 |      BITMAP INDEX SINGLE VALUE | SALES_PROD_BIX   |       |       |            |          |     8 |    12 |
|   6 |    PARTITION RANGE ITERATOR    |                  |  6275 | 75300 |    20   (5)| 00:00:01 |     8 |    12 |
|   7 |     BITMAP CONVERSION TO ROWIDS|                  |  6275 | 75300 |    20   (5)| 00:00:01 |       |       |
|*  8 |      BITMAP INDEX RANGE SCAN   | SALES_TIME_BIX   |       |       |            |          |     8 |    12 |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("PROD_ID"=31 AND "TIME_ID"              "TIME_ID">TO_DATE(' 1998-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
   2 - access(ROWID=ROWID)
   5 - access("PROD_ID"=31)
   8 - access("TIME_ID">TO_DATE(' 1998-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "TIME_ID"              2000-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

25 rows selected.

SQL>
SQL>
SQL> explain plan for
  2  select prod_id
  3  from sales
  4  where prod_id=31
  5  and time_id > to_date('31-DEC-1999','DD-MON-YYYY') and time_id < to_date('01-JAN-2001','DD-MON-YYYY')
  6  /

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2009851720

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name             | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                  |  5859 | 70308 |    25   (8)| 00:00:01 |       |       |
|*  1 |  VIEW                          | index$_join$_001 |  5859 | 70308 |    25   (8)| 00:00:01 |       |       |
|*  2 |   HASH JOIN                    |                  |       |       |            |          |       |       |
|   3 |    PARTITION RANGE ITERATOR    |                  |  5859 | 70308 |     6   (0)| 00:00:01 |    12 |    16 |
|   4 |     BITMAP CONVERSION TO ROWIDS|                  |  5859 | 70308 |     6   (0)| 00:00:01 |       |       |
|*  5 |      BITMAP INDEX SINGLE VALUE | SALES_PROD_BIX   |       |       |            |          |    12 |    16 |
|   6 |    PARTITION RANGE ITERATOR    |                  |  5859 | 70308 |    18   (6)| 00:00:01 |    12 |    16 |
|   7 |     BITMAP CONVERSION TO ROWIDS|                  |  5859 | 70308 |    18   (6)| 00:00:01 |       |       |
|*  8 |      BITMAP INDEX RANGE SCAN   | SALES_TIME_BIX   |       |       |            |          |    12 |    16 |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("PROD_ID"=31 AND "TIME_ID">TO_DATE(' 1999-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
              "TIME_ID"   2 - access(ROWID=ROWID)
   5 - access("PROD_ID"=31)
   8 - access("TIME_ID">TO_DATE(' 1999-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "TIME_ID"              2001-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

25 rows selected.

SQL>
SQL>
SQL> explain plan for
  2  select prod_id
  3  from sales
  4  where prod_id=31
  5  and time_id > to_date('31-DEC-2000','DD-MON-YYYY') and time_id < to_date('01-JAN-2002','DD-MON-YYYY')
  6  /

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2009851720

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name             | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                  |  6528 | 78336 |    27   (8)| 00:00:01 |       |       |
|*  1 |  VIEW                          | index$_join$_001 |  6528 | 78336 |    27   (8)| 00:00:01 |       |       |
|*  2 |   HASH JOIN                    |                  |       |       |            |          |       |       |
|   3 |    PARTITION RANGE ITERATOR    |                  |  6528 | 78336 |     6   (0)| 00:00:01 |    16 |    20 |
|   4 |     BITMAP CONVERSION TO ROWIDS|                  |  6528 | 78336 |     6   (0)| 00:00:01 |       |       |
|*  5 |      BITMAP INDEX SINGLE VALUE | SALES_PROD_BIX   |       |       |            |          |    16 |    20 |
|   6 |    PARTITION RANGE ITERATOR    |                  |  6528 | 78336 |    20   (5)| 00:00:01 |    16 |    20 |
|   7 |     BITMAP CONVERSION TO ROWIDS|                  |  6528 | 78336 |    20   (5)| 00:00:01 |       |       |
|*  8 |      BITMAP INDEX RANGE SCAN   | SALES_TIME_BIX   |       |       |            |          |    16 |    20 |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("PROD_ID"=31 AND "TIME_ID">TO_DATE(' 2000-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
              "TIME_ID"   2 - access(ROWID=ROWID)
   5 - access("PROD_ID"=31)
   8 - access("TIME_ID">TO_DATE(' 2000-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "TIME_ID"              2002-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

25 rows selected.

SQL>
SQL>
SQL> explain plan for
  2  select prod_id
  3  from sales
  4  where prod_id=31
  5  and time_id > to_date('31-DEC-2001','DD-MON-YYYY') and time_id < to_date('01-JAN-2003','DD-MON-YYYY')
  6  /

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1295169200

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name           | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                |    16 |   192 |     4   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE ITERATOR    |                |    16 |   192 |     4   (0)| 00:00:01 |    20 |    24 |
|   2 |   BITMAP CONVERSION TO ROWIDS|                |    16 |   192 |     4   (0)| 00:00:01 |       |       |
|   3 |    BITMAP AND                |                |       |       |            |          |       |       |
|   4 |     BITMAP MERGE             |                |       |       |            |          |       |       |
|*  5 |      BITMAP INDEX RANGE SCAN | SALES_TIME_BIX |       |       |            |          |    20 |    24 |
|*  6 |     BITMAP INDEX SINGLE VALUE| SALES_PROD_BIX |       |       |            |          |    20 |    24 |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("TIME_ID">TO_DATE(' 2001-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
              "TIME_ID"   6 - access("PROD_ID"=31)

20 rows selected.

SQL>
SQL>
SQL> explain plan for
  2  select prod_id
  3  from sales
  4  where prod_id=31
  5  and time_id > to_date('31-DEC-2002','DD-MON-YYYY') and time_id < to_date('01-JAN-2004','DD-MON-YYYY')
  6  /

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2294783259

---------------------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name           | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |                |    16 |   192 |     2   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE ITERATOR          |                |    16 |   192 |     2   (0)| 00:00:01 |    24 |    28 |
|*  2 |   TABLE ACCESS BY LOCAL INDEX ROWID| SALES          |    16 |   192 |     2   (0)| 00:00:01 |    24 |    28 |
|   3 |    BITMAP CONVERSION TO ROWIDS     |                |       |       |            |          |       |       |
|*  4 |     BITMAP INDEX RANGE SCAN        | SALES_TIME_BIX |       |       |            |          |    24 |    28 |
---------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("PROD_ID"=31)
   4 - access("TIME_ID">TO_DATE(' 2002-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "TIME_ID"              2004-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

18 rows selected.
SQL>
SQL> 


We can see that the estimates for PROD_ID=31 do not match reality. Particularly for 2001.

Let's now create 2002 data :
SQL> -- let's create  2002 sales data
SQL> alter index sales_time_bix modify partition sales_q1_2002 unusable;

Index altered.

SQL> alter index sales_time_bix modify partition sales_q2_2002 unusable;

Index altered.

SQL> alter index sales_time_bix modify partition sales_q3_2002 unusable;

Index altered.

SQL> alter index sales_time_bix modify partition sales_q4_2002 unusable;

Index altered.

SQL> alter index sales_prod_bix modify partition sales_q1_2002 unusable;

Index altered.

SQL> alter index sales_prod_bix modify partition sales_q2_2002 unusable;

Index altered.

SQL> alter index sales_prod_bix modify partition sales_q3_2002 unusable;

Index altered.

SQL> alter index sales_prod_bix modify partition sales_q4_2002 unusable;

Index altered.

SQL>
SQL>
SQL> insert /*+ APPEND */ into sales
  2  select prod_id, cust_id, time_id+365, channel_id, promo_id, quantity_sold, amount_sold
  3  from sales
  4  where 1=1
  5  and time_id > to_date('31-DEC-2000','DD-MON-YYYY') and time_id < to_date('01-JAN-2002','DD-MON-YYYY')
  6  and decode(prod_id,31,mod(cust_id,3),0)=0
  7  /

257877 rows created.

SQL> alter index sales_time_bix rebuild partition sales_q1_2002 ;

Index altered.

SQL> alter index sales_time_bix rebuild partition sales_q2_2002 ;

Index altered.

SQL> alter index sales_time_bix rebuild partition sales_q3_2002 ;

Index altered.

SQL> alter index sales_time_bix rebuild partition sales_q4_2002 ;

Index altered.

SQL> alter index sales_prod_bix rebuild partition sales_q1_2002 ;

Index altered.

SQL> alter index sales_prod_bix rebuild partition sales_q2_2002 ;

Index altered.

SQL> alter index sales_prod_bix rebuild partition sales_q3_2002 ;

Index altered.

SQL> alter index sales_prod_bix rebuild partition sales_q4_2002 ;

Index altered.

SQL>
SQL>
SQL> -- the number of total sales orders in 2002
SQL> select count(*)
  2  from sales
  3  where time_id > to_date('31-DEC-2001','DD-MON-YYYY') and time_id < to_date('01-JAN-2003','DD-MON-YYYY')
  4  /

  COUNT(*)
----------
    257877

SQL>
SQL> -- the number of sales orders for 3.5" diskettes in 2002
SQL> select count(*)
  2  from sales
  3  where time_id > to_date('31-DEC-2001','DD-MON-YYYY') and time_id < to_date('01-JAN-2003','DD-MON-YYYY')
  4  and prod_id=31
  5  /

  COUNT(*)
----------
       811

SQL> -- so we had only 811 sales orders for 3.5" diskettes
SQL>
SQL> -- Gather Statistics
SQL> exec dbms_stats.gather_table_stats('','SALES',estimate_percent=>100,granularity=>'ALL',degree=>4,-
> method_opt=>'FOR ALL COLUMNS SIZE 250',cascade=>TRUE,no_invalidate=>FALSE);

PL/SQL procedure successfully completed.

SQL> 


Thus, although gross sales for all products did not decline significantly in 2002, sales for 3.5" diskettes, at 811 orders, were only 1/3rd the number in 2001.

Let's revisit the Optimizer's cardinality estimates :
SQL> -- ReVerify the Cardinality Estimates by year
SQL> explain plan for
  2  select prod_id
  3  from sales
  4  where prod_id=31
  5  and time_id > to_date('31-DEC-1997','DD-MON-YYYY') and time_id < to_date('01-JAN-1999','DD-MON-YYYY')
  6  /

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2009851720

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name             | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                  |  3636 | 43632 |    22  (10)| 00:00:01 |       |       |
|*  1 |  VIEW                          | index$_join$_001 |  3636 | 43632 |    22  (10)| 00:00:01 |       |       |
|*  2 |   HASH JOIN                    |                  |       |       |            |          |       |       |
|   3 |    PARTITION RANGE ITERATOR    |                  |  3636 | 43632 |     6   (0)| 00:00:01 |     4 |     8 |
|   4 |     BITMAP CONVERSION TO ROWIDS|                  |  3636 | 43632 |     6   (0)| 00:00:01 |       |       |
|*  5 |      BITMAP INDEX SINGLE VALUE | SALES_PROD_BIX   |       |       |            |          |     4 |     8 |
|   6 |    PARTITION RANGE ITERATOR    |                  |  3636 | 43632 |    15   (7)| 00:00:01 |     4 |     8 |
|   7 |     BITMAP CONVERSION TO ROWIDS|                  |  3636 | 43632 |    15   (7)| 00:00:01 |       |       |
|*  8 |      BITMAP INDEX RANGE SCAN   | SALES_TIME_BIX   |       |       |            |          |     4 |     8 |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("PROD_ID"=31 AND "TIME_ID"              "TIME_ID">TO_DATE(' 1997-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
   2 - access(ROWID=ROWID)
   5 - access("PROD_ID"=31)
   8 - access("TIME_ID">TO_DATE(' 1997-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "TIME_ID"              1999-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

25 rows selected.

SQL>
SQL>
SQL> explain plan for
  2  select prod_id
  3  from sales
  4  where prod_id=31
  5  and time_id > to_date('31-DEC-1998','DD-MON-YYYY') and time_id < to_date('01-JAN-2000','DD-MON-YYYY')
  6  /

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2009851720

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name             | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                  |  5047 | 60564 |    27   (8)| 00:00:01 |       |       |
|*  1 |  VIEW                          | index$_join$_001 |  5047 | 60564 |    27   (8)| 00:00:01 |       |       |
|*  2 |   HASH JOIN                    |                  |       |       |            |          |       |       |
|   3 |    PARTITION RANGE ITERATOR    |                  |  5047 | 60564 |     6   (0)| 00:00:01 |     8 |    12 |
|   4 |     BITMAP CONVERSION TO ROWIDS|                  |  5047 | 60564 |     6   (0)| 00:00:01 |       |       |
|*  5 |      BITMAP INDEX SINGLE VALUE | SALES_PROD_BIX   |       |       |            |          |     8 |    12 |
|   6 |    PARTITION RANGE ITERATOR    |                  |  5047 | 60564 |    20   (5)| 00:00:01 |     8 |    12 |
|   7 |     BITMAP CONVERSION TO ROWIDS|                  |  5047 | 60564 |    20   (5)| 00:00:01 |       |       |
|*  8 |      BITMAP INDEX RANGE SCAN   | SALES_TIME_BIX   |       |       |            |          |     8 |    12 |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("PROD_ID"=31 AND "TIME_ID"              "TIME_ID">TO_DATE(' 1998-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
   2 - access(ROWID=ROWID)
   5 - access("PROD_ID"=31)
   8 - access("TIME_ID">TO_DATE(' 1998-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "TIME_ID"              2000-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

25 rows selected.

SQL>
SQL>
SQL> explain plan for
  2  select prod_id
  3  from sales
  4  where prod_id=31
  5  and time_id > to_date('31-DEC-1999','DD-MON-YYYY') and time_id < to_date('01-JAN-2001','DD-MON-YYYY')
  6  /

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2009851720

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name             | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                  |  4755 | 57060 |    26   (8)| 00:00:01 |       |       |
|*  1 |  VIEW                          | index$_join$_001 |  4755 | 57060 |    26   (8)| 00:00:01 |       |       |
|*  2 |   HASH JOIN                    |                  |       |       |            |          |       |       |
|   3 |    PARTITION RANGE ITERATOR    |                  |  4755 | 57060 |     6   (0)| 00:00:01 |    12 |    16 |
|   4 |     BITMAP CONVERSION TO ROWIDS|                  |  4755 | 57060 |     6   (0)| 00:00:01 |       |       |
|*  5 |      BITMAP INDEX SINGLE VALUE | SALES_PROD_BIX   |       |       |            |          |    12 |    16 |
|   6 |    PARTITION RANGE ITERATOR    |                  |  4755 | 57060 |    19   (6)| 00:00:01 |    12 |    16 |
|   7 |     BITMAP CONVERSION TO ROWIDS|                  |  4755 | 57060 |    19   (6)| 00:00:01 |       |       |
|*  8 |      BITMAP INDEX RANGE SCAN   | SALES_TIME_BIX   |       |       |            |          |    12 |    16 |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("PROD_ID"=31 AND "TIME_ID"              "TIME_ID">TO_DATE(' 1999-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
   2 - access(ROWID=ROWID)
   5 - access("PROD_ID"=31)
   8 - access("TIME_ID">TO_DATE(' 1999-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "TIME_ID"              2001-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

25 rows selected.

SQL>
SQL>
SQL> explain plan for
  2  select prod_id
  3  from sales
  4  where prod_id=31
  5  and time_id > to_date('31-DEC-2000','DD-MON-YYYY') and time_id < to_date('01-JAN-2002','DD-MON-YYYY')
  6  /

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2009851720

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name             | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                  |  5277 | 63324 |    28   (8)| 00:00:01 |       |       |
|*  1 |  VIEW                          | index$_join$_001 |  5277 | 63324 |    28   (8)| 00:00:01 |       |       |
|*  2 |   HASH JOIN                    |                  |       |       |            |          |       |       |
|   3 |    PARTITION RANGE ITERATOR    |                  |  5277 | 63324 |     6   (0)| 00:00:01 |    16 |    20 |
|   4 |     BITMAP CONVERSION TO ROWIDS|                  |  5277 | 63324 |     6   (0)| 00:00:01 |       |       |
|*  5 |      BITMAP INDEX SINGLE VALUE | SALES_PROD_BIX   |       |       |            |          |    16 |    20 |
|   6 |    PARTITION RANGE ITERATOR    |                  |  5277 | 63324 |    21   (5)| 00:00:01 |    16 |    20 |
|   7 |     BITMAP CONVERSION TO ROWIDS|                  |  5277 | 63324 |    21   (5)| 00:00:01 |       |       |
|*  8 |      BITMAP INDEX RANGE SCAN   | SALES_TIME_BIX   |       |       |            |          |    16 |    20 |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("PROD_ID"=31 AND "TIME_ID">TO_DATE(' 2000-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
              "TIME_ID"   2 - access(ROWID=ROWID)
   5 - access("PROD_ID"=31)
   8 - access("TIME_ID">TO_DATE(' 2000-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "TIME_ID"              2002-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

25 rows selected.

SQL>
SQL>
SQL> explain plan for
  2  select prod_id
  3  from sales
  4  where prod_id=31
  5  and time_id > to_date('31-DEC-2001','DD-MON-YYYY') and time_id < to_date('01-JAN-2003','DD-MON-YYYY')
  6  /

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2009851720

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name             | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                  |  5250 | 63000 |    28   (8)| 00:00:01 |       |       |
|*  1 |  VIEW                          | index$_join$_001 |  5250 | 63000 |    28   (8)| 00:00:01 |       |       |
|*  2 |   HASH JOIN                    |                  |       |       |            |          |       |       |
|   3 |    PARTITION RANGE ITERATOR    |                  |  5250 | 63000 |     6   (0)| 00:00:01 |    20 |    24 |
|   4 |     BITMAP CONVERSION TO ROWIDS|                  |  5250 | 63000 |     6   (0)| 00:00:01 |       |       |
|*  5 |      BITMAP INDEX SINGLE VALUE | SALES_PROD_BIX   |       |       |            |          |    20 |    24 |
|   6 |    PARTITION RANGE ITERATOR    |                  |  5250 | 63000 |    21   (5)| 00:00:01 |    20 |    24 |
|   7 |     BITMAP CONVERSION TO ROWIDS|                  |  5250 | 63000 |    21   (5)| 00:00:01 |       |       |
|*  8 |      BITMAP INDEX RANGE SCAN   | SALES_TIME_BIX   |       |       |            |          |    20 |    24 |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("PROD_ID"=31 AND "TIME_ID">TO_DATE(' 2001-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
              "TIME_ID"   2 - access(ROWID=ROWID)
   5 - access("PROD_ID"=31)
   8 - access("TIME_ID">TO_DATE(' 2001-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "TIME_ID"              2003-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

25 rows selected.

SQL>
SQL>
SQL> explain plan for
  2  select prod_id
  3  from sales
  4  where prod_id=31
  5  and time_id > to_date('31-DEC-2002','DD-MON-YYYY') and time_id < to_date('01-JAN-2004','DD-MON-YYYY')
  6  /

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1295169200

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name           | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                |    13 |   156 |     4   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE ITERATOR    |                |    13 |   156 |     4   (0)| 00:00:01 |    24 |    28 |
|   2 |   BITMAP CONVERSION TO ROWIDS|                |    13 |   156 |     4   (0)| 00:00:01 |       |       |
|   3 |    BITMAP AND                |                |       |       |            |          |       |       |
|   4 |     BITMAP MERGE             |                |       |       |            |          |       |       |
|*  5 |      BITMAP INDEX RANGE SCAN | SALES_TIME_BIX |       |       |            |          |    24 |    28 |
|*  6 |     BITMAP INDEX SINGLE VALUE| SALES_PROD_BIX |       |       |            |          |    24 |    28 |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("TIME_ID">TO_DATE(' 2002-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
              "TIME_ID"   6 - access("PROD_ID"=31)

20 rows selected.

SQL> 


Oracle now seems to have continued "normalising" (or "averaging out") the distribution across all the years. The discrepancy between the actual count of rows and the estimated count of rows has increased significantly.
Actual versus Estimated Cardinality for PROD_ID=31


Year            Actual  Estimated  Estimated
                before inserting   after inserting
                Year 2002 rows     Year 2002 rows

1998              6602       4478       3636   
1999              6586       6275       5047
2000              7568       5859       4755
2001              2352       6528       5277
2002                 0         16         na
2002 inserted      811         na       5250
2003                 0         16         13
                                                               


Although the number of occurrences of Sales Orders for PROD_ID=31 has declined considerably in 2001 and 2002, the Optimizer cannot adjust for the decline. The discrepancy between Actual and Estimated row counts increases over time as the skew changes.
The Cardinality estimate is computed from Column statistics (number of distinct values, selectivity, histogram) at the Table level but combined with Partition row counts without any knowledge of the occurrences of PRODUCT_ID=31 in each of the target Partitions for the year 2002.


SUGGESTED TESTS For Readers :
1. What if we had a BTree Index ?
2. What if the SALES table wasn't Partitioned ?


UPDATE : See the next post, with a Non-Partitioned Table
.
.
.

19 September, 2010

Index Skip Scan

We know that an Index Skip Scan operation is when Oracle uses an Index of which the leading column is not a query predicate. For example, an index on (A,B,C) is not normally expected to be used for a query against 'B' and 'C'. Yet, Oracle may use the index if it finds that the "cost" isn't high.
How does it determine the suitability of the index ? Oracle has to know how many distinct entries for 'A' are present in the index. This information is not in the Index statistics. However, it is present in the Column statistics.

Here is a simple demonstration of Index Skip Scan.

I create an ACCOUNTS (say, Bank Accounts) table with 400thousand+ entries and an Index on ACCOUNT_TYPE, COUNTRY_CD, COUNTRY_CITY_CD. I find that Oracle can use this index for queries by COUNTRY_CD and COUNTRY_CITY_CD -- I do not (need to) have to create a separate index for COUNTRY_CD+COUNTRY_CITY_CD. Thus, I can reduce the number of indexes required. The Index has a "poor" Clustering Factor (as values for the COUNTRY_CD and COUNTRY_CITY_CD are "scattered" across the table, given the way I generated these values), yet the Index is useful.

Remember : The "costing" depends on the number of distinct ACCOUNT_TYPE values. Too many distinct value and Oracle may choose to not use the Index for the query by COUNTRY_CD and COUNTRY_CITY_CD.

First, I setup the data :

SQL> create table Accounts_Table (
2 account_id number not null,
3 country_cd number not null,
4 country_city_cd number not null,
5 account_type varchar(3),
6 account_dt date,
7 account_title varchar2(32))
8 /

Table created.

SQL>
SQL> alter table accounts_table nologging;

Table altered.

SQL>
SQL> --- create more than 400thousand accounts in 25 countrie with 60 country_city_codes and of 4 account_types
SQL> insert /*+ APPEND */ into accounts_table
2 select rownum,
3 trunc(dbms_random.value(1,26)),
4 trunc(dbms_random.value(1,61)),
5 decode(trunc(dbms_random.value(1,6)),1,'SAV',2,'CUR',3,'FDR',4,'MXD',5,'SAV',6,'SAV'),
6 sysdate-3600+(rownum/10),
7 dbms_random.string('X',30)
8 from dual
9 connect by level < 401243
10 /

401242 rows created.

SQL>

I then create Indexes and test a few queries. (At this point I haven't gathered table statistics so the queries will help my "nudge" Oracle into making the right choices for it's default METHOD_OPT behaviour of 'FOR ALL COLUMNS SIZE AUTO').

SQL> create unique index accounts_table_pk on accounts_table(account_id) nologging;

Index created.

SQL> alter table accounts_table add constraint accounts_table_pk primary key (account_id);

Table altered.

SQL> create index accounts_table_n1 on accounts_table(account_type, country_cd, country_city_cd) nologging;

Index created.

SQL>
SQL> -- deliberately run queries so that METHOD_OPTs FOR ALL COUMNS SIZE AUTO selects these three columns
SQL> select count(*) from accounts_table where country_cd = 11;

COUNT(*)
----------
15833

SQL> select count(*) from accounts_table where country_cd = 18;

COUNT(*)
----------
16169

SQL> select count(*) from accounts_table where country_cd = 11 and country_city_cd = 24;

COUNT(*)
----------
255

SQL> select count(*) from accounts_table where country_cd = 18 and country_city_cd = 32;

COUNT(*)
----------
273

SQL> select count(*) from accounts_table where account_type = 'SAV';

COUNT(*)
----------
159963

SQL> select count(*) from accounts_table where account_type = 'MXD';

COUNT(*)
----------
80236

SQL>
SQL>
SQL> exec dbms_stats.gather_table_stats('','ACCOUNTS_TABLE',estimate_percent=>100);

PL/SQL procedure successfully completed.

SQL>


Now I test a query with statistics present :

SQL> explain plan for
2 select country_cd, country_city_cd, account_id
3 from accounts_table
4 where country_cd = 11 and country_city_cd = 24
5 /

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 512713334

-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 266 | 2926 | 270 (0)| 00:00:04 |
| 1 | TABLE ACCESS BY INDEX ROWID| ACCOUNTS_TABLE | 266 | 2926 | 270 (0)| 00:00:04 |
|* 2 | INDEX SKIP SCAN | ACCOUNTS_TABLE_N1 | 266 | | 6 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("COUNTRY_CD"=11 AND "COUNTRY_CITY_CD"=24)
filter("COUNTRY_CITY_CD"=24 AND "COUNTRY_CD"=11)

15 rows selected.

SQL>

Why did Oracle choose the Index Skip Scan ?
Firstly, it expected to have to retrieve 266 rows from the table. These are actually not clustered together for the given COUNTRY_CD and COUNTRY_CITY_CD values. Therefore, the cost of the Table Access by Rowid is 264 (270 minus 6). However, a Table Access by Rowid is possible only if Oracle gets the Rowids from a suitable index. Here is where the existing Index is usable. The cost of getting those 264 Rowids is expected to be only 6 --- inspite of having to do a Skip Scan !
How did Oracle estimate a cost of 6 for the Index Skip Scan ? It factored in the knowledge that there are 4 distinct values for ACCOUNT_TYPE (the leading column of the Index). Therefore, it adjusts for having to "skip" over the Index 4 times. Oracle doesn't need to know the ACCOUNT_TYPEs. It has to "skip" to the first ACCOUNT_TYPE ('CUR') and then read the index, ignoring the leading column to find the desired COUNTRY_CD+COUNTRY_CITY_CD combination. Once it reaches the end of the entries for this combination (i.e. it finds that it is COUNTRY_CITY_CD 25 within COUNTRY_CD 11), it merely has to skip to the next ACCOUNT_TYPE ('FDR') in the Index (remember that it can lookup the values from Branches quicker !). Within this ACCOUNT_TYPE, it again has to search for the desired COUNTRY_CD+COUNTRY_CITY_CD combination. Thus, it will be searching for COUNTRY_CD=11,COUNTRY_CITY_CD=24 4 times in the Index.

The Event 10053 trace shows that the index has a poor Clustering Factor (396,788 for 401,242 rows !) :

Table Stats::
Table: ACCOUNTS_TABLE Alias: ACCOUNTS_TABLE
#Rows: 401242 #Blks: 3350 AvgRowLen: 53.00
Index Stats::
Index: ACCOUNTS_TABLE_N1 Col#: 4 2 3
LVLS: 2 #LB: 1177 #DK: 6000 LB/K: 1.00 DB/K: 66.00 CLUF: 396788.00
Index: ACCOUNTS_TABLE_PK Col#: 1
LVLS: 2 #LB: 837 #DK: 401242 LB/K: 1.00 DB/K: 1.00 CLUF: 3288.00

Column statistics on the two query predicates are correct :

Column (#2): COUNTRY_CD(NUMBER)
AvgLen: 3.00 NDV: 25 Nulls: 0 Density: 0.01973 Min: 1 Max: 25
Histogram: Freq #Bkts: 25 UncompBkts: 401242 EndPtVals: 25
Column (#3): COUNTRY_CITY_CD(NUMBER)
AvgLen: 3.00 NDV: 60 Nulls: 0 Density: 0.0081123 Min: 1 Max: 60
Histogram: Freq #Bkts: 60 UncompBkts: 401242 EndPtVals: 60
Table: ACCOUNTS_TABLE Alias: ACCOUNTS_TABLE
Card: Original: 401242 Rounded: 266 Computed: 266.16 Non Adjusted: 266.1

A FullTableScan has a cost of 909 :

Access Path: TableScan
Cost: 917.03 Resp: 917.03 Degree: 0
Cost_io: 909.00 Cost_cpu: 120492154
Resp_io: 909.00 Resp_cpu: 120492154

An IndexSkipScan has a cost of 6 :

Access Path: index (skip-scan)
SS sel: 6.6333e-04 ANDV (#skips): 4
SS io: 4.00 vs. table scan io: 909.00
Skip Scan chosen
Access Path: index (SkipScan)
Index: ACCOUNTS_TABLE_N1
resc_io: 270.00 resc_cpu: 2026919
ix_sel: 6.6333e-04 ix_sel_with_filters: 6.6333e-04
Cost: 270.14 Resp: 270.14 Degree: 1

This accounts for "Skips" in the Index (represented by "ANDV"). Of the 1,177 Leaf Blocks in the Index, Oracle expects to have to read only 6.

Best:: AccessPath: IndexRange Index: ACCOUNTS_TABLE_N1
Cost: 270.14 Degree: 1 Resp: 270.14 Card: 266.16 Bytes: 0


(When I trace an actual run of the query, I find that Oracle has to read 17 index blocks (root+branch+leaf) and 253 table blocks for a total number of 270 block gets to fetch 255 rows so Index Skip Scan costing isn't very accurate yet !). This test was on 10.2.0.4 with ASSM.


If I modify the column statistics to set NDV (Number of Distinct Values) to 659 for ACCOUNT_TYPE, Oracle switches to doing a FullTableScan with a cost of 917 ! If Oracle were to do a Skip Scan using the Index, it estimates a cost of 925 :

Access Path: index (skip-scan)
SS sel: 6.6415e-04 ANDV (#skips): 659
SS io: 659.00 vs. table scan io: 909.00
Skip Scan chosen
Access Path: index (SkipScan)
Index: ACCOUNTS_TABLE_N1
resc_io: 925.00 resc_cpu: 6691462
ix_sel: 6.6415e-04 ix_sel_with_filters: 6.6415e-04
Cost: 925.45 Resp: 925.45 Degree: 1



Therefore, an Index Skip Scan is usable if the number of distinct values for the leading column(s) ise sufficiently low that the number of "skips" in the Index does not become significant.

.
.
.

12 September, 2010

Deadlocks : 2 -- Deadlock on INSERT

Two days ago, I demonstrated a deadlock on UPDATE between two sessions.

Deadlocks are "generally" considered as Application Design issues. However, it is possible that deadlocks are not noticed when performance is fast enough and/or when the user/application "processing/wait" time between acquiring the first set of resources (row lock(s)) and the second set suddenly varies.


In this demo, I show how a deadlock can occur on INSERTs. (Yes, it may not be obvious that INSERTs, too, can cause deadlocks). Here we have a situation with two tables, each table having a primary key. We also have two sessions inserting the same (duplicated) values into the two tables. This is quite likely an Application Design issue -- the two sessions should not have been generating the same key values !

I first create the demo tables :

SQL> create table table_1 (col_1 number not null primary key, col_2 varchar2(5));

Table created.

SQL> insert into table_1 values (1,'ABC');

1 row created.

SQL> commit;

Commit complete.

SQL>
SQL> create table table_2 (col_1 number not null primary key, col_2 varchar2(5));

Table created.

SQL> insert into table_2 values (1,'ABC');

1 row created.

SQL> commit;

Commit complete.

SQL>


So we have two tables, each with a Primary Key defined. I have preloaded one row just to demonstrate that the two tables may have already existant rows before the "day of the deadlock".

Now, this is Session 1 :

23:14:22 SQL> select 'My Session : ' || s.sid, s.serial#, p.pid, p.spid
23:14:22 2 from v$session s, v$process p
23:14:22 3 where s.paddr=p.addr
23:14:22 4 and s.sid = (select distinct sid from v$mystat);

'MYSESSION:'||S.SID SERIAL# PID SPID
----------------------------------------------------- ---------- ---------- ------------
My Session : 151 18 15 4353

23:14:22 SQL>
23:14:22 SQL> pause Press ENTER to proceed .....
Press ENTER to proceed .....

23:14:34 SQL>
23:14:34 SQL>
23:14:34 SQL> -- my first set of row/rows that are locked
23:14:34 SQL> -- here it is an INSERT -- this will cause another session to wait on the Primary Key !
23:14:34 SQL> insert into table_1 values (2,'S1');

1 row created.

23:14:34 SQL> pause Press ENTER to proceed .....
Press ENTER to proceed .....

23:14:38 SQL>
23:14:38 SQL> -- subsequent operations
23:14:38 SQL> select count(*) from my_obj;

COUNT(*)
----------
50694

23:14:38 SQL>
23:14:38 SQL> select count(*) from table_1;

COUNT(*)
----------
2

23:14:38 SQL> select col_1, col_2 from table_1 where col_1=2;

COL_1 COL_2
---------- -----
2 S1

23:14:38 SQL>
23:14:38 SQL> -- my second set of row/rows that are locked
23:14:38 SQL> insert into table_2 values (2,'S1');
insert into table_2 values (2,'S1')
*
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource


23:14:42 SQL>
23:14:42 SQL> rollback;

Rollback complete.

23:14:47 SQL>


And this is Session 2 :

23:14:31 SQL> select 'My Session : ' || s.sid, s.serial#, p.pid, p.spid
23:14:31 2 from v$session s, v$process p
23:14:31 3 where s.paddr=p.addr
23:14:31 4 and s.sid = (select distinct sid from v$mystat);

'MYSESSION:'||S.SID SERIAL# PID SPID
----------------------------------------------------- ---------- ---------- ------------
My Session : 158 21 19 4358

23:14:31 SQL>
23:14:31 SQL> pause Press ENTER to proceed .....
Press ENTER to proceed .....

23:14:35 SQL>
23:14:35 SQL>
23:14:35 SQL> -- my first set of row/rows that are locked
23:14:35 SQL> -- here it is an INSERT -- this will cause another session to wait on the Primary Key !
23:14:35 SQL> insert into table_2 values (2,'S2');

1 row created.

23:14:35 SQL> pause Press ENTER to proceed .....
Press ENTER to proceed .....

23:14:39 SQL>
23:14:39 SQL> -- subsequent operations
23:14:39 SQL> select count(*) from my_obj;

COUNT(*)
----------
50694

23:14:40 SQL>
23:14:40 SQL> select count(*) from table_1;

COUNT(*)
----------
1

23:14:40 SQL> select col_1, col_2 from table_2 where col_1=2;

COL_1 COL_2
---------- -----
2 S2

23:14:40 SQL>
23:14:40 SQL> -- my second set of row/rows that are locked
23:14:40 SQL> insert into table_1 values (2,'S2');

1 row created.

23:14:47 SQL>
23:14:47 SQL> commit;

Commit complete.

23:14:50 SQL>


Thus, we can see that there is a deadlock between the two sessions, both attempting to insert the same key values into the two tables (in my demo above, the key value is "2", however, the value could have been, say "2" for table_1 and "200" for table_2).

As with the previous demo, Oracle detects the deadlock very quickly -- in 2 seconds. Oracle actually creates a deadlock graph and verifies it when it finds sessions waiting on locks.

Since Session 1 did a ROLLBACK in this demo, the values in the two tables reflect only Session 2's INSERTs. (i.e. Session 1's first INSERT got rolled back).


SQL> select col_1, col_2 from table_1;

COL_1 COL_2
---------- -----
1 ABC
2 S2

SQL> select col_1, col_2 from table_2;

COL_1 COL_2
---------- -----
1 ABC
2 S2

SQL>



Thus, it is possible to have Deadlocks on INSERTs if the Schema and Application design are not sturdy.

.
.
.

10 September, 2010

Deadlocks

UPDATE : See Deadlock on INSERT in the second post.

A deadlock occurs when two processes contend for the same resources (row locks) when one process holds a lock on one set of resources (call it set "A") and waits for a lock on another set of resources (call it set "B") while *at the same time* the other process holds a lock on set "B" and is waiting to lock set "A".
The set of resources could each be a single or or multiple rows but Oracle may detect the deadlock when attempting to lock a row in a set of rows.
When it detects the deadlock, it sends an error to be returned to the SQL and this causes a statement level rollback and not a transaction rollback. Also, the session isn't killed.
(This was contrary to my earlier understanding that the session, on the error, is killed !).

Here's a demo :
This is session_1 :

23:40:36 SQL> select 'My Session : ' || s.sid, s.serial#, p.pid, p.spid
23:40:36 2 from v$session s, v$process p
23:40:36 3 where s.paddr=p.addr
23:40:36 4 and s.sid = (select distinct sid from v$mystat);

'MYSESSION:'||S.SID SERIAL# PID SPID
----------------------------------------------------- ---------- ---------- ------------
My Session : 143 58 15 4170

23:40:36 SQL>
23:40:36 SQL> pause Press ENTER to proceed .....
Press ENTER to proceed .....

23:40:39 SQL>
23:40:39 SQL> select count(*) from table_1;

COUNT(*)
----------
4

23:40:39 SQL> select col_1, col_2 from table_1 where col_1='Z';

C COL_2
- ----------
Z 26

23:40:39 SQL>
23:40:39 SQL> -- my first set of row/rows that are locked
23:40:39 SQL> update table_1 set col_1 = 'A' where col_1='Z';

1 row updated.

23:40:39 SQL> pause Press ENTER to proceed .....
Press ENTER to proceed .....

23:40:44 SQL>
23:40:44 SQL> -- subsequent operations
23:40:44 SQL> select count(*) from my_obj;

COUNT(*)
----------
50694

23:40:44 SQL>
23:40:44 SQL> select count(*) from table_1;

COUNT(*)
----------
4

23:40:44 SQL> select col_1, col_2 from table_1 where col_1='X';

C COL_2
- ----------
X 24

23:40:44 SQL>
23:40:44 SQL> -- my first set of row/rows that are locked
23:40:44 SQL> update table_1 set col_1 = 'A' where col_1='X';
update table_1 set col_1 = 'A' where col_1='X'
*
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource


23:40:50 SQL>
23:40:50 SQL>
23:41:13 SQL> commit;

Commit complete.


And this is session_2 :

23:40:40 SQL>
23:40:40 SQL> select 'My Session : ' || s.sid, s.serial#, p.pid, p.spid
23:40:40 2 from v$session s, v$process p
23:40:40 3 where s.paddr=p.addr
23:40:40 4 and s.sid = (select distinct sid from v$mystat);

'MYSESSION:'||S.SID SERIAL# PID SPID
----------------------------------------------------- ---------- ---------- ------------
My Session : 145 58 20 4172

23:40:40 SQL>
23:40:40 SQL> pause Press ENTER to proceed .....
Press ENTER to proceed .....

23:40:42 SQL>
23:40:42 SQL> select count(*) from table_1;

COUNT(*)
----------
4

23:40:42 SQL> select col_1, col_2 from table_1 where col_1='X';

C COL_2
- ----------
X 24

23:40:42 SQL>
23:40:42 SQL> -- my first set of row/rows that are locked
23:40:42 SQL> update table_1 set col_1 = 'B' where col_1='X';

1 row updated.

23:40:42 SQL> pause Press ENTER to proceed .....
Press ENTER to proceed .....

23:40:47 SQL>
23:40:47 SQL> -- subsequent operations
23:40:47 SQL> select count(*) from
23:40:47 2 ( select /*+ NO_MERGE */ * from my_obj_4 union all select * from my_obj_2 );

COUNT(*)
----------
101391

23:40:47 SQL>
23:40:47 SQL> select count(*) from table_1;

COUNT(*)
----------
4

23:40:47 SQL> select col_1, col_2 from table_1 where col_1='X';

no rows selected

23:40:47 SQL>
23:40:47 SQL> -- my second set of row/rows that are locked
23:40:47 SQL> update table_1 set col_1 = 'B' where col_1='Z';

0 rows updated.

23:41:27 SQL>
23:41:27 SQL> commit;

Commit complete.



The data at the beginning was :

C COL_2
- ----------
A 1
B 2
X 24
Z 26


while that at the end was :

C COL_2
- ----------
A 1
B 2
B 24
A 26


Thus the first UPDATE by each session did go through ! (session_2's second UPDATE found 0 rows because session_1 had modified col_1='Z' to col_1='A').

NOTE : The queries on "my_obj", "my_obj_4", "my_obj_2" are to simulate other SQL operations done by each session betwen the time of locking one set of resources (a row in table_1) and requesting another set of resources (another row in table_1). These "operations" may be other queries run by the sessions OR they may be other DML run by the sessions OR they may just be the sessions waiting for the user or application server to issue the next command (in which case, the "performance" of the user or application before it issues the next command may need to be reviewed).
Thus, a deadlock may occur because of performance issues in operations not even related to the row locks. Under normal circumstances, it may so happen, that the two sessions don't attempt these resources at nearly the same time *OR* even that the "other operations" are fast enough such that the two sessions don't end up contending for the resources -- e.g. if session_1's operations on both sets of resources (rows in table_1) complete even before session_2 has taken a lock on it's first set of resources.


The deadlock trace for the first session (SID 143) shows how the two sessions (SID 143 and 145, PIDs 15 and 20 respectively) are contending on the same object (OBJN=57698 being "TABLE_1")) but each holding one row and waiting for another row (both being in the same block 865780 as this is a small table). {That the rows are in the same block is *not* the cause of the deadlock as Oracle does row-level locking using ITL entries.


Deadlock graph:
---------Blocker(s)-------- ---------Waiter(s)---------
Resource Name process session holds waits process session holds waits
TX-00130000-000001b8 15 143 X 20 145 X
TX-00110006-0000013c 20 145 X 15 143 X
session 143: DID 0001-000F-00000029 session 145: DID 0001-0014-00000019
session 145: DID 0001-0014-00000019 session 143: DID 0001-000F-00000029
Rows waited on:
Session 145: obj - rowid = 0000E162 - AAAOFmAAEAADTX0AAD
(dictionary objn - 57698, file - 4, block - 865780, slot - 3)
Session 143: obj - rowid = 0000E162 - AAAOFmAAEAADTX0AAC
(dictionary objn - 57698, file - 4, block - 865780, slot - 2)
Information on the OTHER waiting sessions:
Session 145:
pid=20 serial=58 audsid=584830 user: 64/HEMANT
O/S info: user: ora10204, term: pts/2, ospid: 4167, machine: linux64
program: sqlplus@linux64 (TNS V1-V3)
application name: SQL*Plus, hash value=3669949024
Current SQL Statement:
update table_1 set col_1 = 'B' where col_1='Z'
End of information on OTHER waiting sessions.

.
.
.