Hemant's Oracle DBA Blog: An Index that is a "subset" of a pre-existing Index

It is a common [mis]understanding that creating a new Index that is a subset of an existing Index is an unnecessary overhead. The argument goes "if you have an Index on (A,B,C) this Index can be used to satisfy queries where predicate 'A' is specified, so creating an additional Index on (A) alone is an unnecessary overhead".
It is true that the the Index could be used, but the Cost Based Optimizer also computes a "COST" for using the Index. A composite Index is larger (in terms of Leaf Blocks to navigate) then a single-column Index. Similarly, at times, the composite Index becomes so large that many queries are executed as FullTableScans !

This question arose in a forums discussion today on how a CREATE INDEX an be speeded up. The CREATion of a Index on (A) can actually read the existing Index on (A,B,C) without having to do a FullTableScan. Furthermore, the new Index on (A) can satisfy queries on the single predicate better than the existing Index.

Here I demonstrate both cases :
1. A CREATE INDEX reading an existing Index and avoiding a FullTableScan
2. An Index that is a "subset" of an existing Index being useful

I first setup the "demo" table

SQL> drop table objects_list purge;

Table dropped.

SQL> -- assume a "largeish" table -- the Remarks column makes the row length greater
SQL> create table objects_list as
2  select owner, object_name, subobject_name,  object_type, created, status,
3  dbms_random.string('X',200) as Remarks
4  from dba_objects
5  /

Table created.

SQL>
SQL> -- in the absence of object_id, these columns constitute the Unique Index
SQL> create unique index objects_list_u1 on objects_list (owner, object_name, subobject_name, object_type);

Index created.

SQL>
SQL> exec dbms_stats.gather_table_stats('','OBJECTS_LIST',estimate_percent=>100,cascade=>TRUE);

PL/SQL procedure successfully completed.

SQL>
SQL> select blocks, num_rows from user_tables where table_name = 'OBJECTS_LIST';

  BLOCKS   NUM_ROWS
---------- ----------
    2867      76955

SQL>

Now, I verify if I can use the composite Index to execute a query on the leading column alone :

SQL> --- we should be able to use the Unique Index when querying for the leading column alone
SQL> explain plan for
2  select owner, count(*) from objects_list where owner like 'O%' group by owner order by owner;

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 544682346

----------------------------------------------------------------------------------------
| Id  | Operation            | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                 |     3 |    21 |    39   (0)| 00:00:01 |
|   1 |  SORT GROUP BY NOSORT|                 |     3 |    21 |    39   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN   | OBJECTS_LIST_U1 |  5141 | 35987 |    39   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

 2 - access("OWNER" LIKE 'O%')
     filter("OWNER" LIKE 'O%')

15 rows selected.

SQL> select owner, count(*) from objects_list where owner like 'O%' group by owner order by owner;

OWNER                            COUNT(*)
------------------------------ ----------
OBE                                    75
OE                                     84
OE1                                    55
OLAPSYS                               719
ORACLE_OCM                              8
ORDDATA                               248
ORDPLUGINS                             10
ORDSYS                               2532
OUTLN                                   9
OWBSYS                                  2
OWBSYS_AUDIT                           12

11 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 544682346

----------------------------------------------------------------------------------------
| Id  | Operation            | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                 |     3 |    21 |    39   (0)| 00:00:01 |
|   1 |  SORT GROUP BY NOSORT|                 |     3 |    21 |    39   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN   | OBJECTS_LIST_U1 |  5141 | 35987 |    39   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

 2 - access("OWNER" LIKE 'O%')
     filter("OWNER" LIKE 'O%')


Statistics
----------------------------------------------------------
        0  recursive calls
        0  db block gets
       30  consistent gets
        0  physical reads
        0  redo size
      676  bytes sent via SQL*Net to client
      419  bytes received via SQL*Net from client
        2  SQL*Net roundtrips to/from client
        0  sorts (memory)
        0  sorts (disk)
       11  rows processed

SQL>

Yes. The composite Index is used even for the single column predicate "OWNER". The query does 30 block gets. (I've actually executed the query twice to eliminate recursive calls at the second execution).

What if I created an Index on "OWNER" alone ? It would be a "subset of the existing Index.

SQL> create index objects_list_n1 on objects_list (owner);

Index created.

SQL>
SQL> --- now Oracle can use the leaner index for the same query
SQL> --- check the "Cost"
SQL> explain plan for
2  select owner, count(*) from objects_list where owner like 'O%' group by owner order by owner;

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1721948621

----------------------------------------------------------------------------------------
| Id  | Operation            | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                 |     3 |    21 |    14   (0)| 00:00:01 |
|   1 |  SORT GROUP BY NOSORT|                 |     3 |    21 |    14   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN   | OBJECTS_LIST_N1 |  5141 | 35987 |    14   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

 2 - access("OWNER" LIKE 'O%')
     filter("OWNER" LIKE 'O%')

15 rows selected.

SQL>
SQL> select owner, count(*) from objects_list where owner like 'O%' group by owner order by owner;

OWNER                            COUNT(*)
------------------------------ ----------
OBE                                    75
OE                                     84
OE1                                    55
OLAPSYS                               719
ORACLE_OCM                              8
ORDDATA                               248
ORDPLUGINS                             10
ORDSYS                               2532
OUTLN                                   9
OWBSYS                                  2
OWBSYS_AUDIT                           12

11 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1721948621

----------------------------------------------------------------------------------------
| Id  | Operation            | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                 |     3 |    21 |    14   (0)| 00:00:01 |
|   1 |  SORT GROUP BY NOSORT|                 |     3 |    21 |    14   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN   | OBJECTS_LIST_N1 |  5141 | 35987 |    14   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

 2 - access("OWNER" LIKE 'O%')
     filter("OWNER" LIKE 'O%')


Statistics
----------------------------------------------------------
        0  recursive calls
        0  db block gets
       12  consistent gets
        0  physical reads
        0  redo size
      676  bytes sent via SQL*Net to client
      419  bytes received via SQL*Net from client
        2  SQL*Net roundtrips to/from client
        0  sorts (memory)
        0  sorts (disk)
       11  rows processed

SQL>

The new Index is used ! It has a lower "COST" (14 for the new Index versus 39 for the composite Index) and doing fewer block gets (12 versus 30).

So, this new Index is better than the existing one for such queries.

How was the Index created ? I actually obtained a trace of the CREATE INDEX run.

LOCK TABLE "OBJECTS_LIST" IN SHARE MODE  NOWAIT

create index objects_list_n1 on objects_list (owner)


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          2          1           0
Execute      1      0.27       0.29          1      77000        714           0
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.27       0.29          1      77002        715           0

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 187

Rows     Row Source Operation
-------  ---------------------------------------------------
    1  INDEX BUILD NON UNIQUE OBJECTS_LIST_N1 (cr=77119 pr=1 pw=184 time=0 us)(object id 0)
76955   SORT CREATE INDEX (cr=76965 pr=1 pw=0 time=857870 us)
76955    INDEX FAST FULL SCAN OBJECTS_LIST_U1 (cr=76965 pr=1 pw=0 time=1649397 us)(object id 88291)
Thus the CREATE INDEX OBJECTS_LIST_N1 actually did an Index Fast Full Scan of the existing Index OBJECTS_LIST_U1 !

.
.
.

4 comments:

Anonymous said...: Hi Hemant. Interesting article. This means that CBO decides to make an index fast full scan on already created table instead querying the original table, because the first one will be faster

Thanks for sharing!; 22 September, 2011
Facebook Apps said...: This comment has been removed by a blog administrator.; 23 November, 2011
Brooke Kay said...: Marvelous work.Just wanted to drop a comment and say I am new to your blog and really like what I am reading.; 04 July, 2012
robertson said...: Lighter index is always used in comparison to heavier index(like composite index with more columns). Example and Concept is 100% correct.

Not an suggestible option to have multiple indexes ( Composite/single indexes on same column) unless there are daily frequent queries having where clause on single column as well other queries where clause having clause on composite columns (consituting the single column too) with pretty tight response times in milliseconds.

Lets say, there are 22 crore records joining another table of 22 crores ( real time example)... We have 30 indexes of which 18 are lighter indexes (subset index of composite index)... Example: Subset means.. index on status field, composite means index on status and merchant_id and txn_date field.
A Settlement batch job inserts 3+ lakh records sequentially one after the other at end of day along with multiple checks with other referencing tables...

"Will this not lead to growth of all the 30 indexes on the table, hence slowening the INSERTS" ( 5 millisecond per row insert leads to 200+ millisecond per row insert, as all Indexes needs to grow ).

In this case, we prefer to have required composite indexes only ( 12 in our case instead of 30 ) for speedening up the INSERTS. We preferred to drop subset indexes, which speedened our INSERTS.. Approx...Job run time of 4.5 hrs reduces drastically to 2 hrs 45 mins.. ! My 2 cents.. ! Robertson Bhadrachalam; 09 September, 2019

Hemant's Oracle DBA Blog

21 September, 2011

An Index that is a "subset" of a pre-existing Index

4 comments:

Aggregated by orafaq.com

OCP 12c