29 July, 2011

More on COUNT()s

As a followup to my previous post "Running a COUNT(column) versus COUNT(*)", here's a demo of :
a. COUNT(column)
b. COUNT(constant)
c. COUNT(*)
d. COUNT(*) done via an Index

SQL> -- Create a test table with 3 columns
SQL> -- The first column happens to have a NULL value for every 10th row
SQL> -- Column_2 contains numbers, all greater than 0
SQL> -- so a query for "WHERE Column_2 > 0" retrieves ALL the rows
SQL> -- Column_3 contains a string
SQL>
SQL> create table COUNT_ROWS_TBL (Column_1 number, Column_2 number, Column_3 varchar2(100)) ;

Table created.

SQL> insert into COUNT_ROWS_TBL
2 select decode(mod(rownum,10),0,NULL,rownum) as Column_1,
3 rownum as Column_2,
4 'Column_3_Values' || dbms_random.string('X',80) as Column_3
5 from dual
6 connect by level < 1001
7 /

1000 rows created.

SQL> -- the table has 1000 rows, as evidenced by the output above
SQL> -- verify this :
SQL> select count(*) from COUNT_ROWS_TBL;

COUNT(*)
----------
1000

SQL>
SQL> -- First example : COUNT(column) doesn't count all the rows
SQL> -- if the column has a NULL in one or more rows
SQL>
SQL> -- note how count(Column_1) excludes rows with a NULL
SQL> select count(Column_1) from COUNT_ROWS_TBL ;

COUNT(COLUMN_1)
---------------
900

SQL> select count(Column_1) from COUNT_ROWS_TBL where Column_2 > 0;

COUNT(COLUMN_1)
---------------
900

SQL> -- this is the count of rows that the query for count(Column_1) excludes
SQL> select count(*) from COUNT_ROWS_TBL where Column_1 is NULL;

COUNT(*)
----------
100

SQL> select count(Column_2) from COUNT_ROWS_TBL;

COUNT(COLUMN_2)
---------------
1000

SQL> select count(Column_2) from COUNT_ROWS_TBL where Column_2 > 0;

COUNT(COLUMN_2)
---------------
1000

SQL> select count(Column_3) from COUNT_ROWS_TBL;

COUNT(COLUMN_3)
---------------
1000

SQL> select count(Column_3) from COUNT_ROWS_TBL where Column_2 > 0;

COUNT(COLUMN_3)
---------------
1000

SQL>
SQL>
SQL> -- Second example : COUNT(1) does count all the rows
SQL> -- because 1 is a constant, not-NULL, value
SQL> select count(1) from COUNT_ROWS_TBL ;

COUNT(1)
----------
1000

SQL> select count(1) from COUNT_ROWS_TBL where Column_2 > 0;

COUNT(1)
----------
1000

SQL> select count(1) from COUNT_ROWS_TBL;

COUNT(1)
----------
1000

SQL> select count(1) from COUNT_ROWS_TBL where Column_2 > 0;

COUNT(1)
----------
1000

SQL> select count(1) from COUNT_ROWS_TBL;

COUNT(1)
----------
1000

SQL> select count(1) from COUNT_ROWS_TBL where Column_2 > 0;

COUNT(1)
----------
1000

SQL>
SQL>
SQL> -- Third example : COUNT(*) might use an Index
SQL> -- If the index is on a column that is guaranteed to not contain NULLs
SQL>
SQL> set autotrace on
SQL> select count(*) from COUNT_ROWS_TBL;

COUNT(*)
----------
1000


Execution Plan
----------------------------------------------------------
Plan hash value: 3695206450

-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| COUNT_ROWS_TBL | 1000 | 7 (0)| 00:00:01 |
-----------------------------------------------------------------------------

Note
-----
- dynamic sampling used for this statement (level=2)


Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
23 consistent gets
0 physical reads
0 redo size
422 bytes sent via SQL*Net to client
419 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

SQL> exec dbms_stats.gather_table_stats('','COUNT_ROWS_TBL',estimate_percent=>100,cascade=>TRUE);

PL/SQL procedure successfully completed.

SQL> select count(*) from COUNT_ROWS_TBL;

COUNT(*)
----------
1000


Execution Plan
----------------------------------------------------------
Plan hash value: 3695206450

-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| COUNT_ROWS_TBL | 1000 | 7 (0)| 00:00:01 |
-----------------------------------------------------------------------------


Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
23 consistent gets
0 physical reads
0 redo size
422 bytes sent via SQL*Net to client
419 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

SQL> alter table COUNT_ROWS_TBL modify (Column_2 NOT NULL);

Table altered.

SQL> create index COUNT_ROWS_TBL_NDX on COUNT_ROWS_TBL(Column_2);

Index created.

SQL> -- NOW ! the Index will be used !
SQL> select count(*) from COUNT_ROWS_TBL;

COUNT(*)
----------
1000


Execution Plan
----------------------------------------------------------
Plan hash value: 1131752359

------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | INDEX FAST FULL SCAN| COUNT_ROWS_TBL_NDX | 1000 | 3 (0)| 00:00:01 |
------------------------------------------------------------------------------------


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
7 consistent gets
3 physical reads
0 redo size
422 bytes sent via SQL*Net to client
419 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

SQL> -- note how Oracle chose to use the Index now
SQL> -- Remember : It will use the Index IF :
SQL> --- a. The Indexed Column
SQL> (or at least one column in a composite index)
2 is a NOT NULL column
3 --- b. The calculated "Cost" (determined by the Query optimizer)
4 --- is lower for an Index [Fast] Full Scan than for a Full Table Scan
5
SQL>



So, a COUNT(column) and a COUNT(constant) and a COUNT(*) do not necessarily mean the same thing.


UPDATE :
See the subsequent post
.
.
.

No comments: