26 August, 2009

Counting the Rows in a Table

Tanel Poder points out how COUNT(*) and COUNT(column) are not necessarily the same. A COUNT on a Column ignores NULL values in the column !

Here, I conduct a few more tests.



SQL> create table test_count (col_1 varchar2(5), col_2 varchar2(5), col_3 varchar2(5));

Table created.

SQL> insert into test_count values ('a','first','1');

1 row created.

SQL> insert into test_count values ('b',null,'2');

1 row created.

SQL> commit;

Commit complete.

SQL> select count(*) from test_count;

COUNT(*)
----------
2

SQL> select count(col_1) from test_count;

COUNT(COL_1)
------------
2

SQL> select count(col_2) from test_count;

COUNT(COL_2)
------------
1

SQL>

Although the table has 2 rows, since one of the rows is a NULL in COL_2, a count on COL_2 misses that row.


SQL> insert into test_count values ('',NULL,'');

1 row created.

SQL> commit;

Commit complete.

SQL> select count(*) from test_count;

COUNT(*)
----------
3

SQL> select count(col_1) from test_count;

COUNT(COL_1)
------------
2

SQL> select count(col_2) from test_count;

COUNT(COL_2)
------------
1

SQL>

I inserted a row with all NULLs. Oracle allows me to insert an ALL NULL row. Now, a count on any column returns incorrect results. Only a COUNT(*) is correct.


Now, I proceed to another test. Here I create a larger table.

SQL> select count(*) from dba_objects where object_id is null;

COUNT(*)
----------
1

SQL> create table another_test_count as select owner, object_id, object_type from dba_objects;

Table created.

SQL> select count(*) from another_test_count;

COUNT(*)
----------
50628

SQL> select count(owner) from another_test_count;

COUNT(OWNER)
------------
50628

SQL> select count(object_id) from another_test_count;

COUNT(OBJECT_ID)
----------------
50627

SQL>

As I expected, a COUNT on OBJECT_ID is one row short as one of the OBJECT_IDs is NULL.

I now create an Index on the table.

SQL> create index another_test_count_ndx on another_test_count(object_id,owner);

Index created.

SQL> exec dbms_stats.gather_table_stats('','ANOTHER_TEST_COUNT',estimate_percent=>100,cascade=>TRUE);

PL/SQL procedure successfully completed.

SQL>


Now, I attempt to use the Index to count the number of rows. The index is smaller than the table so an INDEX FAST FULL SCAN should be preferred over a FULL TABLE SCAN.


SQL> select count(*) from another_test_count;

COUNT(*)
----------
50628


Execution Plan
----------------------------------------------------------
Plan hash value: 3536081126

---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 52 (2)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| ANOTHER_TEST_COUNT | 50628 | 52 (2)| 00:00:01 |
---------------------------------------------------------------------------------


Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
173 consistent gets
0 physical reads
0 redo size
517 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

SQL> alter table another_test_count modify (owner not null);

Table altered.

SQL> select count(*) from another_test_count;

COUNT(*)
----------
50628


Execution Plan
----------------------------------------------------------
Plan hash value: 227569207

----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 43 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | INDEX FAST FULL SCAN| ANOTHER_TEST_COUNT_NDX | 50628 | 43 (0)| 00:00:01 |
----------------------------------------------------------------------------------------


Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
158 consistent gets
0 physical reads
0 redo size
517 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

SQL>



In my first pass at a COUNT(*), Oracle does not use the Index even though it is smaller than the table. Why not ? Because Oracle cannot be sure that the Index captures every ROWID from the table. This is the case where, for a concatenated index, every column has a NULL for a particular row, resulting in that row being "excluded" from the Index.

As soon as I change the OWNER column (which isn't even the leading column of the Index) to a NOT NULL, the Optimizer is assured that an Index on this column *will* capture every ROWID. At the next COUNT(*), the Optimizer prefers to do an INDEX FAST FULL SCAN !

(Note : I had run the "SELECT COUNT(*) FROM ANOTHER_TEST_COUNT" twice before altering OWNER to NOT NULL and twice again after altering it to NOT NULL. I have not reported the autotrace results of each first run as it includes Parse overheads -- the recursive calls inflating the 'consistent gets' count).

.
.
.

3 comments:

Anand said...

Hi Sir,

Nice Post.Especially the last section where "Optimizer is assured that an Index on this column *will* capture every ROWID as its NOT NULL" is nice example.


Regards,
Anand

Uwe Hesse said...

Hemant, thank you for pointing that out. In addition, I'd like to mention that the myth "Count(column) is better then count(*)"
results from the observation of doing a count(column) on an indexed nullable column. In this case, count(column) will lead to an index fast full scan, really faster than a full table scan probably. But potentially with the wrong result! If there would have been an index on a not null column present, count(*) would use it (as you pointed out).

Kind regards
Uwe

Hemant K Chitale said...

Uwe,

Yes, the myth has probably led a number of people to use count(column) for the wrong columns all too frequently !