Hemant's Oracle DBA Blog: Counting the Rows in a Table

26 August, 2009

Counting the Rows in a Table

Tanel Poder points out how COUNT(*) and COUNT(column) are not necessarily the same. A COUNT on a Column ignores NULL values in the column !

Here, I conduct a few more tests.


SQL> create table test_count (col_1 varchar2(5), col_2 varchar2(5), col_3 varchar2(5));

Table created.

SQL> insert into test_count values ('a','first','1');

1 row created.

SQL> insert into test_count values ('b',null,'2');

1 row created.

SQL> commit;

Commit complete.

SQL> select count(*) from test_count;

COUNT(*)
----------
   2

SQL> select count(col_1) from test_count;

COUNT(COL_1)
------------
     2

SQL> select count(col_2) from test_count;

COUNT(COL_2)
------------
     1

SQL>

Although the table has 2 rows, since one of the rows is a NULL in COL_2, a count on COL_2 misses that row.


SQL> insert into test_count values ('',NULL,'');

1 row created.

SQL> commit;

Commit complete.

SQL> select count(*) from test_count;

COUNT(*)
----------
    3

SQL> select count(col_1) from test_count;

COUNT(COL_1)
------------
      2

SQL> select count(col_2) from test_count;

COUNT(COL_2)
------------
      1

SQL>

I inserted a row with all NULLs. Oracle allows me to insert an ALL NULL row. Now, a count on any column returns incorrect results. Only a COUNT(*) is correct.

Now, I proceed to another test. Here I create a larger table.


SQL> select count(*) from dba_objects where object_id is null;

COUNT(*)
----------
     1

SQL> create table another_test_count as select owner, object_id, object_type from dba_objects;

Table created.

SQL> select count(*) from another_test_count;

COUNT(*)
----------
 50628

SQL> select count(owner) from another_test_count;

COUNT(OWNER)
------------
   50628

SQL> select count(object_id) from another_test_count;

COUNT(OBJECT_ID)
----------------
       50627

SQL>

As I expected, a COUNT on OBJECT_ID is one row short as one of the OBJECT_IDs is NULL.

I now create an Index on the table.


SQL> create index another_test_count_ndx on another_test_count(object_id,owner);   

Index created.

SQL> exec dbms_stats.gather_table_stats('','ANOTHER_TEST_COUNT',estimate_percent=>100,cascade=>TRUE);

PL/SQL procedure successfully completed.

SQL>

Now, I attempt to use the Index to count the number of rows. The index is smaller than the table so an INDEX FAST FULL SCAN should be preferred over a FULL TABLE SCAN.


SQL> select count(*) from another_test_count;

COUNT(*)
----------
   50628


Execution Plan
----------------------------------------------------------
Plan hash value: 3536081126

---------------------------------------------------------------------------------
| Id  | Operation          | Name               | Rows  | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                    |     1 |    52   (2)| 00:00:01 |
|   1 |  SORT AGGREGATE    |                    |     1 |            |          |
|   2 |   TABLE ACCESS FULL| ANOTHER_TEST_COUNT | 50628 |    52   (2)| 00:00:01 |
---------------------------------------------------------------------------------


Statistics
----------------------------------------------------------
        0  recursive calls
        0  db block gets
      173  consistent gets
        0  physical reads
        0  redo size
      517  bytes sent via SQL*Net to client
      492  bytes received via SQL*Net from client
        2  SQL*Net roundtrips to/from client
        0  sorts (memory)
        0  sorts (disk)
        1  rows processed

SQL> alter table another_test_count modify (owner not null);

Table altered.

SQL> select count(*) from another_test_count;

 COUNT(*)
----------
    50628


Execution Plan
----------------------------------------------------------
Plan hash value: 227569207

----------------------------------------------------------------------------------------
| Id  | Operation             | Name                   | Rows  | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                        |     1 |    43   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE       |                        |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| ANOTHER_TEST_COUNT_NDX | 50628 |    43   (0)| 00:00:01 |
----------------------------------------------------------------------------------------


Statistics
----------------------------------------------------------
         0  recursive calls
         0  db block gets
       158  consistent gets
         0  physical reads
         0  redo size
       517  bytes sent via SQL*Net to client
       492  bytes received via SQL*Net from client
         2  SQL*Net roundtrips to/from client
         0  sorts (memory)
         0  sorts (disk)
         1  rows processed

SQL>

In my first pass at a COUNT(*), Oracle does not use the Index even though it is smaller than the table. Why not ? Because Oracle cannot be sure that the Index captures every ROWID from the table. This is the case where, for a concatenated index, every column has a NULL for a particular row, resulting in that row being "excluded" from the Index.

As soon as I change the OWNER column (which isn't even the leading column of the Index) to a NOT NULL, the Optimizer is assured that an Index on this column *will* capture every ROWID. At the next COUNT(*), the Optimizer prefers to do an INDEX FAST FULL SCAN !

(Note : I had run the "SELECT COUNT(*) FROM ANOTHER_TEST_COUNT" twice before altering OWNER to NOT NULL and twice again after altering it to NOT NULL. I have not reported the autotrace results of each first run as it includes Parse overheads -- the recursive calls inflating the 'consistent gets' count).

.
.
.

3 comments:

Anand said...: Hi Sir,

Nice Post.Especially the last section where "Optimizer is assured that an Index on this column *will* capture every ROWID as its NOT NULL" is nice example.

Regards,
Anand; 27 August, 2009
Uwe Hesse said...: Hemant, thank you for pointing that out. In addition, I'd like to mention that the myth "Count(column) is better then count(*)"
results from the observation of doing a count(column) on an indexed nullable column. In this case, count(column) will lead to an index fast full scan, really faster than a full table scan probably. But potentially with the wrong result! If there would have been an index on a not null column present, count(*) would use it (as you pointed out).

Kind regards
Uwe; 31 August, 2009
Hemant K Chitale said...: Uwe,

Yes, the myth has probably led a number of people to use count(column) for the wrong columns all too frequently !; 31 August, 2009

Hemant's Oracle DBA Blog

26 August, 2009

Counting the Rows in a Table

3 comments:

Aggregated by orafaq.com

OCP 12c