27 February, 2010

Something Unique about Unique Indexes

Recently, a forums posting how it could be that a concatenated index on two columns could fail to be Unique while an index on the first column alone *was* Unique.
It doesn't sound logical does it ? If column_A is unique, surely (column_A, column_B) together should also be unique !

Not always.

See :


SQL> create unique index sales_orders_u1 on sales_orders(order_no);

Index created.

SQL> drop index sales_orders_u1;

Index dropped.

SQL> create unique index sales_orders_u2 on sales_orders(order_no,year_month);
create unique index sales_orders_u2 on sales_orders(order_no,year_month)
*
ERROR at line 1:
ORA-01452: cannot CREATE UNIQUE INDEX; duplicate keys found


SQL>


Why does index sales_orders_u1 get created as Unique Index but sales_orders_u2, with an additional column, fail ?

Think about it for a while ..........
.
.
.
.
.
.


OK. Here's how I setup the table :

SQL> drop table sales_orders purge;

Table dropped.

SQL> create table sales_orders
2 (order_no number, year_month varchar2(7) not null,
3 product_code number, quantity number, description varchar2(50));

Table created.

SQL>
SQL> insert into sales_orders values (1,'2009JAN',1001,50,'A Sale');

1 row created.

SQL> insert into sales_orders values (2,'2009FEB',2001,10,'Another Sale');

1 row created.

SQL> insert into sales_orders values (NULL,'2009MAR',1001,60,'Third Sale');

1 row created.

SQL> insert into sales_orders values (NULL,'2009MAR',2001,80,'Fourth Sale');

1 row created.

SQL> commit;

Commit complete.

SQL>
SQL> create unique index sales_orders_u1 on sales_orders(order_no);

Index created.

SQL> drop index sales_orders_u1;

Index dropped.

SQL> create unique index sales_orders_u2 on sales_orders(order_no,year_month);
create unique index sales_orders_u2 on sales_orders(order_no,year_month)
*
ERROR at line 1:
ORA-01452: cannot CREATE UNIQUE INDEX; duplicate keys found


SQL>


What was (er..) unique about the pairing of the two columns ?

Have you caught the reason ?
If not, think for some more time .....

The issue arises from Oracle's handling of NULLs (as in the first column order_no which has 2 rows with NULL values).
A NULL in Oracle is "indeterminate". It is an unknown value. Therefore, two NULLs are never equal.
The Unique Index sales_orders_u1 on order_no alone succeeds because the NULL values in the two rows are not compared as they are not equal to each other (in fact, the two rows are not even captured in the index !).
However, when I create Unique Index sales_orders_u2 with an additional column, Oracle actually compares the values in the second column. Although it would seem that if two NULLs are not alike, then we could afford to ignore comparing the second column. However, Oracle does not ignore the values that are not NULLs. If it finds duplicates in these values (as it finds in year_month), the test for Uniqueness fails.
Update : If the second column, too, has NULLs and there happen(s) to be any row(s) where *both* the first column (order_no) and the second column (year_month) contain a NULL, that/those rows would not be included in the index and would pass the "Uniqueness" test as Oracle cannot compare rows with all NULLs.

The forums posting is here.

This behaviour is documented. Rather than pointing you to the documentation, I'll provide a few other references :

1. Tom Kyte's post "Something about nothing".
2. His follow-up post "Mull about null".
3. My post "Nulls are Indexed, Right ? NOT !"
4. My post "Indexed column (unique or not) -- What if it is nullable".

(I've also touched on NULLs in a few other posts -- e.g. "The difference between NOT IN and NOT EXISTS" and "Counting the rows in a Table".

.
.
.

3 comments:

DomBrooks said...

> The Unique Index sales_orders_u1 on order_no alone succeeds because the NULL values in the two rows are not equal to each other

Surely, it is that there is no null entry for the unique index? And not that null = null is false?

Whereas if there is a composite index (with a not null second column), then there is a null entry for the first column entry in the index.

If the second column was also null in more than one entry then that too would be allowable.

Hemant K Chitale said...

DomBrooks,
1. "Surelyt it is that there is no null entry for the unique index".
Yes. That is why I did add "(in fact, the two rows are not even captured in the index !)..
2. "And not that null = null is false".
It is because you cannot compare nulls that nulls are not indexed.
3. "Whereaas if there is a composite index (with a not null second colunmn), then there isa null entry for the first column entry in the index".
I agree. I don't say that index "u2" doesn't include the nulls. But Oracle effectively compares the second value in the index.
4. "If the second column was also null in more than one entry than that too would be allowable".
Yes. Because those two nulls in the second column would, like the two nulls in the first column, not be included in the index.

Hemant

Hemant K Chitale said...

DomBrooks,
I have updated my post to change the line "The Unique Index sales_orders_u1 on order_no alone succeeds because the NULL values in the two rows are not equal to each other (in fact, the two rows are not even captured in the index !)" to "The Unique Index sales_orders_u1 on order_no alone succeeds because the NULL values in the two rows are not compared as they are not equal to each other (in fact, the two rows are not even captured in the index !)"
I have also added an update for the case where the second column may also have NULL value(s) for the same row(s) where the first column has NULL value(s).

Hemant