Till a few days ago, I thought that, since NOT NULL column have only Unique Values would not need a Histogram, Oracle would not generate a Histogram, there being no data skew when no value is repeated.
However,
Randolf Geist, in a recent forums.oracle.com thread, suggested that it is possible for Oracle's "auto" behaviour to generate a Histogram on such data where there is a noticeable gap in the data. That seems to be a result of the way the skewness detection algorith works.
So I set out to test it. This quick test is on 10.2.0.4 (64bit Linux). I have 105,329 unique values ranging from "2" to "259290". I have some gaps in the values, but I ensure that Uniqueness is maintained.
SQL>
SQL> drop table t_ah_o_g_u ;
Table dropped.
SQL> create table t_ah_o_g_u (gapped_uniq_number number not null, a_string varchar2(128));
Table created.
SQL>
SQL> create unique index t_ah_o_g_u_uk on t_ah_o_g_u (gapped_uniq_number);
Index created.
SQL>
SQL> REM Insert Unique Values but with significant gaps between Ranges
SQL> insert into t_ah_o_g_u select object_id, object_name from dba_objects where object_id between 2 and 20;
19 rows created.
SQL> insert into t_ah_o_g_u select object_id, object_name from dba_objects where object_id between 30000 and 32000;
2001 rows created.
SQL> insert into t_ah_o_g_u select object_id, object_name from dba_objects where object_id > 50000;
2087 rows created.
SQL> insert into t_ah_o_g_u select object_id+100000, object_name from dba_objects ;
50611 rows created.
SQL> insert into t_ah_o_g_u select object_id+200000, object_name from dba_objects ;
50611 rows created.
SQL>
SQL>
SQL> select count(*) from t_ah_o_g_u;
COUNT(*)
----------
105329
SQL> REM let's check the skew
SQL> select min(gapped_uniq_number), max(gapped_uniq_number) from t_ah_o_g_u;
MIN(GAPPED_UNIQ_NUMBER) MAX(GAPPED_UNIQ_NUMBER)
----------------------- -----------------------
2 259290
SQL> select count(*) from t_ah_o_g_u where gapped_uniq_number < 50000 ;
COUNT(*)
----------
2020
SQL> select count(*) from t_ah_o_g_u where gapped_uniq_number between 50000 and 100000;
COUNT(*)
----------
2087
SQL> select count(*) from t_ah_o_g_u where gapped_uniq_number between 100000 and 200000;
COUNT(*)
----------
50611
SQL> select count(*) from t_ah_o_g_u where gapped_uniq_number > 200000;
COUNT(*)
----------
50611
SQL>
SQL> REM =============================================================================================
SQL> REM =============================================================================================
SQL> REM We know that where the column is a Unique Key, there cn be no data skew -- and we need no histogram
SQL> REM Does Oracle's autogather create a histogram ?
SQL>
SQL> REM =============================================================================================
SQL> REM with SIZE AUTO
SQL> exec dbms_stats.gather_table_stats(user,'T_AH_O_G_U',method_opt=>'FOR ALL COLUMNS SIZE AUTO', estimate_percent=>100, cascade=>TRUE);
PL/SQL procedure successfully completed.
SQL>
SQL> REM Let's look for histograms
SQL> select num_rows, sample_size
2 from user_tables where table_name = 'T_AH_O_G_U';
NUM_ROWS SAMPLE_SIZE
---------- -----------
105329 105329
SQL> select low_value, high_value, num_distinct,num_buckets,sample_size, histogram
2 from user_tab_col_statistics where table_name = 'T_AH_O_G_U' and column_name = 'GAPPED_UNIQ_NUMBER';
LOW_VALUE HIGH_VALUE
---------------------------------------------------------------- ----------------------------------------------------------------
NUM_DISTINCT NUM_BUCKETS SAMPLE_SIZE HISTOGRAM
------------ ----------- ----------- ---------------
C103 C31A5D5B
105329 254 105329 HEIGHT BALANCED
SQL> select count(*)
2 from user_tab_histograms where table_name = 'T_AH_O_G_U' and column_name = 'GAPPED_UNIQ_NUMBER';
COUNT(*)
----------
255
SQL> REM =============================================================================================
SQL> REM =============================================================================================
SQL> REM !!!! Yes, the SIZE AUTO did create a Height Balanced Histogram with 254 buckets !
SQL> REM =============================================================================================
SQL> REM =============================================================================================
SQL>
SQL> REM =============================================================================================
SQL> REM with SIZE SKEWONLY
SQL> exec dbms_stats.delete_table_stats(user,'T_AH_O_G_U');
PL/SQL procedure successfully completed.
SQL> exec dbms_stats.gather_table_stats(user,'T_AH_O_G_U',method_opt=>'FOR ALL COLUMNS SIZE SKEWONLY', estimate_percent=>100, cascade=>TRUE);
PL/SQL procedure successfully completed.
SQL>
SQL> REM Let's look for histograms
SQL> select num_rows, sample_size
2 from user_tables where table_name = 'T_AH_O_G_U';
NUM_ROWS SAMPLE_SIZE
---------- -----------
105329 105329
SQL> select low_value, high_value, num_distinct,num_buckets,sample_size, histogram
2 from user_tab_col_statistics where table_name = 'T_AH_O_G_U' and column_name = 'GAPPED_UNIQ_NUMBER';
LOW_VALUE HIGH_VALUE
---------------------------------------------------------------- ----------------------------------------------------------------
NUM_DISTINCT NUM_BUCKETS SAMPLE_SIZE HISTOGRAM
------------ ----------- ----------- ---------------
C103 C31A5D5B
105329 254 105329 HEIGHT BALANCED
SQL> select count(*)
2 from user_tab_histograms where table_name = 'T_AH_O_G_U' and column_name = 'GAPPED_UNIQ_NUMBER';
COUNT(*)
----------
255
SQL> REM =============================================================================================
SQL> REM =============================================================================================
SQL> REM !!!! Yes, the SKEWONLY did create a Height Balanced Histogram with 254 buckets !
SQL> REM =============================================================================================
SQL> REM =============================================================================================
SQL>
SQL>
SQL>
SQL> REM =============================================================================================
SQL> REM with SIZE 1 -- reverting to the 9i default behaviour
SQL> exec dbms_stats.delete_table_stats(user,'T_AH_O_G_U');
PL/SQL procedure successfully completed.
SQL> exec dbms_stats.gather_table_stats(user,'T_AH_O_G_U',method_opt=>'FOR ALL COLUMNS SIZE 1', estimate_percent=>100, cascade=>TRUE);
PL/SQL procedure successfully completed.
SQL>
SQL> REM Let's look for histograms
SQL> select num_rows, sample_size
2 from user_tables where table_name = 'T_AH_O_G_U';
NUM_ROWS SAMPLE_SIZE
---------- -----------
105329 105329
SQL> select low_value, high_value, num_distinct,num_buckets,sample_size, histogram
2 from user_tab_col_statistics where table_name = 'T_AH_O_G_U' and column_name = 'GAPPED_UNIQ_NUMBER';
LOW_VALUE HIGH_VALUE
---------------------------------------------------------------- ----------------------------------------------------------------
NUM_DISTINCT NUM_BUCKETS SAMPLE_SIZE HISTOGRAM
------------ ----------- ----------- ---------------
C103 C31A5D5B
105329 1 105329 NONE
SQL> select count(*)
2 from user_tab_histograms where table_name = 'T_AH_O_G_U' and column_name = 'GAPPED_UNIQ_NUMBER';
COUNT(*)
----------
2
SQL> REM =============================================================================================
SQL> REM This is the correct (ie as I expect it to be) behaviour ! -- No Histogram on a Unique Key
SQL> REM =============================================================================================
SQL>
SQL> spool off
So we do have a case where SIZE AUTO and SIZE SKEWONLY create a Histogram on Unique Values.
Would we have such data in real life ? Certainly.
These numbers could be, say, Identifier numbers for, say, a Rewards Program membership, which are reset every 6 months or every year such that a large gap is created between each set of numbers (the Sequence generating the numbers is recreated with a large, gapped, START WITH value over the previous incarnation's LAST_NUMBER).
These could be Product Codes for different sets of products , which a range (0 to 50,000 for one Business Unit or Subsidiary, 50,000 to 100,000 for another Business Unit or subsidiary etc) allocated to each Business Unit/Subsidiary. These could be Bar Code numbers generated in batches with a deliberately created gap to preven overlapping.