30 August, 2008

ASSM or MSSM ? -- DELETE and INSERT

Continuing my first set of tests, I now look at Bug#4475314 which shows that an INSERT that follows a DELETE without an intervening COMMIT can suffer much higher 'current gets'. The teste case in that Bug actually shows a very significant difference between the DELETE-COMMIT-INSERT and DELETE-INSERT runs.

So, here I ran similar SQLs as in my first set of tests :

MSSM :


drop tablespace mssm_tbs including contents and datafiles;
create tablespace mssm_tbs
datafile '/oracle_fs/Databases/ORT24FS/mssm01.dbf' size 1000M
extent management local autoallocate segment space management manual;

undefine sid
col mysid new_value sid
select distinct sid mysid from V$mystat ;


alter session set tracefile_identifier='run_delete_insert_tests_mssm';
alter session set events '10046 trace name context forever, level 8';

alter table source_table cache;
select /*+ FULL (s) */ count(*) from source_table s;

create table target_table_mssm tablespace mssm_tbs as select * from source_table where 1=2;

insert /* first_mssm_insert */ into target_table_mssm select * from source_table;
commit;

delete /* first_mssm_delete */ target_table_mssm;

insert /* reinsert_mssm_insert */ into target_table_mssm select * from source_table;
commit;

create index target_table_mssm_ndx on target_table_mssm(object_id) tablespace mssm_tbs;

delete /* indexed_mssm_delete */ target_table_mssm;

insert /* indexed_mssm_insert */ into target_table_mssm select * from source_table;
commit;


ASSM :


drop tablespace assm_tbs including contents and datafiles;
create tablespace assm_tbs
datafile '/oracle_fs/Databases/ORT24FS/assm01.dbf' size 1000M
extent management local autoallocate segment space management auto;

undefine sid
col mysid new_value sid
select distinct sid mysid from V$mystat ;


alter session set tracefile_identifier='run_delete_insert_tests_assm';
alter session set events '10046 trace name context forever, level 8';

alter table source_table cache;
select /*+ FULL (s) */ count(*) from source_table s;

create table target_table_assm tablespace assm_tbs as select * from source_table where 1=2;

insert /* first_assm_insert */ into target_table_assm select * from source_table;
commit;

delete /* first_assm_delete */ target_table_assm;

insert /* reinsert_assm_insert */ into target_table_assm select * from source_table;
commit;

create index target_table_assm_ndx on target_table_assm(object_id) tablespace assm_tbs;

delete /* indexed_assm_delete */ target_table_assm;

insert /* indexed_assm_insert */ into target_table_assm select * from source_table;
commit;



Rather than repeating the normal INSERT statements (which you can also see in my first set of tests), I will emphasise the INSERTs that followed DELETEs (without intervening COMMITs) :

MSSM :

delete /* first_mssm_delete */ target_table_mssm


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 1 0 0
Execute 1 6.27 12.47 0 5206 326203 303666
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 6.27 12.48 0 5207 326203 303666

Rows Row Source Operation
------- ---------------------------------------------------
0 DELETE TARGET_TABLE_MSSM (cr=5206 pr=0 pw=0 time=12478853 us)
303666 TABLE ACCESS FULL TARGET_TABLE_MSSM (cr=4857 pr=0 pw=0 time=911106 us)

insert /* reinsert_mssm_insert */ into target_table_mssm select * from
source_table


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 1.00 1.65 0 12367 17122 303666
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 1.01 1.65 0 12367 17122 303666

Rows Row Source Operation
------- ---------------------------------------------------
303666 TABLE ACCESS FULL SOURCE_TABLE (cr=4193 pr=0 pw=0 time=607452 us)

delete /* indexed_mssm_delete */ target_table_mssm


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.01 0 2 0 0
Execute 1 11.01 16.14 671 4325 1243651 303666
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 11.02 16.16 671 4327 1243651 303666

Rows Row Source Operation
------- ---------------------------------------------------
0 DELETE TARGET_TABLE_MSSM (cr=4325 pr=671 pw=0 time=16140802 us)
303666 TABLE ACCESS FULL TARGET_TABLE_MSSM (cr=4325 pr=0 pw=0 time=1518587 us)

insert /* indexed_mssm_insert */ into target_table_mssm select * from
source_table


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 3.09 7.54 63 16018 78398 303666
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 3.09 7.54 63 16018 78398 303666

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 64

Rows Row Source Operation
------- ---------------------------------------------------
303666 TABLE ACCESS FULL SOURCE_TABLE (cr=4193 pr=0 pw=0 time=911112 us)



ASSM :

delete /* first_assm_delete */ target_table_assm


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 1 0 0
Execute 1 7.04 14.30 0 4880 337910 303666
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 7.04 14.30 0 4881 337910 303666

Rows Row Source Operation
------- ---------------------------------------------------
0 DELETE TARGET_TABLE_ASSM (cr=4880 pr=0 pw=0 time=14305853 us)
303666 TABLE ACCESS FULL TARGET_TABLE_ASSM (cr=4880 pr=0 pw=0 time=934334 us)

insert /* reinsert_assm_insert */ into target_table_assm select * from
source_table


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 1.50 2.02 0 21396 52236 303666
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 1.50 2.02 0 21396 52236 303666

Rows Row Source Operation
------- ---------------------------------------------------
303666 TABLE ACCESS FULL SOURCE_TABLE (cr=4193 pr=0 pw=0 time=911120 us)

delete /* indexed_assm_delete */ target_table_assm


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.02 0 2 0 0
Execute 1 14.84 22.03 672 4502 1258653 303666
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 14.84 22.06 672 4504 1258653 303666

Rows Row Source Operation
------- ---------------------------------------------------
0 DELETE TARGET_TABLE_ASSM (cr=4988 pr=673 pw=0 time=22090153 us)
303666 TABLE ACCESS FULL TARGET_TABLE_ASSM (cr=4364 pr=0 pw=0 time=1214785 us)

insert /* indexed_assm_insert */ into target_table_assm select * from
source_table


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 2.92 5.45 1 26743 117158 303666
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 2.92 5.45 1 26743 117158 303666

Rows Row Source Operation
------- ---------------------------------------------------
303666 TABLE ACCESS FULL SOURCE_TABLE (cr=4193 pr=0 pw=0 time=911153 us)


The INSERT operations in the ASSM tablespace are more expensive.


MSSM ASSM
Delete
Consistent Gets 5,206 4,880
Current Gets 326,203 337,910
CPU Used (seconds) 6.27 7.04
Insert after Delete
Consistent Gets 12,367 21,396
Current Gets 17,122 52,236
CPU Used (seconds) 1.00 1.50

Delete with Index present
Consistent Gets 4,325 4,502
Current Gets 1,243,651 1,258,653
CPU Used (seconds) 11.01 14.84
Insert after this Delete
Consistent Gets 16,018 26,743
Current Gets 78,398 117,158
CPU Used (seconds) 3.09 2.92


Think carefully before using ASSM for segments that
a) Undergo large Inserts OR Deletes and Inserts
b) Are executed by batch jobs, not multiple,concurrent, online users.

23 August, 2008

The once again new forums.oracle.com

forums.oracle.com has come back online once again. An upgrade attempt in June was rolled back after the upgraded version had been "in play" for a few days. After the rollback, some posts were lost and some identities were lost.

Two weekends ago, forums.oracle.com was offline for maintenance. This Saturday (for most of the world that lives outside USA) it was offline again.
Now that it is back,

a. font size cannot be adjusted in MSIE. I might as well stop using forums.oracle.com

b. The new Legends of "Guru", "Expert", "Pro", "Journeyman" and "Newbie" are so so .... so ... what shall I say ? We have already suffered "Oracle ACE"s and "Oracle ACE Director"s. And now I suffer being labelled a "Newbie" until some unknowns start awarding me 5 points for "Helpful" answers and 10 points for "Correct" answers so that I must accumulate points to becoming a "Journeyman" and then a "Pro".

It might well be very very very soon time to say goodbye to forums.oracle.com and return to, say, google groups ?

Update : This thread is active on forums.oracle.com (so I am still logging in there yet !).

21 August, 2008

Testing Bug 4260477 Fix for Bug 4224840

Having posted about MetaLink Note#465226.1 on forums, earlier today, I have created a test case to simulate the same situation.
The base bug 4224840 has been logged by a customer that possibly found block corruption after running a transaction that locked more than 4,095 records in a block.
Considering that a table must have at least 1 column and that every data block loses some space to block header and PCTFREE, it is unlikely to have 4,095 rows in a 16KB data block. However, a 32KB data block can hold more than 4,095 rows.

Here is my Test Case Simulation on 10.2.0.1 64bit on Linux (you'd need 64bit for 32KB datablocks) :

[Note : I don't really create 4,095 records in the normal sense. I INSERT and DELETE that many times, but since I do not issue ROLLBACK or COMMIT, Oracle "preserves" each INSERT and DELETE ! ]


SQL>
SQL> REM Bug 4260477 is a fix for
SQL> REM 4224840 whereby Block Corruption (or error with db_block_checking) would occur with >4095 locks in a block
SQL> REM The fix in 4260477 is to return an error message (ORA-8007) to prevent the corruption
SQL>
SQL> REM This test simulates the issue
SQL>
SQL> drop tablespace TBS_32K including contents and datafiles;

Tablespace dropped.

SQL>
SQL> create tablespace TBS_32K datafile '/oracle_fs/Databases/ORT21FS/tbs_32K_01.dbf' size 10M autoextend on next 10M maxsize 2000M blocksize 32K
2 extent management local autoallocate segment space management auto ;

Tablespace created.

SQL>
SQL> show parameter db_block_checking

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_block_checking string FALSE
SQL>
SQL> drop table Test_Bug_4260477 ;
drop table Test_Bug_4260477
*
ERROR at line 1:
ORA-00942: table or view does not exist


SQL> create table Test_Bug_4260477 (col_1 varchar2(5)) pctfree 1 tablespace tbs_32K;

Table created.

SQL>
SQL> variable i number;
SQL>
SQL> set serveroutput on size 100000
SQL> begin
2 for i in 1 .. 5000
3 loop
4 insert into Test_Bug_4260477 values (to_char(i));
5 delete from Test_Bug_4260477 ;
6 if ((i-4080) > 0) then
7 dbms_output.put_line('Row ' i );
8 end if;
9 end loop;
10 end ;
11 /
Row 4081
Row 4082
Row 4083
Row 4084
Row 4085
Row 4086
Row 4087
Row 4088
Row 4089
Row 4090
Row 4091
Row 4092
Row 4093
Row 4094
begin
*
ERROR at line 1:
ORA-08007: Further changes to this block by this transaction not allowed
ORA-06512: at line 5


SQL>


The 4260477 "fix" is to return an error message (8007 : "Further changes ....") before block corruption occurs. Neat isn't it ?
Of course, that means that the transaction fails !
The final fix is in 11g.

20 August, 2008

ASSM or MSSM ? -- The impact on INSERTS

Segment Space Management 'AUTO' was introduced in 9i. For a few years, I thought that it was a good / useful feature. However, I've encountered issues with space leakage in LOBs, issues in data not being clustered (as it would have been in MSSM) and seen various bugs logged against ASSM.
I guess that ASSM makes sense only in a really very high concurrency environment -- where multiple sessions need to manipulate the FreeList for the same segment (table) concurrently. Some new features (eg ALTER TABLE SHRINK) require ASSM. Going forward with ASSM being the "default" default in 10g and 11g, I guess that it is Oracle's direction.

However, for those of us who do need to get performance out of our systems, ASSM can be a drag.
i. It seems to increase current gets when doing INSERTs
ii. Data isn't clustered into contigous blocks as it is inserted. This is important for index range scans on an indexed column that is incremented for each insert (eg a DATE or SEQUENCE). The Clustering Factor goes "bad"


Here are some simple tests that I ran, running the same INSERT (and CREATE INDEX) statements on two copies of a "target" table, one in MSSM and the other in ASSM.

Running this in one session :




set echo on
set timing on

spool run_insert_tests_mssm

create tablespace mssm_tbs
datafile '/oracle_fs/Databases/ORT24FS/mssm01.dbf' size 1000M
extent management local autoallocate segment space management manual;

undefine sid
col mysid new_value sid
select distinct sid mysid from V$mystat ;

alter session set tracefile_identifier='run_insert_tests_mssm';
alter session set events '10046 trace name context forever, level 8';

alter table source_table cache;
select /*+ FULL (s) */ count(*) from source_table s;

create table target_table_mssm tablespace mssm_tbs as select * from source_table where 1=2;

insert /* first_mssm_insert */ into target_table_mssm select * from source_table;
commit;

insert /* second_mssm_insert */ into target_table_mssm select * from source_table;
commit;

create index target_table_mssm_ndx on target_table_mssm(object_id) tablespace mssm_tbs;

insert /* indexed_mssm_insert */ into target_table_mssm select * from source_table;
commit;

select segment_name, segment_type, blocks from user_segments where segment_name like '%_MSSM%' order by segment_name ;



and running this in another session :





set echo on
set timing on

spool run_insert_tests_assm

create tablespace assm_tbs
datafile '/oracle_fs/Databases/ORT24FS/assm01.dbf' size 1000M
extent management local autoallocate segment space management auto;

undefine sid
col mysid new_value sid
select distinct sid mysid from V$mystat ;


alter session set tracefile_identifier='run_insert_tests_assm';
alter session set events '10046 trace name context forever, level 8';

alter table source_table cache;
select /*+ FULL (s) */ count(*) from source_table s;

create table target_table_assm tablespace assm_tbs as select * from source_table where 1=2;

insert /* first_assm_insert */ into target_table_assm select * from source_table;
commit;

insert /* second_assm_insert */ into target_table_assm select * from source_table;
commit;

create index target_table_assm_ndx on target_table_assm(object_id) tablespace assm_tbs;

insert /* indexed_assm_insert */ into target_table_assm select * from source_table;
commit;

select segment_name, segment_type, blocks from user_segments where segment_name like '%_ASSM%' order by segment_name;


Here's a comparison of the results :

For the first INSERT statement :





insert /* first_mssm_insert */ into target_table_mssm select * from
source_table


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.03 0 0 0 0
Execute 1 1.32 2.98 0 8771 23618 303666
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 1.33 3.01 0 8771 23618 303666


Rows Row Source Operation
------- ---------------------------------------------------
303666 TABLE ACCESS FULL SOURCE_TABLE (cr=4193 pr=0 pw=0 time=607550 us)


insert /* first_assm_insert */ into target_table_assm select * from
source_table


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.02 0.04 0 0 0 0
Execute 1 2.39 2.85 0 11917 38462 303666
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 2.42 2.90 0 11917 38462 303666


Rows Row Source Operation
------- ---------------------------------------------------
303666 TABLE ACCESS FULL SOURCE_TABLE (cr=4193 pr=0 pw=0 time=911236 us)


The number of blocks (and rows) read from the SOURCE_TABLE is exactly the same (4,193 ,with no physical reads as the table had been cached by previous queries).
Yet, the INSERT measured up as :




MSSM ASSM
Consistent Gets 8,771 11,917
Current Gets 23,618 38,462
CPU Used (seconds) 1.32 2.39


That means that the Insert into an ASSM Tablespace causes more Current Gets and CPU usage.

These are the results of the second INSERT statement :



insert /* second_mssm_insert */ into target_table_mssm select * from
source_table


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 1.58 2.70 0 8758 23393 303666
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 1.58 2.70 0 8758 23393 303666

Rows Row Source Operation
------- ---------------------------------------------------
303666 TABLE ACCESS FULL SOURCE_TABLE (cr=4193 pr=0 pw=0 time=911108 us)

insert /* second_assm_insert */ into target_table_assm select * from
source_table


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 1.33 3.19 0 12254 37667 303666
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 1.34 3.19 0 12254 37667 303666


Rows Row Source Operation
------- ---------------------------------------------------
303666 TABLE ACCESS FULL SOURCE_TABLE (cr=4193 pr=0 pw=0 time=607515 us)


Once again, we see a siginficant difference in Current Gets.



MSSM ASSM
Consistent Gets 8,758 12,254
Current Gets 23,393 37,667
CPU Used (seconds) 1.58 1.33


Next is the CREATE INDEX statement :


create index target_table_mssm_ndx on target_table_mssm(object_id) tablespace
mssm_tbs


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.06 0 1 0 0
Execute 1 1.62 2.11 0 9357 1688 0
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 1.63 2.18 0 9358 1688 0

create index target_table_assm_ndx on target_table_assm(object_id) tablespace
assm_tbs


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.01 0 9 0 0
Execute 1 1.83 2.20 0 9535 2188 0
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 1.84 2.21 0 9544 2188 0


Again, the ASSM performed poorer.


The final test is the Insert with the Index present.


insert /* indexed_mssm_insert */ into target_table_mssm select * from
source_table


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 3.12 4.26 1342 16682 119228 303666
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 3.12 4.27 1342 16682 119228 303666

Rows Row Source Operation
------- ---------------------------------------------------
303666 TABLE ACCESS FULL SOURCE_TABLE (cr=4193 pr=0 pw=0 time=607454 us)


insert /* indexed_assm_insert */ into target_table_assm select * from
source_table


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 3.70 5.60 1342 17923 135324 303666
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 3.70 5.60 1342 17923 135324 303666

Rows Row Source Operation
------- ---------------------------------------------------
303666 TABLE ACCESS FULL SOURCE_TABLE (cr=4193 pr=0 pw=0 time=911107 us)


Again, the ASSM Insert performs poorer.


MSSM ASSM
Consistent Gets 16,682 17,923
Current Gets 119,228 135,324
CPU Used (seconds) 3.12 3.70


These are the final sizes :


SQL> select segment_name, segment_type, blocks from user_segments where segment_name like '%_MSSM%' order by segment_name ;

SEGMENT_NAME SEGMENT_TYPE BLOCKS
------------------------------ ------------------ ----------
TARGET_TABLE_MSSM TABLE 13312
TARGET_TABLE_MSSM_NDX INDEX 2688
SQL> select segment_name, segment_type, blocks from user_segments where segment_name like '%_ASSM%' order by segment_name;

SEGMENT_NAME SEGMENT_TYPE BLOCKS
------------------------------ ------------------ ----------
TARGET_TABLE_ASSM TABLE 13312
TARGET_TABLE_ASSM_NDX INDEX 2816

14 August, 2008

VMWare Bug presents a nightmare scenario

A Bug in VMWare ESX 3.5 Update 2 prevented Virtual Machines from booting on 12-Aug as it (the bug) had "expired" the licences used !

"...many corporate users woke up to one of their worst virtualization nightmares: not being able to power on critical virtual machines even though the host server seemed perfectly fine ..."

I get uncomfortable with software that uses licence keys, "registry" settings etc that "expire" or "disable" software in this manner.
The expiry of SSL Certificates can cause some "loss of face". Similarly, the failure to renew domain names (remember Microsoft and passport.com and hotmail.co.uk ?) can be very embarrassing.

Preventing a User from changing his password

A few neat "tricks" to *prevent* password changes are noted at Pete Finnigan's blg.

02 August, 2008

More Tests of COL_USAGE

NOTE : This is a follow up on my earliest post on COL_USAGE.

A few more simple tests to confirm that COL_USAGE is used by 'SIZE AUTO'. Oracle does not gather Histograms on columns not identified as having been queried -- which "identification" information is stored in SYS.COL_USAGE$.



22:51:33 SQL> col object_name format a30
22:51:37 SQL> select object_name, object_id from user_objects where object_type = 'TABLE';

OBJECT_NAME OBJECT_ID
------------------------------ ----------
SOURCE_TABLE 53391
TEST_DATA 53392
OTHER_TEST_DATA 53395

22:51:38 SQL> select * from sys.col_usage$ where object_id in (53391,53392,53395);
select * from sys.col_usage$ where object_id in (53391,53392,53395)
*
ERROR at line 1:
ORA-00904: "OBJECT_ID": invalid identifier


22:51:54 SQL> select * from sys.col_usage$ where obj# in (53391,53392,53395);

OBJ# INTCOL# EQUALITY_PREDS EQUIJOIN_PREDS NONEQUIJOIN_PREDS RANGE_PREDS LIKE_PREDS NULL_PREDS TIMESTAMP
---------- ---------- -------------- -------------- ----------------- ----------- ---------- ---------- ---------------
53392 1 1 0 0 0 0 0 01-AUG 23:30:26
53392 3 0 0 0 1 0 0 01-AUG 23:39:23
53392 4 0 0 0 0 1 0 01-AUG 23:45:31
53395 4 0 0 0 0 1 0 02-AUG 00:14:00

22:52:33 SQL> select table_name, column_name,histogram,num_distinct,num_buckets,sample_size,last_analyzed
22:53:53 2 from user_tab_col_statistics
22:54:00 3 order by 2,1;

TABLE_NAME COLUMN_NAME HISTOGRAM NUM_DISTINCT NUM_BUCKETS SAMPLE_SIZE LAST_ANALYZED
------------------------------ ------------------------------ --------------- ------------ ----------- ----------- ---------------
SOURCE_TABLE CREATED NONE 1335 1 61594 02-AUG 00:27:57
OTHER_TEST_DATA CREATED_COL NONE 1262 1 30390 02-AUG 00:27:54
TEST_DATA CREATED_COL NONE 1269 1 30211 02-AUG 00:27:58
SOURCE_TABLE DATA_OBJECT_ID NONE 3962 1 5029 02-AUG 00:27:57
SOURCE_TABLE GENERATED NONE 2 1 61594 02-AUG 00:27:57
OTHER_TEST_DATA ID_COL NONE 50341 1 30390 02-AUG 00:27:54
TEST_DATA ID_COL NONE 50492 1 30211 02-AUG 00:27:58
SOURCE_TABLE LAST_DDL_TIME NONE 1373 1 61594 02-AUG 00:27:57
OTHER_TEST_DATA NAME_COL NONE 28169 1 30390 02-AUG 00:27:54
TEST_DATA NAME_COL NONE 28226 1 30211 02-AUG 00:27:58
SOURCE_TABLE OBJECT_ID NONE 50616 1 61594 02-AUG 00:27:57
SOURCE_TABLE OBJECT_NAME NONE 28952 1 61594 02-AUG 00:27:57
SOURCE_TABLE OBJECT_TYPE NONE 40 1 61594 02-AUG 00:27:57
SOURCE_TABLE OWNER NONE 26 1 61594 02-AUG 00:27:57
OTHER_TEST_DATA OWNER_COL NONE 26 1 30390 02-AUG 00:27:54
TEST_DATA OWNER_COL FREQUENCY 17 17 30211 02-AUG 00:27:58
SOURCE_TABLE SECONDARY NONE 2 1 61594 02-AUG 00:27:57
SOURCE_TABLE STATUS NONE 1 1 61594 02-AUG 00:27:57
SOURCE_TABLE SUBOBJECT_NAME NONE 102 1 505 02-AUG 00:27:57
SOURCE_TABLE TEMPORARY NONE 2 1 61594 02-AUG 00:27:57
SOURCE_TABLE TIMESTAMP NONE 1421 1 61594 02-AUG 00:27:57
OTHER_TEST_DATA TYPE_COL FREQUENCY 36 36 30390 02-AUG 00:27:54
TEST_DATA TYPE_COL FREQUENCY 38 38 30211 02-AUG 00:27:58

23 rows selected.

22:54:07 SQL> l3
3* order by 2,1
22:54:15 SQL> c/2,1/1,2
3* order by 1,2
22:54:17 SQL>
22:54:18 SQL> /

TABLE_NAME COLUMN_NAME HISTOGRAM NUM_DISTINCT NUM_BUCKETS SAMPLE_SIZE LAST_ANALYZED
------------------------------ ------------------------------ --------------- ------------ ----------- ----------- ---------------
OTHER_TEST_DATA CREATED_COL NONE 1262 1 30390 02-AUG 00:27:54
OTHER_TEST_DATA ID_COL NONE 50341 1 30390 02-AUG 00:27:54
OTHER_TEST_DATA NAME_COL NONE 28169 1 30390 02-AUG 00:27:54
OTHER_TEST_DATA OWNER_COL NONE 26 1 30390 02-AUG 00:27:54
OTHER_TEST_DATA TYPE_COL FREQUENCY 36 36 30390 02-AUG 00:27:54
SOURCE_TABLE CREATED NONE 1335 1 61594 02-AUG 00:27:57
SOURCE_TABLE DATA_OBJECT_ID NONE 3962 1 5029 02-AUG 00:27:57
SOURCE_TABLE GENERATED NONE 2 1 61594 02-AUG 00:27:57
SOURCE_TABLE LAST_DDL_TIME NONE 1373 1 61594 02-AUG 00:27:57
SOURCE_TABLE OBJECT_ID NONE 50616 1 61594 02-AUG 00:27:57
SOURCE_TABLE OBJECT_NAME NONE 28952 1 61594 02-AUG 00:27:57
SOURCE_TABLE OBJECT_TYPE NONE 40 1 61594 02-AUG 00:27:57
SOURCE_TABLE OWNER NONE 26 1 61594 02-AUG 00:27:57
SOURCE_TABLE SECONDARY NONE 2 1 61594 02-AUG 00:27:57
SOURCE_TABLE STATUS NONE 1 1 61594 02-AUG 00:27:57
SOURCE_TABLE SUBOBJECT_NAME NONE 102 1 505 02-AUG 00:27:57
SOURCE_TABLE TEMPORARY NONE 2 1 61594 02-AUG 00:27:57
SOURCE_TABLE TIMESTAMP NONE 1421 1 61594 02-AUG 00:27:57
TEST_DATA CREATED_COL NONE 1269 1 30211 02-AUG 00:27:58
TEST_DATA ID_COL NONE 50492 1 30211 02-AUG 00:27:58
TEST_DATA NAME_COL NONE 28226 1 30211 02-AUG 00:27:58
TEST_DATA OWNER_COL FREQUENCY 17 17 30211 02-AUG 00:27:58
TEST_DATA TYPE_COL FREQUENCY 38 38 30211 02-AUG 00:27:58

23 rows selected.

22:54:18 SQL> REM Currently there are no Histograms on SOURCE_TABLES, and a Frequency Histogram on OTHER_TEST_DATA.TYPE_COL and
22:55:06 SQL> REM on TEST_DATA.OWNER_COL, TEST_DATA.TYPE_COL
22:55:18 SQL> REM ....note that SOURCE_TABLE isn't reflected in SYS.COL_USAGE$, there being no queries on it
22:55:58 SQL>
22:55:58 SQL> REM what happens if I delete SYS.COL_USAGE$ (do NOT try this on your Production database !) and re-gather SIZE AUTO ?
22:56:25 SQL>
22:56:26 SQL> REM -- first a retest with SIZE AUTO
22:56:41 SQL>
22:56:41 SQL> exec dbms_stats.gather_schema_stats(user,estiamte_percent=>100,method_opt=>'FOR ALL COLUMNS SIZE AUTO');
BEGIN dbms_stats.gather_schema_stats(user,estiamte_percent=>100,method_opt=>'FOR ALL COLUMNS SIZE AUTO'); END;

*
ERROR at line 1:
ORA-06550: line 1, column 7:
PLS-00306: wrong number or types of arguments in call to 'GATHER_SCHEMA_STATS'
ORA-06550: line 1, column 7:
PL/SQL: Statement ignored


22:57:03 SQL> exec dbms_stats.gather_schema_stats(user,estimate_percent=>100,method_opt=>'FOR ALL COLUMNS SIZE AUTO');

PL/SQL procedure successfully completed.

22:57:34 SQL> select table_name, column_name,histogram,num_distinct,num_buckets,sample_size,last_analyzed
22:57:41 2 from user_tab_col_statistics order by 1,2;

TABLE_NAME COLUMN_NAME HISTOGRAM NUM_DISTINCT NUM_BUCKETS SAMPLE_SIZE LAST_ANALYZED
------------------------------ ------------------------------ --------------- ------------ ----------- ----------- ---------------
OTHER_TEST_DATA CREATED_COL NONE 1380 1 151815 02-AUG 22:57:26
OTHER_TEST_DATA ID_COL NONE 50605 1 151815 02-AUG 22:57:26
OTHER_TEST_DATA NAME_COL NONE 30275 1 151815 02-AUG 22:57:26
OTHER_TEST_DATA OWNER_COL NONE 26 1 151815 02-AUG 22:57:26
OTHER_TEST_DATA TYPE_COL FREQUENCY 41 41 151815 02-AUG 22:57:26
SOURCE_TABLE CREATED NONE 1378 1 303666 02-AUG 22:57:32
SOURCE_TABLE DATA_OBJECT_ID NONE 4028 1 24684 02-AUG 22:57:32
SOURCE_TABLE GENERATED NONE 2 1 303666 02-AUG 22:57:32
SOURCE_TABLE LAST_DDL_TIME NONE 1424 1 303666 02-AUG 22:57:32
SOURCE_TABLE OBJECT_ID NONE 50611 1 303666 02-AUG 22:57:32
SOURCE_TABLE OBJECT_NAME NONE 30281 1 303666 02-AUG 22:57:32
SOURCE_TABLE OBJECT_TYPE NONE 41 1 303666 02-AUG 22:57:32
SOURCE_TABLE OWNER NONE 26 1 303666 02-AUG 22:57:32
SOURCE_TABLE SECONDARY NONE 2 1 303666 02-AUG 22:57:32
SOURCE_TABLE STATUS NONE 1 1 303666 02-AUG 22:57:32
SOURCE_TABLE SUBOBJECT_NAME NONE 102 1 2466 02-AUG 22:57:32
SOURCE_TABLE TEMPORARY NONE 2 1 303666 02-AUG 22:57:32
SOURCE_TABLE TIMESTAMP NONE 1487 1 303666 02-AUG 22:57:32
TEST_DATA CREATED_COL NONE 1380 1 151815 02-AUG 22:57:34
TEST_DATA ID_COL NONE 50605 1 151815 02-AUG 22:57:34
TEST_DATA NAME_COL NONE 30275 1 151815 02-AUG 22:57:34
TEST_DATA OWNER_COL FREQUENCY 18 18 151815 02-AUG 22:57:34
TEST_DATA TYPE_COL FREQUENCY 41 41 151815 02-AUG 22:57:34

23 rows selected.

22:57:54 SQL> REM The histograms are on OTHER_TEST_DATE.TYPE_COL and TEST_DATA.OWNER_COL and TEST_DATA.TYPE_COL
22:58:21 SQL>
22:58:22 SQL> REM Let's delete sys.col_usage$ -- do NOT try this on your Production database
22:58:38 SQL>
22:58:38 SQL> delete sys.col_usage$ where obj# in (select object_id from user_objects);

4 rows deleted.

22:58:53 SQL> commit;

Commit complete.

22:58:55 SQL> exec dbms_stats.flush_database_monitoring_info;

PL/SQL procedure successfully completed.

22:59:04 SQL>
22:59:05 SQL> REM Now, re-gather SIZE AUTO
22:59:11 SQL>
22:59:14 SQL> exec dbms_stats.gather_schema_stats(user,estimate_percent=>100,method_opt=>'FOR ALL COLUMNS SIZE AUTO');

PL/SQL procedure successfully completed.

22:59:31 SQL> select table_name, column_name,histogram,num_distinct,num_buckets,sample_size,last_analyzed
22:59:39 2 from user_tab_col_statistics order by 1,2;

TABLE_NAME COLUMN_NAME HISTOGRAM NUM_DISTINCT NUM_BUCKETS SAMPLE_SIZE LAST_ANALYZED
------------------------------ ------------------------------ --------------- ------------ ----------- ----------- ---------------
OTHER_TEST_DATA CREATED_COL NONE 1380 1 151815 02-AUG 22:59:24
OTHER_TEST_DATA ID_COL NONE 50605 1 151815 02-AUG 22:59:24
OTHER_TEST_DATA NAME_COL NONE 30275 1 151815 02-AUG 22:59:24
OTHER_TEST_DATA OWNER_COL NONE 26 1 151815 02-AUG 22:59:24
OTHER_TEST_DATA TYPE_COL NONE 41 1 151815 02-AUG 22:59:24
SOURCE_TABLE CREATED NONE 1378 1 303666 02-AUG 22:59:29
SOURCE_TABLE DATA_OBJECT_ID NONE 4028 1 24684 02-AUG 22:59:29
SOURCE_TABLE GENERATED NONE 2 1 303666 02-AUG 22:59:29
SOURCE_TABLE LAST_DDL_TIME NONE 1424 1 303666 02-AUG 22:59:29
SOURCE_TABLE OBJECT_ID NONE 50611 1 303666 02-AUG 22:59:29
SOURCE_TABLE OBJECT_NAME NONE 30281 1 303666 02-AUG 22:59:29
SOURCE_TABLE OBJECT_TYPE NONE 41 1 303666 02-AUG 22:59:29
SOURCE_TABLE OWNER NONE 26 1 303666 02-AUG 22:59:29
SOURCE_TABLE SECONDARY NONE 2 1 303666 02-AUG 22:59:29
SOURCE_TABLE STATUS NONE 1 1 303666 02-AUG 22:59:29
SOURCE_TABLE SUBOBJECT_NAME NONE 102 1 2466 02-AUG 22:59:29
SOURCE_TABLE TEMPORARY NONE 2 1 303666 02-AUG 22:59:29
SOURCE_TABLE TIMESTAMP NONE 1487 1 303666 02-AUG 22:59:29
TEST_DATA CREATED_COL NONE 1380 1 151815 02-AUG 22:59:31
TEST_DATA ID_COL NONE 50605 1 151815 02-AUG 22:59:31
TEST_DATA NAME_COL NONE 30275 1 151815 02-AUG 22:59:31
TEST_DATA OWNER_COL NONE 18 1 151815 02-AUG 22:59:31
TEST_DATA TYPE_COL NONE 41 1 151815 02-AUG 22:59:31

23 rows selected.

22:59:47 SQL> REM Even at 100% sampling, NO Histograms are gathered by SIZE AUTO.
23:00:04 SQL> REM This would be because COL_USAGE$ doesn't reflect any information on queries.
23:00:16 SQL> select count(*) from other_test_data where owner_col like 'HEMAN%';

COUNT(*)
----------
9

23:01:12 SQL> exec dbms_stats.flush_database_monitoring_info;

PL/SQL procedure successfully completed.

23:01:21 SQL> select * from sys.col_usage$ where obj# in (select object_id from user_objects);

OBJ# INTCOL# EQUALITY_PREDS EQUIJOIN_PREDS NONEQUIJOIN_PREDS RANGE_PREDS LIKE_PREDS NULL_PREDS TIMESTAMP
---------- ---------- -------------- -------------- ----------------- ----------- ---------- ---------- ---------------
53395 1 0 0 0 0 1 0 02-AUG 23:01:20

23:01:42 SQL> REM a LIKE query on OTHER_TEST_DATA.OWNER_COL (object_id 53395 is 'OTHER_TEST_DATA' and Col#1 is 'OWNER_COL') is now reflected
23:02:30 SQL>
23:02:30 SQL> REM regather 100% with SIZE AUTO
23:02:36 SQL> exec dbms_stats.gather_schema_stats(user,estimate_percent=>100,method_opt=>'FOR ALL COLUMNS SIZE AUTO');

PL/SQL procedure successfully completed.

23:02:55 SQL> select table_name, column_name,histogram,num_distinct,num_buckets,sample_size,last_analyzed
23:03:07 2 from user_tab_col_statistics order by 1,2;

TABLE_NAME COLUMN_NAME HISTOGRAM NUM_DISTINCT NUM_BUCKETS SAMPLE_SIZE LAST_ANALYZED
------------------------------ ------------------------------ --------------- ------------ ----------- ----------- ---------------
OTHER_TEST_DATA CREATED_COL NONE 1380 1 151815 02-AUG 23:02:49
OTHER_TEST_DATA ID_COL NONE 50605 1 151815 02-AUG 23:02:49
OTHER_TEST_DATA NAME_COL NONE 30275 1 151815 02-AUG 23:02:49
OTHER_TEST_DATA OWNER_COL FREQUENCY 26 26 151815 02-AUG 23:02:49
OTHER_TEST_DATA TYPE_COL NONE 41 1 151815 02-AUG 23:02:49
SOURCE_TABLE CREATED NONE 1378 1 303666 02-AUG 23:02:54
SOURCE_TABLE DATA_OBJECT_ID NONE 4028 1 24684 02-AUG 23:02:54
SOURCE_TABLE GENERATED NONE 2 1 303666 02-AUG 23:02:54
SOURCE_TABLE LAST_DDL_TIME NONE 1424 1 303666 02-AUG 23:02:54
SOURCE_TABLE OBJECT_ID NONE 50611 1 303666 02-AUG 23:02:54
SOURCE_TABLE OBJECT_NAME NONE 30281 1 303666 02-AUG 23:02:54
SOURCE_TABLE OBJECT_TYPE NONE 41 1 303666 02-AUG 23:02:54
SOURCE_TABLE OWNER NONE 26 1 303666 02-AUG 23:02:54
SOURCE_TABLE SECONDARY NONE 2 1 303666 02-AUG 23:02:54
SOURCE_TABLE STATUS NONE 1 1 303666 02-AUG 23:02:54
SOURCE_TABLE SUBOBJECT_NAME NONE 102 1 2466 02-AUG 23:02:54
SOURCE_TABLE TEMPORARY NONE 2 1 303666 02-AUG 23:02:54
SOURCE_TABLE TIMESTAMP NONE 1487 1 303666 02-AUG 23:02:54
TEST_DATA CREATED_COL NONE 1380 1 151815 02-AUG 23:02:55
TEST_DATA ID_COL NONE 50605 1 151815 02-AUG 23:02:55
TEST_DATA NAME_COL NONE 30275 1 151815 02-AUG 23:02:55
TEST_DATA OWNER_COL NONE 18 1 151815 02-AUG 23:02:55
TEST_DATA TYPE_COL NONE 41 1 151815 02-AUG 23:02:55

23 rows selected.

23:03:15 SQL> REM Yes, we now see a Frequency Histogram (26 buckets for 26 known values) on OTHER_TEST_DATA.OWNER_COL
23:03:41 SQL> REM but no Histogram on other columns
23:03:46 SQL>
23:03:47 SQL>
23:03:47 SQL> REM Let's test SIZE SKEWONLY
23:03:55 SQL> exec dbms_stats.gather_schema_stats(user,estimate_percent=>100,method_opt=>'FOR ALL COLUMNS SIZE SKEWONLY');

PL/SQL procedure successfully completed.

23:04:26 SQL> select table_name, column_name,histogram,num_distinct,num_buckets,sample_size,last_analyzed
23:04:41 2 from user_tab_col_statistics order by 1,2;

TABLE_NAME COLUMN_NAME HISTOGRAM NUM_DISTINCT NUM_BUCKETS SAMPLE_SIZE LAST_ANALYZED
------------------------------ ------------------------------ --------------- ------------ ----------- ----------- ---------------
OTHER_TEST_DATA CREATED_COL HEIGHT BALANCED 1380 254 151815 02-AUG 23:04:15
OTHER_TEST_DATA ID_COL NONE 50605 1 151815 02-AUG 23:04:15
OTHER_TEST_DATA NAME_COL HEIGHT BALANCED 30275 254 151815 02-AUG 23:04:15
OTHER_TEST_DATA OWNER_COL FREQUENCY 26 26 151815 02-AUG 23:04:15
OTHER_TEST_DATA TYPE_COL FREQUENCY 41 41 151815 02-AUG 23:04:15
SOURCE_TABLE CREATED HEIGHT BALANCED 1378 254 303666 02-AUG 23:04:23
SOURCE_TABLE DATA_OBJECT_ID HEIGHT BALANCED 4028 254 24684 02-AUG 23:04:23
SOURCE_TABLE GENERATED FREQUENCY 2 2 303666 02-AUG 23:04:23
SOURCE_TABLE LAST_DDL_TIME HEIGHT BALANCED 1424 254 303666 02-AUG 23:04:23
SOURCE_TABLE OBJECT_ID NONE 50611 1 303666 02-AUG 23:04:23
SOURCE_TABLE OBJECT_NAME HEIGHT BALANCED 30281 254 303666 02-AUG 23:04:23
SOURCE_TABLE OBJECT_TYPE FREQUENCY 41 41 303666 02-AUG 23:04:23
SOURCE_TABLE OWNER FREQUENCY 26 26 303666 02-AUG 23:04:23
SOURCE_TABLE SECONDARY FREQUENCY 2 2 303666 02-AUG 23:04:23
SOURCE_TABLE STATUS FREQUENCY 1 1 303666 02-AUG 23:04:23
SOURCE_TABLE SUBOBJECT_NAME FREQUENCY 102 102 2466 02-AUG 23:04:23
SOURCE_TABLE TEMPORARY FREQUENCY 2 2 303666 02-AUG 23:04:23
SOURCE_TABLE TIMESTAMP HEIGHT BALANCED 1487 254 303666 02-AUG 23:04:23
TEST_DATA CREATED_COL HEIGHT BALANCED 1380 254 151815 02-AUG 23:04:25
TEST_DATA ID_COL NONE 50605 1 151815 02-AUG 23:04:25
TEST_DATA NAME_COL HEIGHT BALANCED 30275 254 151815 02-AUG 23:04:25
TEST_DATA OWNER_COL FREQUENCY 18 18 151815 02-AUG 23:04:25
TEST_DATA TYPE_COL FREQUENCY 41 41 151815 02-AUG 23:04:25

23 rows selected.

23:04:52 SQL> REM Yes, SIZE SKEWYONLY did generate Histograms on most columns (except ID_COL which is *not* a candidate for Histograms)
23:05:29 SQL>
23:05:29 SQL> spool off


Testing Gather Stats behaviour based on COL_USAGE

Here I am posting some results on tests with COL_USAGE$ and GATHER_STATS behaviours of 'FOR ALL COLUMNS SIZE AUTOSKEWONLY250' with different ESTIMATE_PERCENTs.

I've put in some, what I hope are, explanatory notes in the output.

These tests were on 10.2.0.4 where a few bugs are fixed. I had slightly different results in 10.2.0.2 (a different database so different rows counts). I plan to post the same test results on other 10.2.0.1/10.2.0.3 databases, when possible. The bugs relate to a higher degree of inaccuracy in colum histograms / histogram types. 10.2.0.4 seems to be more reliable, but not 100%



23:26:08 SQL> select table_name, num_rows, last_analyzed from user_tables;

TABLE_NAME NUM_ROWS LAST_ANALYZED
------------------------------ ---------- ---------------
SOURCE_TABLE

23:26:19 SQL> select * from sys.col_usage$ where obj# in (select object_id from user_objects);

no rows selected

23:26:38 SQL> create table test_data (owner_col varchar2(30), name_col varchar2(128), id_col number, type_col varchar2(19), created_col date);

Table created.

23:28:06 SQL> insert into test_data select owner, object_name, object_id, object_type, created from dba_objects;

50605 rows created.

23:28:28 SQL> commit;

Commit complete.

23:28:33 SQL> select object_id, data_object_id from user_objects where object_name ='TEST_DATA';

OBJECT_ID DATA_OBJECT_ID
---------- --------------
53392 53392

23:28:57 SQL> insert into test_data select * from test_data union all select * from test_data;

101210 rows created.

23:29:20 SQL> commit;

Commit complete.

23:29:24 SQL> select count(*) from test_data;

COUNT(*)
----------
151815

23:29:47 SQL> select count(*) from test_data where owner_col = 'HEMANT';

COUNT(*)
----------
9

23:30:09 SQL> exec dbms_stats.flush_database_monitoring_info;

PL/SQL procedure successfully completed.

23:30:26 SQL> select count(*) from test_data where owner_col = 'HEMANT';

COUNT(*)
----------
9

23:30:50 SQL> exec dbms_stats.flush_database_monitoring_info;

PL/SQL procedure successfully completed.

23:31:26 SQL> select * from sys.col_usage$ where obj# in (select object_id from user_objects);

OBJ# INTCOL# EQUALITY_PREDS EQUIJOIN_PREDS NONEQUIJOIN_PREDS RANGE_PREDS LIKE_PREDS NULL_PREDS TIMESTAMP
---------- ---------- -------------- -------------- ----------------- ----------- ---------- ---------- ---------------
53392 1 1 0 0 0 0 0 01-AUG 23:30:26

23:31:43 SQL> select object_name, object_id from user_objects;

OBJECT_NAME
--------------------------------------------------------------------------------------------------------------------------------
OBJECT_ID
----------
SOURCE_TABLE
53391

TEST_DATA
53392

LEV_SEQ
53322


23:32:00 SQL> REM So, for OBJ 53392 (table TEST_DATA), COL_USAGE$ show that Col#1 (OWNER_COL has had 1 query on Equality Predicate)
23:38:44 SQL>
23:38:44 SQL> select count(*) from test_data where id_col > 1000;

COUNT(*)
----------
148953

23:39:04 SQL> select * from sys.col_usage$ where obj# in (select object_id from user_objects);

OBJ# INTCOL# EQUALITY_PREDS EQUIJOIN_PREDS NONEQUIJOIN_PREDS RANGE_PREDS LIKE_PREDS NULL_PREDS TIMESTAMP
---------- ---------- -------------- -------------- ----------------- ----------- ---------- ---------- ---------------
53392 1 1 0 0 0 0 0 01-AUG 23:30:26

REM The query on iD_COL (which is INTCOL#3 in the table) isn't reflected in COL_USAGE$ yet.
REM We'll flush database monitoring to ensure that the data is reflected
REM note : A Gather_Statistics call would be expected to flush monitoring as well

23:39:13 SQL> exec dbms_stats.flush_database_monitoring_info;

PL/SQL procedure successfully completed.

23:39:23 SQL> select * from sys.col_usage$ where obj# in (select object_id from user_objects);

OBJ# INTCOL# EQUALITY_PREDS EQUIJOIN_PREDS NONEQUIJOIN_PREDS RANGE_PREDS LIKE_PREDS NULL_PREDS TIMESTAMP
---------- ---------- -------------- -------------- ----------------- ----------- ---------- ---------- ---------------
53392 1 1 0 0 0 0 0 01-AUG 23:30:26
53392 3 0 0 0 1 0 0 01-AUG 23:39:23

23:39:31 SQL> REM The entry for Col# (ID_COL) now shows that a RangeScan was executed
23:40:09 SQL>
23:40:09 SQL> exec dbms_stats.gather_schema_stats(user);

PL/SQL procedure successfully completed.

23:40:20 SQL> select table_name, num_rows, sample_size, last_analyzed from user_tables order by 1;

TABLE_NAME NUM_ROWS SAMPLE_SIZE LAST_ANALYZED
------------------------------ ---------- ----------- ---------------
SOURCE_TABLE 303108 60119 01-AUG 23:40:20
TEST_DATA 152116 38029 01-AUG 23:40:20

23:40:36 SQL> REM Oracle used an Auto Size to sample Table statistics.
23:40:58 SQL> REM what about columns ?

23:41:07 SQL> select column_name, num_distinct, density, num_buckets, histogram, sample_size, last_analyzed
23:41:28 2 from user_tab_col_statistics
23:41:32 3 where table_name = 'TEST_DATA';

COLUMN_NAME NUM_DISTINCT DENSITY NUM_BUCKETS HISTOGRAM SAMPLE_SIZE LAST_ANALYZED
------------------------------ ------------ ---------- ----------- --------------- ----------- ---------------
OWNER_COL 22 3.3324E-06 22 FREQUENCY 5832 01-AUG 23:40:20
NAME_COL 28387 .000035227 1 NONE 38029 01-AUG 23:40:20
ID_COL 50487 .000019807 1 NONE 38029 01-AUG 23:40:20
TYPE_COL 31 .032258065 1 NONE 5832 01-AUG 23:40:20
CREATED_COL 935 .001069519 1 NONE 5832 01-AUG 23:40:20

23:41:49 SQL> REM Oracle gathered a Frequency Histogram on OWNER_COL.

23:42:44 SQL> exec dbms_stats.gather_schema_stats(user,estimate_percent=>100);

PL/SQL procedure successfully completed.

23:43:06 SQL> select table_name, num_rows, sample_size, last_analyzed from user_tables order by 1;

TABLE_NAME NUM_ROWS SAMPLE_SIZE LAST_ANALYZED
------------------------------ ---------- ----------- ---------------
SOURCE_TABLE 303666 303666 01-AUG 23:43:04
TEST_DATA 151815 151815 01-AUG 23:43:06

23:43:14 SQL> select column_name, num_distinct, density, num_buckets, histogram, sample_size, last_analyzed
23:43:22 2 from user_tab_col_statistics
23:43:28 3 where table_name = 'TEST_DATA';

COLUMN_NAME NUM_DISTINCT DENSITY NUM_BUCKETS HISTOGRAM SAMPLE_SIZE LAST_ANALYZED
------------------------------ ------------ ---------- ----------- --------------- ----------- ---------------
OWNER_COL 26 3.2935E-06 26 FREQUENCY 151815 01-AUG 23:43:06
NAME_COL 30275 .000033031 1 NONE 151815 01-AUG 23:43:06
ID_COL 50605 .000019761 1 NONE 151815 01-AUG 23:43:06
TYPE_COL 41 .024390244 1 NONE 151815 01-AUG 23:43:06
CREATED_COL 1380 .000724638 1 NONE 151815 01-AUG 23:43:06

23:43:33 SQL> REM Sampling is 100% but no new histograms are gathered
23:44:24 SQL>
23:44:44 SQL> select count(*) from test_data where type_col like 'P%';

COUNT(*)
----------
5268

23:44:53 SQL> select * from sys.col_usage$ where obj# in (select object_id from user_objects);

OBJ# INTCOL# EQUALITY_PREDS EQUIJOIN_PREDS NONEQUIJOIN_PREDS RANGE_PREDS LIKE_PREDS NULL_PREDS TIMESTAMP
---------- ---------- -------------- -------------- ----------------- ----------- ---------- ---------- ---------------
53392 1 1 0 0 0 0 0 01-AUG 23:30:26
53392 3 0 0 0 1 0 0 01-AUG 23:39:23

23:45:06 SQL> REM Col_Usage still shows queries on OWNER_COL and ID_COL
23:45:21 SQL> exec dbms_stats.flush_database_monitoring_info;

PL/SQL procedure successfully completed.

23:45:31 SQL> select * from sys.col_usage$ where obj# in (select object_id from user_objects);

OBJ# INTCOL# EQUALITY_PREDS EQUIJOIN_PREDS NONEQUIJOIN_PREDS RANGE_PREDS LIKE_PREDS NULL_PREDS TIMESTAMP
---------- ---------- -------------- -------------- ----------------- ----------- ---------- ---------- ---------------
53392 1 1 0 0 0 0 0 01-AUG 23:30:26
53392 3 0 0 0 1 0 0 01-AUG 23:39:23
53392 4 0 0 0 0 1 0 01-AUG 23:45:31

23:45:38 SQL> REM Now we see the LIKE_PREDs having been used for Col#4 (TYPE_COL)
23:46:10 SQL> exec dbms_stats.gather_schema_stats(user);

PL/SQL procedure successfully completed.

23:46:27 SQL> select table_name, num_rows, sample_size, last_analyzed from user_tables order by 1;

TABLE_NAME NUM_ROWS SAMPLE_SIZE LAST_ANALYZED
------------------------------ ---------- ----------- ---------------
SOURCE_TABLE 303256 75814 01-AUG 23:46:27
TEST_DATA 152828 38207 01-AUG 23:46:27

23:46:40 SQL> select column_name, num_distinct, density, num_buckets, histogram, sample_size, last_analyzed
23:47:10 2 from user_tab_col_statistics
23:47:23 3 where table_name = 'TEST_DATA';

COLUMN_NAME NUM_DISTINCT DENSITY NUM_BUCKETS HISTOGRAM SAMPLE_SIZE LAST_ANALYZED
------------------------------ ------------ ---------- ----------- --------------- ----------- ---------------
OWNER_COL 22 .000003283 22 FREQUENCY 6332 01-AUG 23:46:27
NAME_COL 28368 .000035251 1 NONE 38207 01-AUG 23:46:27
ID_COL 50662 .000019739 1 NONE 38207 01-AUG 23:46:27
TYPE_COL 30 .000003283 30 FREQUENCY 6332 01-AUG 23:46:27
CREATED_COL 950 .001052632 1 NONE 6332 01-AUG 23:46:27

23:47:36 SQL> REM Now, Table level statistics are updated and a new Histogram on TYPE_COL has been gathered
23:48:44 SQL> REM The Presence of TYPE_COL (INTCOL#=4) in SYS.COL_USAGE$ drove the 'FOR ALL COLUMN SIZE AUTO
23:49:16 SQL> REM to generate a histogram on this column
23:49:34 SQL> REM Earlier, there was no Histogram on this column
23:50:04 SQL> REM Therefore, the first one or few queries on this column would have been executed without the
23:50:23 SQL> REM advantage of Histogram !
23:50:37 SQL>
23:50:47 SQL> create table other_test_data as select * from test_data;

Table created.

23:51:01 SQL> exec dbms_Stats.flush_database_monitoring_info;

PL/SQL procedure successfully completed.

23:51:11 SQL> select table_name, num_rows, sample_size, last_analyzed from user_tables order by 1;

TABLE_NAME NUM_ROWS SAMPLE_SIZE LAST_ANALYZED
------------------------------ ---------- ----------- ---------------
OTHER_TEST_DATA
SOURCE_TABLE 303256 75814 01-AUG 23:46:27
TEST_DATA 152828 38207 01-AUG 23:46:27

23:51:20 SQL> exec dbms_stats.gather_schema_stats(user);

PL/SQL procedure successfully completed.

23:51:57 SQL> select table_name, num_rows, sample_size, last_analyzed from user_tables order by 1;

TABLE_NAME NUM_ROWS SAMPLE_SIZE LAST_ANALYZED
------------------------------ ---------- ----------- ---------------
OTHER_TEST_DATA 151376 37844 01-AUG 23:51:56
SOURCE_TABLE 301861 51707 01-AUG 23:51:57
TEST_DATA 151992 37998 01-AUG 23:51:57

23:52:05 SQL> REM OK, we now have table statistics on OTHER_TEST_DATA
23:52:14 SQL> REM ... what about columns ?
23:52:20 SQL> select column_name, num_distinct, density, num_buckets, histogram, sample_size, last_analyzed
23:52:32 2 from user_tab_col_statistics
23:52:40 3 wehre table_name = 'OTHER_TEST_DATA'
23:52:47 SQL> l3
3* wehre table_name = 'OTHER_TEST_DATA'
23:52:49 SQL> c/wehre/where
3* where table_name = 'OTHER_TEST_DATA'
23:52:55 SQL> l
1 select column_name, num_distinct, density, num_buckets, histogram, sample_size, last_analyzed
2 from user_tab_col_statistics
3* where table_name = 'OTHER_TEST_DATA'
23:52:55 SQL> /

COLUMN_NAME NUM_DISTINCT DENSITY NUM_BUCKETS HISTOGRAM SAMPLE_SIZE LAST_ANALYZED
------------------------------ ------------ ---------- ----------- --------------- ----------- ---------------
OWNER_COL 25 .04 1 NONE 5781 01-AUG 23:51:56
NAME_COL 28382 .000035234 1 NONE 37844 01-AUG 23:51:56
ID_COL 50681 .000019731 1 NONE 37844 01-AUG 23:51:56
TYPE_COL 30 .033333333 1 NONE 5781 01-AUG 23:51:56
CREATED_COL 929 .001076426 1 NONE 5781 01-AUG 23:51:56

23:52:56 SQL> REM This table has exactly the same data as table "TEST_DATA"
23:53:10 SQL> REM Yet, Oracle's method_opt=>'FOR ALL COLUMNS SIZE AUTO' behaviour did not gather any histograms
23:53:34 SQL> REM The reason ? That sys.col_usage$ does not indicate any queries as having been executed yet !
23:53:55 SQL> REM slightly counter-intuitive ? Until queries are executed, Oracle will not decide which
23:54:18 SQL> REM histograms to create.
23:54:23 SQL> REM but the first few queries will, probably, "suffer" for lack of histograms
23:54:46 SQL>
23:54:46 SQL> exec dbms_stats.gather_schema_stats(user, method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

23:55:41 SQL> select table_name, num_rows, sample_size, last_analyzed from user_tables order by 1;

TABLE_NAME NUM_ROWS SAMPLE_SIZE LAST_ANALYZED
------------------------------ ---------- ----------- ---------------
OTHER_TEST_DATA 151724 37931 01-AUG 23:55:40
SOURCE_TABLE 302095 58785 01-AUG 23:55:41
TEST_DATA 152784 38196 01-AUG 23:55:41

23:55:54 SQL> col table_name format a18
23:56:15 SQL> select table_name, column_name, num_distinct, density, num_buckets, histogram, sample_size, last_analyzed
23:56:23 2 from user_tab_col_statistics
23:56:27 3 where table_name in ('TEST_DATA','OTHER_TEST_DATA') order by 2,1;

TABLE_NAME COLUMN_NAME NUM_DISTINCT DENSITY NUM_BUCKETS HISTOGRAM SAMPLE_SIZE LAST_ANALYZED
------------------ ------------------------------ ------------ ---------- ----------- --------------- ----------- ---------------
OTHER_TEST_DATA CREATED_COL 903 .00110742 1 NONE 4915 01-AUG 23:55:40
TEST_DATA CREATED_COL 899 .001112347 1 NONE 5021 01-AUG 23:55:41
OTHER_TEST_DATA ID_COL 50377 .00001985 1 NONE 37931 01-AUG 23:55:40
TEST_DATA ID_COL 50762 .0000197 1 NONE 38196 01-AUG 23:55:41
OTHER_TEST_DATA NAME_COL 28466 .00003513 1 NONE 37931 01-AUG 23:55:40
TEST_DATA NAME_COL 28556 .000035019 1 NONE 38196 01-AUG 23:55:41
OTHER_TEST_DATA OWNER_COL 23 .043478261 1 NONE 4915 01-AUG 23:55:40
TEST_DATA OWNER_COL 20 .05 1 NONE 5021 01-AUG 23:55:41
OTHER_TEST_DATA TYPE_COL 32 .03125 1 NONE 4915 01-AUG 23:55:40
TEST_DATA TYPE_COL 31 .032258065 1 NONE 5021 01-AUG 23:55:41

10 rows selected.

23:57:05 SQL> REM All the columns are without Histograms
23:57:16 SQL> REM The "method_opt=>'FOR ALL COLUMNS SIZE 1'" has deleted the Histograms
23:57:45 SQL>
23:57:55 SQL> exec dbms_stats.gather_schema_stats(user,method_opt=>'FOR ALL COLUMNS SIZE SKEWONLY');

PL/SQL procedure successfully completed.

23:58:11 SQL> select table_name, num_rows, sample_size, last_analyzed from user_tables order by 1;

TABLE_NAME NUM_ROWS SAMPLE_SIZE LAST_ANALYZED
------------------ ---------- ----------- ---------------
OTHER_TEST_DATA 151456 37864 01-AUG 23:58:07
SOURCE_TABLE 303666 303666 01-AUG 23:58:10
TEST_DATA 152804 38201 01-AUG 23:58:11

23:58:19 SQL> select table_name, column_name, num_distinct, density, num_buckets, histogram, sample_size, last_analyzed
23:58:28 2 from user_tab_col_statistics
23:58:35 3 where table_name in ('TEST_DATA','OTHER_TEST_DATA') order by 2, 1 desc;

TABLE_NAME COLUMN_NAME NUM_DISTINCT DENSITY NUM_BUCKETS HISTOGRAM SAMPLE_SIZE LAST_ANALYZED
------------------ ------------------------------ ------------ ---------- ----------- --------------- ----------- ---------------
TEST_DATA CREATED_COL 957 .002409639 254 HEIGHT BALANCED 6456 01-AUG 23:58:11
OTHER_TEST_DATA CREATED_COL 886 .00243309 254 HEIGHT BALANCED 5061 01-AUG 23:58:07
TEST_DATA ID_COL 50294 .000019883 1 NONE 38201 01-AUG 23:58:11
OTHER_TEST_DATA ID_COL 50622 .000019754 1 NONE 37864 01-AUG 23:58:07
TEST_DATA NAME_COL 28410 .000052648 254 HEIGHT BALANCED 38201 01-AUG 23:58:11
OTHER_TEST_DATA NAME_COL 28447 .000052918 254 HEIGHT BALANCED 37864 01-AUG 23:58:07
TEST_DATA OWNER_COL 23 3.3141E-06 23 FREQUENCY 6456 01-AUG 23:58:11
OTHER_TEST_DATA OWNER_COL 20 3.2683E-06 20 FREQUENCY 5061 01-AUG 23:58:07
TEST_DATA TYPE_COL 31 3.3141E-06 31 FREQUENCY 6456 01-AUG 23:58:11
OTHER_TEST_DATA TYPE_COL 31 3.2683E-06 31 FREQUENCY 5061 01-AUG 23:58:07

10 rows selected.

23:58:47 SQL> REM Aah ! SIZE SKEWONLY caused Histograms to be gathered on most columns
23:59:13 SQL> REM but do you notice that sample sizes are different between the two tables ?
23:59:26 SQL> REM and that estmated "num_distinct" and "num_buckets" aren't the same ! (even though the data is the same !)
23:59:51 SQL> REM sampling errors have creeped in !
REM I'd like to test this back in 10.2.0.x where x is less than 4 !!
00:00:00 SQL>
00:00:01 SQL> exec dbms_stats.gather_schema_stats(user,estimate_percent=>100,method_opt=>'FOR ALL COLUMNS SIZE SKEWONLY');

PL/SQL procedure successfully completed.

00:00:41 SQL> select table_name, column_name, num_distinct, density, num_buckets, histogram, sample_size, last_analyzed
00:00:49 2 from user_tab_col_statistics
00:00:55 3 where table_name in ('TEST_DATA','OTHER_TEST_DATA') order by 2, 1 desc;

TABLE_NAME COLUMN_NAME NUM_DISTINCT DENSITY NUM_BUCKETS HISTOGRAM SAMPLE_SIZE LAST_ANALYZED
------------------ ------------------------------ ------------ ---------- ----------- --------------- ----------- ---------------
TEST_DATA CREATED_COL 1380 .002350099 254 HEIGHT BALANCED 151815 02-AUG 00:00:41
OTHER_TEST_DATA CREATED_COL 1380 .002350099 254 HEIGHT BALANCED 151815 02-AUG 00:00:30
TEST_DATA ID_COL 50605 .000019761 1 NONE 151815 02-AUG 00:00:41
OTHER_TEST_DATA ID_COL 50605 .000019761 1 NONE 151815 02-AUG 00:00:30
TEST_DATA NAME_COL 30275 .0000387 254 HEIGHT BALANCED 151815 02-AUG 00:00:41
OTHER_TEST_DATA NAME_COL 30275 .0000387 254 HEIGHT BALANCED 151815 02-AUG 00:00:30
TEST_DATA OWNER_COL 26 3.2935E-06 26 FREQUENCY 151815 02-AUG 00:00:41
OTHER_TEST_DATA OWNER_COL 26 3.2935E-06 26 FREQUENCY 151815 02-AUG 00:00:30
TEST_DATA TYPE_COL 41 3.2935E-06 41 FREQUENCY 151815 02-AUG 00:00:41
OTHER_TEST_DATA TYPE_COL 41 3.2935E-06 41 FREQUENCY 151815 02-AUG 00:00:30

10 rows selected.

00:01:03 SQL> REM Now, with 100% sampling and SIZE SKEWONLY, we have "better" histograms
00:02:18 SQL> REM (see, e.g. how OWNER_COL's NUM_DISTINCT is up from 23 and 20 to 26 and 26 for the two tables)
00:02:45 SQL> REM (or, TYPE_COL NUM_DISTINCT is up from 31 to 41)
REM I'd like to test this back in 10.2.0.x where x is less than 4 !!
00:03:01 SQL>
00:09:42 SQL> exec dbms_stats.gather_schema_stats(user,estimate_percent=>100,method_opt=>'FOR ALL COLUMNS SIZE AUTO');

PL/SQL procedure successfully completed.

00:11:02 SQL> select table_name, column_name, num_distinct, density, num_buckets, histogram, sample_size, last_analyzed
00:11:09 2 from user_tab_col_statistics
00:11:15 3 where table_name in ('TEST_DATA','OTHER_TEST_DATA') order by 2, 1 desc;

TABLE_NAME COLUMN_NAME NUM_DISTINCT DENSITY NUM_BUCKETS HISTOGRAM SAMPLE_SIZE LAST_ANALYZED
------------------ ------------------------------ ------------ ---------- ----------- --------------- ----------- ---------------
TEST_DATA CREATED_COL 1380 .000724638 1 NONE 151815 02-AUG 00:11:02
OTHER_TEST_DATA CREATED_COL 1380 .000724638 1 NONE 151815 02-AUG 00:10:54
TEST_DATA ID_COL 50605 .000019761 1 NONE 151815 02-AUG 00:11:02
OTHER_TEST_DATA ID_COL 50605 .000019761 1 NONE 151815 02-AUG 00:10:54
TEST_DATA NAME_COL 30275 .000033031 1 NONE 151815 02-AUG 00:11:02
OTHER_TEST_DATA NAME_COL 30275 .000033031 1 NONE 151815 02-AUG 00:10:54
TEST_DATA OWNER_COL 26 3.2935E-06 26 FREQUENCY 151815 02-AUG 00:11:02
OTHER_TEST_DATA OWNER_COL 26 .038461538 1 NONE 151815 02-AUG 00:10:54
TEST_DATA TYPE_COL 41 3.2935E-06 41 FREQUENCY 151815 02-AUG 00:11:02
OTHER_TEST_DATA TYPE_COL 41 .024390244 1 NONE 151815 02-AUG 00:10:54

10 rows selected.

00:11:21 SQL> REM Note how the "SIZE AUTO" reverted to Histograms on OWNER_COL and TYPE_COL only in TEST_DATA !
00:12:29 SQL> REM It isn't intelligent enough to figure out that OTHER_TEST_DATA's columns are also candidates
00:13:03 SQL>
00:13:20 SQL> select count(*) from other_test_data where type_col like 'T%';

COUNT(*)
----------
12291

00:13:34 SQL> exec dbms_Stats.flush_database_monitoring_info;

PL/SQL procedure successfully completed.

00:14:00 SQL> exec dbms_stats.gather_schema_stats(user,estimate_percent=>100,method_opt=>'FOR ALL COLUMNS SIZE AUTO');

PL/SQL procedure successfully completed.

00:14:27 SQL> select table_name, column_name, num_distinct, density, num_buckets, histogram, sample_size, last_analyzed
00:14:28 2 from user_tab_col_statistics
00:14:34 3 where table_name in ('TEST_DATA','OTHER_TEST_DATA') order by 2, 1 desc;

TABLE_NAME COLUMN_NAME NUM_DISTINCT DENSITY NUM_BUCKETS HISTOGRAM SAMPLE_SIZE LAST_ANALYZED
------------------ ------------------------------ ------------ ---------- ----------- --------------- ----------- ---------------
TEST_DATA CREATED_COL 1380 .000724638 1 NONE 151815 02-AUG 00:14:27
OTHER_TEST_DATA CREATED_COL 1380 .000724638 1 NONE 151815 02-AUG 00:14:19
TEST_DATA ID_COL 50605 .000019761 1 NONE 151815 02-AUG 00:14:27
OTHER_TEST_DATA ID_COL 50605 .000019761 1 NONE 151815 02-AUG 00:14:19
TEST_DATA NAME_COL 30275 .000033031 1 NONE 151815 02-AUG 00:14:27
OTHER_TEST_DATA NAME_COL 30275 .000033031 1 NONE 151815 02-AUG 00:14:19
TEST_DATA OWNER_COL 26 3.2935E-06 26 FREQUENCY 151815 02-AUG 00:14:27
OTHER_TEST_DATA OWNER_COL 26 .038461538 1 NONE 151815 02-AUG 00:14:19
TEST_DATA TYPE_COL 41 3.2935E-06 41 FREQUENCY 151815 02-AUG 00:14:27
OTHER_TEST_DATA TYPE_COL 41 3.2935E-06 41 FREQUENCY 151815 02-AUG 00:14:19

10 rows selected.

00:14:41 SQL> REM Now it has gathered a Histogram on OTHER_TEST_DATA.TYPE_COL because it is now reflected in SYS.COL_USAGE$
00:15:23 SQL> select t.table_name, cu.* from user_objects t, sys.col_usage$ cu
00:15:55 2 where t.object_id=cu.obj#
00:16:04 SQL> l1
1* select t.table_name, cu.* from user_objects t, sys.col_usage$ cu
00:16:07 SQL> c/table/object
1* select t.object_name, cu.* from user_objects t, sys.col_usage$ cu
00:16:10 SQL> l
1 select t.object_name, cu.* from user_objects t, sys.col_usage$ cu
2* where t.object_id=cu.obj#
00:16:11 SQL> /

OBJECT_NAME
--------------------------------------------------------------------------------------------------------------------------------
OBJ# INTCOL# EQUALITY_PREDS EQUIJOIN_PREDS NONEQUIJOIN_PREDS RANGE_PREDS LIKE_PREDS NULL_PREDS TIMESTAMP
---------- ---------- -------------- -------------- ----------------- ----------- ---------- ---------- ---------------
TEST_DATA
53392 4 0 0 0 0 1 0 01-AUG 23:45:31

TEST_DATA
53392 3 0 0 0 1 0 0 01-AUG 23:39:23

TEST_DATA
53392 1 1 0 0 0 0 0 01-AUG 23:30:26

OTHER_TEST_DATA
53395 4 0 0 0 0 1 0 02-AUG 00:14:00


00:16:12 SQL> REM Col#4 in both tables is TYPE_COL
00:16:45 SQL> REM In both tables, I have run a LIKE query on this column
00:17:02 SQL> REM The TIMESTAMP shows that there has been no query on TEST_DATA recently, but there has been 1 query on OTHER_TEST_DATA
00:17:32 SQL>
00:17:32 SQL> exec dbms_stats.gather_schema_stats(user,estimate_percent=>100,method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

00:18:43 SQL> select table_name, column_name, num_distinct, density, num_buckets, histogram, sample_size, last_analyzed
00:18:50 2 from user_tab_col_statistics
00:19:04 3 where table_name in ('TEST_DATA','OTHER_TEST_DATA') order by 2, 1 desc;

TABLE_NAME COLUMN_NAME NUM_DISTINCT DENSITY NUM_BUCKETS HISTOGRAM SAMPLE_SIZE LAST_ANALYZED
------------------ ------------------------------ ------------ ---------- ----------- --------------- ----------- ---------------
TEST_DATA CREATED_COL 1380 .000724638 1 NONE 151815 02-AUG 00:18:43
OTHER_TEST_DATA CREATED_COL 1380 .000724638 1 NONE 151815 02-AUG 00:18:36
TEST_DATA ID_COL 50605 .000019761 1 NONE 151815 02-AUG 00:18:43
OTHER_TEST_DATA ID_COL 50605 .000019761 1 NONE 151815 02-AUG 00:18:36
TEST_DATA NAME_COL 30275 .000033031 1 NONE 151815 02-AUG 00:18:43
OTHER_TEST_DATA NAME_COL 30275 .000033031 1 NONE 151815 02-AUG 00:18:36
TEST_DATA OWNER_COL 26 .038461538 1 NONE 151815 02-AUG 00:18:43
OTHER_TEST_DATA OWNER_COL 26 .038461538 1 NONE 151815 02-AUG 00:18:36
TEST_DATA TYPE_COL 41 .024390244 1 NONE 151815 02-AUG 00:18:43
OTHER_TEST_DATA TYPE_COL 41 .024390244 1 NONE 151815 02-AUG 00:18:36

10 rows selected.

00:19:08 SQL> REM Now, the "SIZE 1" has deleted all Histograms
00:19:21 SQL>
00:19:21 SQL> exec dbms_stats.gather_schema_stats(user,estimate_percent=>100,method_opt=>'FOR ALL COLUMNS SIZE 250');

PL/SQL procedure successfully completed.

00:19:51 SQL> select table_name, column_name, num_distinct, density, num_buckets, histogram, sample_size, last_analyzed
00:19:53 2
00:20:00 SQL>
00:20:01 SQL> select table_name, column_name, num_distinct, density, num_buckets, histogram, sample_size, last_analyzed
00:20:05 2 from user_tab_col_statistics
00:20:11 3 where table_name in ('TEST_DATA','OTHER_TEST_DATA') order by 2, 1 desc;

TABLE_NAME COLUMN_NAME NUM_DISTINCT DENSITY NUM_BUCKETS HISTOGRAM SAMPLE_SIZE LAST_ANALYZED
------------------ ------------------------------ ------------ ---------- ----------- --------------- ----------- ---------------
TEST_DATA CREATED_COL 1380 .002372155 250 HEIGHT BALANCED 151815 02-AUG 00:19:51
OTHER_TEST_DATA CREATED_COL 1380 .002372155 250 HEIGHT BALANCED 151815 02-AUG 00:19:41
TEST_DATA ID_COL 50605 .000019761 250 HEIGHT BALANCED 151815 02-AUG 00:19:51
OTHER_TEST_DATA ID_COL 50605 .000019761 250 HEIGHT BALANCED 151815 02-AUG 00:19:41
TEST_DATA NAME_COL 30275 .0000387 250 HEIGHT BALANCED 151815 02-AUG 00:19:51
OTHER_TEST_DATA NAME_COL 30275 .0000387 250 HEIGHT BALANCED 151815 02-AUG 00:19:41
TEST_DATA OWNER_COL 26 3.2935E-06 26 FREQUENCY 151815 02-AUG 00:19:51
OTHER_TEST_DATA OWNER_COL 26 3.2935E-06 26 FREQUENCY 151815 02-AUG 00:19:41
TEST_DATA TYPE_COL 41 3.2935E-06 41 FREQUENCY 151815 02-AUG 00:19:51
OTHER_TEST_DATA TYPE_COL 41 3.2935E-06 41 FREQUENCY 151815 02-AUG 00:19:41

10 rows selected.

00:20:21 SQL> REM Conclusions :
00:21:30 SQL> REM 1. SIZE AUTO relies on prior queries being reflected on SYS.COL_USAGE$
00:21:48 SQL> REM If this is a new table or the column hasn't been queried, Oracle will not gather Histograms even if it is a candidate
00:22:15 SQL> REM 2. SIZE SKEWONLY gathers histograms irrespective of whether the table and/or column has been queried before
00:22:48 SQL> REM 3. The Sample Size is a signficant determinant of accuracy (NUM_DISTINCT and NUM_BUCKETS)
REM Another thing to test in 10.2.0.[1-3]
00:23:23 SQL>
00:23:24 SQL>
00:23:24 SQL>
REM Causing a very high degree of skew
00:24:18 SQL> update test_data set owner_col = 'HEMANT_X' where mod(id_col,100) != 0;

150273 rows updated.

00:24:43 SQL> commit;

Commit complete.

00:24:45 SQL> exec dbms_stats.gather_schema_stats(user);

PL/SQL procedure successfully completed.

00:25:15 SQL> select table_name, column_name, num_distinct, density, num_buckets, histogram, sample_size, last_analyzed
00:25:26 2 from user_tab_col_statistics
00:25:32 3 where table_name in ('TEST_DATA','OTHER_TEST_DATA') order by 2, 1 desc;

TABLE_NAME COLUMN_NAME NUM_DISTINCT DENSITY NUM_BUCKETS HISTOGRAM SAMPLE_SIZE LAST_ANALYZED
------------------ ------------------------------ ------------ ---------- ----------- --------------- ----------- ---------------
TEST_DATA CREATED_COL 934 .001070664 1 NONE 5950 02-AUG 00:25:15
OTHER_TEST_DATA CREATED_COL 941 .001062699 1 NONE 5730 02-AUG 00:25:13
TEST_DATA ID_COL 50279 .000019889 1 NONE 37863 02-AUG 00:25:15
OTHER_TEST_DATA ID_COL 50729 .000019713 1 NONE 37813 02-AUG 00:25:13
TEST_DATA NAME_COL 28255 .000035392 1 NONE 37863 02-AUG 00:25:15
OTHER_TEST_DATA NAME_COL 28643 .000034913 1 NONE 37813 02-AUG 00:25:13
TEST_DATA OWNER_COL 10 3.2925E-06 10 FREQUENCY 5950 02-AUG 00:25:15
OTHER_TEST_DATA OWNER_COL 24 .041666667 1 NONE 5730 02-AUG 00:25:13
TEST_DATA TYPE_COL 35 3.2925E-06 35 FREQUENCY 5950 02-AUG 00:25:15
OTHER_TEST_DATA TYPE_COL 30 3.2623E-06 30 FREQUENCY 5730 02-AUG 00:25:13

10 rows selected.

00:25:42 SQL> select owner_col, count(*) from test_data group by owner_col;

OWNER_COL COUNT(*)
------------------------------ ----------
MDSYS 27
DMSYS 3
PUBLIC 987
OLAPSYS 27
CTXSYS 12
SYSTEM 9
EXFSYS 6
SH 9
ORDSYS 48
ORDPLUGINS 3
SYSMAN 39
OE 6
XDB 27
IX 3
BI 3
HEMANT_X 150273
SYS 321
WMSYS 12

18 rows selected.

00:26:11 SQL> REM With very high skew (HEMANT_X accounting for 99% of the rows)
00:26:28 SQL> REM the AUTO_SAMPLE_SIZE of 5,950 generated a Frequency Histogram of 10 buckets for 10 distinct values
00:27:17 SQL> REM However, the column actually has 18 distint values !!
REM Another thing to test in 10.2.0.[1-3]

00:27:33 SQL> exec dbms_stats.gather_schema_stats(user,estimate_percent=>20);

PL/SQL procedure successfully completed.

00:27:58 SQL> select table_name, column_name, num_distinct, density, num_buckets, histogram, sample_size, last_analyzed
00:28:13 2 from user_tab_col_statistics
00:28:24 3 where table_name in ('TEST_DATA','OTHER_TEST_DATA') order by 2, 1 desc;

TABLE_NAME COLUMN_NAME NUM_DISTINCT DENSITY NUM_BUCKETS HISTOGRAM SAMPLE_SIZE LAST_ANALYZED
------------------ ------------------------------ ------------ ---------- ----------- --------------- ----------- ---------------
TEST_DATA CREATED_COL 1269 .000788022 1 NONE 30211 02-AUG 00:27:58
OTHER_TEST_DATA CREATED_COL 1262 .000792393 1 NONE 30390 02-AUG 00:27:54
TEST_DATA ID_COL 50492 .000019805 1 NONE 30211 02-AUG 00:27:58
OTHER_TEST_DATA ID_COL 50341 .000019865 1 NONE 30390 02-AUG 00:27:54
TEST_DATA NAME_COL 28226 .000035428 1 NONE 30211 02-AUG 00:27:58
OTHER_TEST_DATA NAME_COL 28169 .0000355 1 NONE 30390 02-AUG 00:27:54
TEST_DATA OWNER_COL 17 3.3101E-06 17 FREQUENCY 30211 02-AUG 00:27:58
OTHER_TEST_DATA OWNER_COL 26 .038461538 1 NONE 30390 02-AUG 00:27:54
TEST_DATA TYPE_COL 38 3.3101E-06 38 FREQUENCY 30211 02-AUG 00:27:58
OTHER_TEST_DATA TYPE_COL 36 3.2906E-06 36 FREQUENCY 30390 02-AUG 00:27:54

10 rows selected.

00:28:33 SQL> REM With a 20% sample, Oracle came up with an estimate of 17 distinct values.
00:29:36 SQL>
00:29:37 SQL> spool off




I hope that you can follow what might seem haphazard testing.
I hadn't prepared a "test script" with all the queries in advance but was building the queries as I went along, based on results of preceding queries.