26 September, 2016

SQLLoader DIRECT option and Unique Index

The DIRECT parameter for SQLLoader Command-Line enables Direct Path Load which uses a Direct Path API instead of a regular INSERT statement to load data into the target table.

However, one needs to know how it handles a Unique Index on the target table.
It actually leaves the  UNUSABLE if, duplicate values are loaded. The Index rebuild fails but the duplicate values remain in the table.

Here is a quick demo  (this in 12.1.0.2 MultiTenant). I first setup the target table with a Unique Index.

[oracle@ora12102 Desktop]$ sqlplus hemant/hemant@PDB1

SQL*Plus: Release 12.1.0.2.0 Production on Mon Sep 26 22:36:51 2016

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Last Successful login time: Mon Sep 26 2016 22:26:16 +08:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
and Real Application Testing options

SQL> show con_id

CON_ID
------------------------------
3
SQL> create table test_sqlldr_direct
  2  (id_column number,
  3  data_column varchar2(15))
  4  /

Table created.

SQL> create unique index test_sqlldr_direct_u1 on test_sqlldr_direct(id_column);

Index created.

SQL> insert into test_sqlldr_direct values (1, 'First Row');

1 row created.

SQL> commit;

Commit complete.

SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
and Real Application Testing options
[oracle@ora12102 Desktop]$ 

Next, I setup the datafile with a duplicate record and the controlfile.

[oracle@ora12102 Desktop]$ ls -l
total 8
-rw-r--r-- 1 oracle oinstall   40 Sep 26 22:40 load_data.dat
-rw-r--r-- 1 oracle oinstall  165 Sep 26 22:45 load_control.ctl
[oracle@ora12102 Desktop]$ cat load_data.dat
2,'Second Row'
3,'Third Row'
3,'Oops !'
[oracle@ora12102 Desktop]$ cat load_control.ctl
LOAD DATA
INFILE load_data.dat
APPEND  INTO TABLE TEST_SQLLDR_DIRECT
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(
id_column,
data_column)
[oracle@ora12102 Desktop]$ 

I am now ready to run a Direct Path Load.

[oracle@ora12102 Desktop]$ sqlldr hemant/hemant@pdb1 control=load_control.ctl direct=TRUE

SQL*Loader: Release 12.1.0.2.0 - Production on Mon Sep 26 22:47:09 2016

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

Path used:      Direct

Load completed - logical record count 3.

Table TEST_SQLLDR_DIRECT:
  3 Rows successfully loaded.

Check the log file:
  load_control.log
for more information about the load.
[oracle@ora12102 Desktop]$ 

What is that ? 3 rows loaded successfully ?  So, the duplicate row also did get loaded ?  Let's check the log file.

[oracle@ora12102 Desktop]$ ls -ltr
total 12
-rw-r--r-- 1 oracle oinstall   40 Sep 26 22:40 load_data.dat
-rw-r--r-- 1 oracle oinstall  165 Sep 26 22:45 load_control.ctl
-rw-r--r-- 1 oracle oinstall 1833 Sep 26 22:47 load_control.log
[oracle@ora12102 Desktop]$ cat load_control.log

SQL*Loader: Release 12.1.0.2.0 - Production on Mon Sep 26 22:47:09 2016

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

Control File:   load_control.ctl
Data File:      load_data.dat
  Bad File:     load_data.bad
  Discard File:  none specified
 
 (Allow all discards)

Number to load: ALL
Number to skip: 0
Errors allowed: 50
Continuation:    none specified
Path used:      Direct

Table TEST_SQLLDR_DIRECT, loaded from every logical record.
Insert option in effect for this table: APPEND
TRAILING NULLCOLS option in effect

   Column Name                  Position   Len  Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
ID_COLUMN                           FIRST     *   ,  O(") CHARACTER            
DATA_COLUMN                          NEXT     *   ,  O(") CHARACTER            

The following index(es) on table TEST_SQLLDR_DIRECT were processed:
index HEMANT.TEST_SQLLDR_DIRECT_U1 was made unusable due to:
ORA-01452: cannot CREATE UNIQUE INDEX; duplicate keys found

Table TEST_SQLLDR_DIRECT:
  3 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  0 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.

Bind array size not used in direct path.
Column array  rows :    5000
Stream buffer bytes:  256000
Read   buffer bytes: 1048576

Total logical records skipped:          0
Total logical records read:             3
Total logical records rejected:         0
Total logical records discarded:        0
Total stream buffers loaded by SQL*Loader main thread:        1
Total stream buffers loaded by SQL*Loader load thread:        0

Run began on Mon Sep 26 22:47:09 2016
Run ended on Mon Sep 26 22:47:11 2016

Elapsed time was:     00:00:01.88
CPU time was:         00:00:00.01
[oracle@ora12102 Desktop]$ 

Did you notice the section in the log file that says :
The following index(es) on table TEST_SQLLDR_DIRECT were processed:
index HEMANT.TEST_SQLLDR_DIRECT_U1 was made unusable due to:
ORA-01452: cannot CREATE UNIQUE INDEX; duplicate keys found

Apparently, the Index is left UNUSABLE.

[oracle@ora12102 Desktop]$ sqlplus hemant/hemant@pdb1

SQL*Plus: Release 12.1.0.2.0 Production on Mon Sep 26 22:50:51 2016

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Last Successful login time: Mon Sep 26 2016 22:47:09 +08:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
and Real Application Testing options

SQL> select status from user_indexes     
  2  where index_name = 'TEST_SQLLDR_DIRECT_U1'        
  3  /

STATUS
--------
UNUSABLE

SQL> select * from test_sqlldr_direct order by 1;

 ID_COLUMN DATA_COLUMN
---------- ---------------
  1 First Row
  2 'Second Row'
  3 'Third Row'
  3 'Oops !'

SQL> alter index test_sqlldr_direct_u1 rebuild;
alter index test_sqlldr_direct_u1 rebuild
*
ERROR at line 1:
ORA-01452: cannot CREATE UNIQUE INDEX; duplicate keys found


SQL> 

We can see the duplicated row for ID_COLUMN=3 and find that the Index cannot be rebuilt.  Oracle has allowed duplicate rows to load and left the Index UNUSABLE.

So, if you are planning to use DIRECT=TRUE and have a Unique Index, make sure you check the status of the Index and/or check the Log file before you proceed with processing the data.


Conversely, here is how the data is handled without DIRECT=TRUE :.

SQL> truncate table test_sqlldr_direct;

Table truncated.

SQL> insert into test_sqlldr_direct values (1,'First Row');

1 row created.

SQL> select status from user_indexes
  2  where index_name = 'TEST_SQLLDR_DIRECT_U1'
  3  /

STATUS
--------
VALID

SQL> 

[oracle@ora12102 Desktop]$ rm load_control.log
[oracle@ora12102 Desktop]$ sqlldr hemant/hemant@PDB1 control=load_control.ctl

SQL*Loader: Release 12.1.0.2.0 - Production on Mon Sep 26 22:59:58 2016

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

Path used:      Conventional
Commit point reached - logical record count 3

Table TEST_SQLLDR_DIRECT:
  2 Rows successfully loaded.

Check the log file:
  load_control.log
for more information about the load.
[oracle@ora12102 Desktop]$ 
[oracle@ora12102 Desktop]$ ls -ltr
total 16
-rw-r--r-- 1 oracle oinstall   40 Sep 26 22:40 load_data.dat
-rw-r--r-- 1 oracle oinstall  165 Sep 26 22:45 load_control.ctl
-rw-r--r-- 1 oracle oinstall   11 Sep 26 22:59 load_data.bad
-rw-r--r-- 1 oracle oinstall 1668 Sep 26 22:59 load_control.log
[oracle@ora12102 Desktop]$ cat load_data.bad
3,'Oops !'
[oracle@ora12102 Desktop]$ cat load_control.log

SQL*Loader: Release 12.1.0.2.0 - Production on Mon Sep 26 22:59:58 2016

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

Control File:   load_control.ctl
Data File:      load_data.dat
  Bad File:     load_data.bad
  Discard File:  none specified
 
 (Allow all discards)

Number to load: ALL
Number to skip: 0
Errors allowed: 50
Bind array:     64 rows, maximum of 256000 bytes
Continuation:    none specified
Path used:      Conventional

Table TEST_SQLLDR_DIRECT, loaded from every logical record.
Insert option in effect for this table: APPEND
TRAILING NULLCOLS option in effect

   Column Name                  Position   Len  Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
ID_COLUMN                           FIRST     *   ,  O(") CHARACTER            
DATA_COLUMN                          NEXT     *   ,  O(") CHARACTER            

Record 3: Rejected - Error on table TEST_SQLLDR_DIRECT.
ORA-00001: unique constraint (HEMANT.TEST_SQLLDR_DIRECT_U1) violated


Table TEST_SQLLDR_DIRECT:
  2 Rows successfully loaded.
  1 Row not loaded due to data errors.
  0 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.


Space allocated for bind array:                  33024 bytes(64 rows)
Read   buffer bytes: 1048576

Total logical records skipped:          0
Total logical records read:             3
Total logical records rejected:         1
Total logical records discarded:        0

Run began on Mon Sep 26 22:59:58 2016
Run ended on Mon Sep 26 22:59:58 2016

Elapsed time was:     00:00:00.07
CPU time was:         00:00:00.00
[oracle@ora12102 Desktop]$ 

SQL> select * from test_sqlldr_direct 
  2  order by id_column
  3  /

 ID_COLUMN DATA_COLUMN
---------- ---------------
  1 First Row
  2 'Second Row'
  3 'Third Row'

SQL> 
SQL> select status from user_indexes
  2  where index_name = 'TEST_SQLLDR_DIRECT_U1'
  3  /

STATUS
--------
VALID

SQL> 

The duplicate row was rejected and went to the BAD file and was REJECTED.
.
.
.
So, the next time you chose to use SQLLDR DIRECT=TRUE for its performance benefits, make sure you know how to validate the status of Unique Indexes and check the log file ! Else, you might allow duplicates if the incoming data has not been cleansed before loading into the database.
.
.
.
If you are running 11g, the behaviour is the same in 11g/
.
.
.
.

No comments: