10 February, 2009

Full Table Scan, Arraysize etc.

I received some questions on my last blog post about array processing. Here is one of them:



Can an FTS (retrieving more than 40% of the data out of some 20 million rows) be optimized with all of these things applied at one time?

1. Increase the array size. (I feel this applies only to static SQLs, not to the PeopleSoft Application Engine programs.)

2. Set an optimum (increased) value for DB_FILE_MULTIBLOCK_READ_COUNT.

3. Set a parallel degree at the query level.

4. Defragment the table if there is any fragmentation.



I can see the below information in the documentation:

When Oracle performs a full table scan, the blocks are read sequentially.

The size of the read calls ranges from one block to the number of blocks indicated by the initialization parameter DB_FILE_MULTIBLOCK_READ_COUNT.

Using multiblock reads means a full table scan can be performed very efficiently.

Each block is read only once. (I am not able to make this out.)

I would assume that with single-block reads, too, each block is read only once.


This was my response:

1. ARRAYSIZE is a client-side specification.
Thus, "SET ARRAYSIZE" is specified in SQLPlus and determines how many rows SQLPlus will ask the Oracle server process for at each round-trip. If your application is not SQLPlus but some other client, you have to check the ARRAYSIZE specification for that client (e.g. TOAD, I think, in the current version, uses 500 as a default?).
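For example, in SQLPlus (the table name below is an illustrative placeholder; AUTOTRACE reports "SQL*Net roundtrips to/from client" in its statistics, so you can see the effect of ARRAYSIZE directly):

```sql
-- Fetch 100 rows per round-trip instead of the SQL*Plus default of 15
SET ARRAYSIZE 100

-- TRACEONLY STATISTICS suppresses the row output but shows the round-trip count
SET AUTOTRACE TRACEONLY STATISTICS
SELECT * FROM my_large_table;   -- "my_large_table" is an illustrative name
```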

However, the number of SQLNet round trips is also impacted by the packet size. Default EtherNet packets (not using JumboFrames) are 1500 bytes. SQLNet SDU and TDU sizes are 2K. If you fetch more than 2K (say 3K at each array fetch), then SQLNet breaks it up into 2 SQLNet packets. However, EtherNet would break it up into 3 packets (the first 2000 bytes as two packets of 1500 and 500, and the next 1000 bytes as one packet). Of course, you have to factor in packet header overheads in both SQLNet and EtherNet. SDU and TDU are configurable but need to be set in both the tnsnames.ora and the listener.ora.
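That packet arithmetic can be sketched quickly. This is a back-of-the-envelope sketch using the round figures from above (a 2000-byte SDU and a 1500-byte EtherNet payload) and ignoring all header overheads:

```python
import math

SDU = 2000   # SQL*Net session data unit (the ~2K default, round figure as above)
MTU = 1500   # default EtherNet payload without JumboFrames

def packets_per_fetch(fetch_bytes):
    """Approximate packet counts for one array fetch, ignoring
    SQL*Net and EtherNet header overheads."""
    sqlnet = math.ceil(fetch_bytes / SDU)
    ethernet = 0
    remaining = fetch_bytes
    # each SQL*Net packet is independently split into EtherNet frames
    for _ in range(sqlnet):
        chunk = min(SDU, remaining)
        ethernet += math.ceil(chunk / MTU)
        remaining -= chunk
    return sqlnet, ethernet

# a 3000-byte fetch: 2 SQL*Net packets (2000 + 1000) but
# 3 EtherNet packets (1500 + 500, then 1000)
print(packets_per_fetch(3000))   # -> (2, 3)
```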

One issue with ARRAYSIZE is that the client also has a limited BufferSize. If the RowSize is very large, then a large ARRAYSIZE results in a BufferOverflow.
SQLPlus clients used to have a 32K buffer (and probably still do in 32-bit SQLPlus?). Therefore, SQLPlus queries would sometimes fail with a large ARRAYSIZE because RowSize X ArraySize would exceed the buffer.
Again, this depends on what client software your application uses.
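The buffer constraint works out simply (assuming the historical 32K SQLPlus buffer mentioned above):

```python
CLIENT_BUFFER = 32 * 1024   # the historical 32K SQL*Plus fetch buffer

def max_safe_arraysize(row_size_bytes):
    """Largest ARRAYSIZE for which RowSize X ArraySize still fits the buffer."""
    return CLIENT_BUFFER // row_size_bytes

print(max_safe_arraysize(200))    # a 200-byte row allows ARRAYSIZE up to 163
print(max_safe_arraysize(2000))   # a 2000-byte row allows only up to 16
```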

Also, see my other blog posting, where I show the impact of PAGESIZE and LINESIZE in SQLPlus.

Another dimension is the number of columns. Ask whether you really need SELECT * or only SELECT COL1, COL2. The latter reduces the fetched row size significantly and, therefore, the number of rows that can fit into one packet is much larger, allowing a larger ARRAYSIZE.
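So, where only two columns are actually needed (column and table names here are illustrative):

```sql
-- Narrower rows: more rows fit in each packet, permitting a larger ARRAYSIZE
SELECT col1, col2 FROM my_table;   -- instead of SELECT * FROM my_table
```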

Finally, if your operation is INSERT INTO table_1 SELECT * FROM table_2, then ARRAYSIZE does not matter at all. As I said, ARRAYSIZE is a client-side parameter. If you are running an INSERT, then everything happens in the server. As you are talking of 40% of 20 million rows, I presume that your operation is an INSERT operation. ARRAYSIZE doesn't matter there. (Also, if you are inserting across a DBLink, Oracle ignores ARRAYSIZE; I think I have a blog posting on that as well.)
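That is, for an operation like this, no rows ever travel to the client, so ARRAYSIZE plays no part (table names as in the example above):

```sql
-- Executes entirely within the server process; no client round-trips for rows
INSERT INTO table_1
SELECT * FROM table_2;
COMMIT;
```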


2. Yes, DB_FILE_MULTIBLOCK_READ_COUNT will help. The "autotuned" DB_FILE_MULTIBLOCK_READ_COUNT (when it is NOT specified at the system or session level) works well only in 10.2.0.4.
Note that extents which are smaller than 1MB will break up your multiblock reads into smaller reads anyway. Also, for blocks present in the db_cache, the read from the extents containing those blocks would also be "broken" up.
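If you do want to set it manually for a test, it can be done at the session level (the value 128 is illustrative; leave the parameter unset in 10.2.0.4 to get the autotuned behaviour):

```sql
-- Session-level override; does not affect other sessions
ALTER SESSION SET db_file_multiblock_read_count = 128;
```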

3. Yes, PARALLEL as a hint is always better than PARALLEL at the table level. Better control!
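A hint scopes the parallelism to the one query rather than to every access of the table (degree 4 and the table name are illustrative):

```sql
-- Parallel degree 4 for this query only; the table-level setting is untouched
SELECT /*+ FULL(t) PARALLEL(t, 4) */ COUNT(*)
FROM   big_table t;
```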

4. "Fragmentation" is difficult to define. However, if your table has undergone a large purge recently it may make sense to rebuild the table.


What is the documentation (URL !!!) that says "Each block is read only once" ?

A block could contain 10 to 1000 rows. If you are doing a FullTableScan and Oracle needs all the rows, Oracle needs to "read" the block only once to get all the 1000 rows.
If you are doing an IndexRangeScan and the table is badly clustered, the IndexRangeScan may "visit" the block 1000 times, effectively "reading" it 1000 times.
For example, suppose the block contains the rows with the key values "A", "H", "F", "L", "Z". An IndexRangeScan doing an ordered read for all the values between, say, "A" and "M" would visit the block for "A", then go to another block for "B" and somewhere else for "C" …, and return to this same block again for "F", another block for "G", back here for "H" ….. You get the picture now? In this case, a FullTableScan *may* be better, depending on how many blocks it has to read up to the HWM versus how many key values, how many Index Blocks, and how many Table Block "visits" occur. If this block contained the values "A", "B", "C", "D", "E", then it has a good Clustering Factor and Oracle has to visit this block only once to read all 5 rows (if the Index has no key values between "A" and "B", between "B" and "C", etc.).
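The difference in block "visits" can be sketched with a small simulation (the key-to-block layouts below are hypothetical, in the spirit of the example above):

```python
def block_visits(ordered_keys, block_of):
    """Count table-block visits for an ordered IndexRangeScan: a new
    'visit' happens whenever the next key lives in a different block
    from the previous one."""
    visits = 0
    prev = None
    for key in ordered_keys:
        blk = block_of[key]
        if blk != prev:
            visits += 1
            prev = blk
    return visits

# Badly clustered: block 1 holds "A", "F", "H", "L", scattered among the keys
bad  = {"A": 1, "B": 2, "C": 3, "F": 1, "G": 4, "H": 1, "L": 1, "M": 5}
# Well clustered: block 1 holds "A" through "E" contiguously
good = {"A": 1, "B": 1, "C": 1, "D": 1, "E": 1, "F": 2, "G": 2, "H": 2}

print(block_visits(sorted(bad), bad))     # block 1 gets revisited repeatedly
print(block_visits(sorted(good), good))   # each block is visited once
```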
