Let's assume a business case where we want to track customers who have ever placed orders with us.
(The business case could be extended to identify customers who have registered with us at least 3 years ago but have not placed any order in the last 12months --- they could either be targets for an advertising or promotions/discount campaign or customers to be retired, depending on the nature of our business and products. But I'll keep the query simple here).
Name Null? Type _________________ ___________ _______________ CUST_ID NOT NULL NUMBER(6) CUST_NAME NOT NULL VARCHAR2(56) CUST_REGN_DATE NOT NULL DATE SQL> select count(*) from customers; COUNT(*) ___________ 100 SQL> desc sales_orders; Name Null? Type _____________ ___________ _____________ ORDER_ID NOT NULL NUMBER(12) SALE_DATE DATE CUST_ID NUMBER(6) PRODUCT_ID NUMBER(6) QUANTITY NUMBER(12) SQL> SQL> select count(*) from sales_orders; COUNT(*) ___________ 1979580 SQL> select count(distinct(cust_id)) from sales_orders; COUNT(DISTINCT(CUST_ID)) ___________________________ 99 SQL> SQL> create table sale_made_tracker (cust_id number(6,0), sales_made varchar2(3)); Table SALE_MADE_TRACKER created. SQL>
Given that we have 100 customers and 1.98million sales orders, we want to add an entry to the SALE_MADE_TRACKER for a customer with a sales order. However, this is an "on-demand" query that an analyst runs from a front-end, customer-by-customer for specific customers. Thus, it is not executed as a batch for all 100 customers and 1.98million sales orders.
The developer likes to use Bind Variables and comes up with a query like this :
This allows the user to specify a customer name without knowing the cust_id when executing the query.
insert into sale_made_tracker select cust_id, 'YES' from customers a where a.cust_name = :bind_cust_name and exists (select '1' from customers c, sales_orders s where s.cust_id=c.cust_id and c.cust_id = a.cust_id)
This allows the user to specify a customer name without knowing the cust_id when executing the query.
When testing this in 23c, I've had to add the NO_UNNEST Hint so that I could see the FILTER operation (similar to what I saw in a current 12.1 database):
So, for the purpose of this demo of the FILTER operation for an EXISTS clause, I will use the second form of the query.
SQL> explain plan for 2 insert into sale_made_tracker 3 select cust_id, 'YES' 4 from customers a 5 where a.cust_name = :bind_cust_name 6 and exists 7 (select '1' 8 from customers c, sales_orders s 9 where s.cust_id=c.cust_id 10 and c.cust_id = a.cust_id) 11 / Explained. SQL> SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT _________________________________________________________________________________________________ Plan hash value: 1657757803 ---------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------------- | 0 | INSERT STATEMENT | | 1 | 45 | 2447 (1)| 00:00:01 | | 1 | LOAD TABLE CONVENTIONAL | SALE_MADE_TRACKER | | | | | | 2 | NESTED LOOPS SEMI | | 1 | 45 | 2447 (1)| 00:00:01 | |* 3 | TABLE ACCESS FULL | CUSTOMERS | 1 | 43 | 3 (0)| 00:00:01 | | 4 | VIEW PUSHED PREDICATE | VW_SQ_1 | 208 | 416 | 2444 (1)| 00:00:01 | | 5 | NESTED LOOPS | | 20828 | 528K| 2444 (1)| 00:00:01 | |* 6 | INDEX UNIQUE SCAN | SYS_C0012587 | 1 | 13 | 1 (0)| 00:00:01 | |* 7 | TABLE ACCESS FULL | SALES_ORDERS | 20828 | 264K| 2443 (1)| 00:00:01 | ---------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter("A"."CUST_NAME"=:BIND_CUST_NAME) 6 - access("C"."CUST_ID"="A"."CUST_ID") 7 - filter("S"."CUST_ID"="A"."CUST_ID") Note ----- PLAN_TABLE_OUTPUT ___________________________________________________________ - dynamic statistics used: dynamic sampling (level=2) 25 rows selected. SQL> delete plan_table; 15 rows deleted. SQL> SQL> explain plan for 2 insert into sale_made_tracker 3 select cust_id, 'YES' 4 from customers a 5 where a.cust_name = :bind_cust_name 6 and exists 7 (select /*+ NO_UNNEST */ '1' 8 from customers c, sales_orders s 9 where s.cust_id=c.cust_id 10 and c.cust_id = a.cust_id) 11 / Explained. SQL> SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT _________________________________________________________________________________________________ Plan hash value: 2963538242 ---------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------------- | 0 | INSERT STATEMENT | | 1 | 43 | 6 (0)| 00:00:01 | | 1 | LOAD TABLE CONVENTIONAL | SALE_MADE_TRACKER | | | | | |* 2 | FILTER | | | | | | |* 3 | TABLE ACCESS FULL | CUSTOMERS | 1 | 43 | 3 (0)| 00:00:01 | | 4 | NESTED LOOPS | | 1 | 26 | 3 (0)| 00:00:01 | |* 5 | INDEX UNIQUE SCAN | SYS_C0012587 | 1 | 13 | 1 (0)| 00:00:01 | |* 6 | TABLE ACCESS FULL | SALES_ORDERS | 20828 | 264K| 2 (0)| 00:00:01 | ---------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter( EXISTS (SELECT /*+ NO_UNNEST */ 0 FROM "SALES_ORDERS" "S","CUSTOMERS" "C" WHERE "C"."CUST_ID"=:B1 AND "S"."CUST_ID"=:B2)) 3 - filter("A"."CUST_NAME"=:BIND_CUST_NAME) 5 - access("C"."CUST_ID"=:B1) 6 - filter("S"."CUST_ID"=:B1) Note PLAN_TABLE_OUTPUT ___________________________________________________________ ----- - dynamic statistics used: dynamic sampling (level=2) 26 rows selected. SQL> delete plan_table; 7 rows deleted. SQL>
So, for the purpose of this demo of the FILTER operation for an EXISTS clause, I will use the second form of the query.
If I run the query for customer name "HEMANT" :
SQL> exec :bind_cust_name := 'HEMANT'
PL/SQL procedure successfully completed. SQL> SQL> insert /*+ gather_plan_statistics */ into sale_made_tracker 2 select cust_id, 'YES' 3 from customers a 4 where a.cust_name = :bind_cust_name 5 and exists 6 (select /*+ NO_UNNEST */ '1' 7 from customers c, sales_orders s 8 where s.cust_id=c.cust_id 9 and c.cust_id = a.cust_id) 10 / 1 row inserted. SQL> SQL> select * from table(dbms_xplan.display_cursor(format=>'ALLSTATS LAST')); PLAN_TABLE_OUTPUT ___________________________________________________________________________________________________________ SQL_ID 1j3jggfsurwg4, child number 0 ------------------------------------- insert /*+ gather_plan_statistics */ into sale_made_tracker select cust_id, 'YES' from customers a where a.cust_name = :bind_cust_name and exists (select /*+ NO_UNNEST */ '1' from customers c, sales_orders s where s.cust_id=c.cust_id and c.cust_id = a.cust_id) Plan hash value: 2963538242 -------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | -------------------------------------------------------------------------------------------------------- | 0 | INSERT STATEMENT | | 1 | | 0 |00:00:00.01 | 195 | | 1 | LOAD TABLE CONVENTIONAL | SALE_MADE_TRACKER | 1 | | 0 |00:00:00.01 | 195 | |* 2 | FILTER | | 1 | | 1 |00:00:00.01 | 193 | |* 3 | TABLE ACCESS FULL | CUSTOMERS | 1 | 1 | 1 |00:00:00.01 | 7 | | 4 | NESTED LOOPS | | 1 | 1 | 1 |00:00:00.01 | 186 | |* 5 | INDEX UNIQUE SCAN | SYS_C0012587 | 1 | 1 | 1 |00:00:00.01 | 1 | |* 6 | TABLE ACCESS FULL | SALES_ORDERS | 1 | 20828 | 1 |00:00:00.01 | 185 | -------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- PLAN_TABLE_OUTPUT ___________________________________________________________ 2 - filter( IS NOT NULL) 3 - filter("A"."CUST_NAME"=:BIND_CUST_NAME) 5 - access("C"."CUST_ID"=:B1) 6 - filter("S"."CUST_ID"=:B1) Note ----- - dynamic statistics used: dynamic sampling (level=2) 33 rows selected. SQL>Which shows that a total of 185 blocks were read from the SALES_ORDERS table for "HEMANT"'s orders. Plus 1 block for the Index Unique Scan and 7 blocks for the Full Table Scan of CUSOMERS (totally 193 blocks for the SELECT) {Ignore the A-Rows being 0 for the actual INSERT in the Plan output, the execution shows that 1 row was inserted)
But if I run the query for customer name "LARRY" :
SQL> exec :bind_cust_name := 'LARRY' PL/SQL procedure successfully completed. SQL> SQL> insert /*+ gather_plan_statistics */ into sale_made_tracker 2 select cust_id, 'YES' 3 from customers a 4 where a.cust_name = :bind_cust_name 5 and exists 6 (select /*+ NO_UNNEST */ '1' 7 from customers c, sales_orders s 8 where s.cust_id=c.cust_id 9 and c.cust_id = a.cust_id) 10 / 0 rows inserted. SQL> SQL> select * from table(dbms_xplan.display_cursor(format=>'ALLSTATS LAST')); PLAN_TABLE_OUTPUT ___________________________________________________________________________________________________________ SQL_ID 1j3jggfsurwg4, child number 0 ------------------------------------- insert /*+ gather_plan_statistics */ into sale_made_tracker select cust_id, 'YES' from customers a where a.cust_name = :bind_cust_name and exists (select /*+ NO_UNNEST */ '1' from customers c, sales_orders s where s.cust_id=c.cust_id and c.cust_id = a.cust_id) Plan hash value: 2963538242 -------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | -------------------------------------------------------------------------------------------------------- | 0 | INSERT STATEMENT | | 1 | | 0 |00:00:00.06 | 8723 | | 1 | LOAD TABLE CONVENTIONAL | SALE_MADE_TRACKER | 1 | | 0 |00:00:00.06 | 8723 | |* 2 | FILTER | | 1 | | 0 |00:00:00.06 | 8723 | |* 3 | TABLE ACCESS FULL | CUSTOMERS | 1 | 1 | 1 |00:00:00.01 | 7 | | 4 | NESTED LOOPS | | 1 | 1 | 0 |00:00:00.06 | 8716 | |* 5 | INDEX UNIQUE SCAN | SYS_C0012587 | 1 | 1 | 1 |00:00:00.01 | 1 | |* 6 | TABLE ACCESS FULL | SALES_ORDERS | 1 | 20828 | 0 |00:00:00.06 | 8715 | -------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- PLAN_TABLE_OUTPUT ___________________________________________________________ 2 - filter( IS NOT NULL) 3 - filter("A"."CUST_NAME"=:BIND_CUST_NAME) 5 - access("C"."CUST_ID"=:B1) 6 - filter("S"."CUST_ID"=:B1) Note ----- - dynamic statistics used: dynamic sampling (level=2) 33 rows selected. SQL>
Which shows that a total of 8,715 blocks were read for "LARRY"'s orders from the SALES_ORDERS table alone but actual rows returned was 0 -- so no rows were inserted into SALE_MADE_TRACKER.
Both "HEMANT" and "LARRY" exist in the CUSTOMERS table.
While "HEMANT" has 20,502 sales orders, "LARRY", despite being a registered customer, has not issued any order.
SQL> select cust_id, cust_name 2 from customers 3 where cust_id in (1,2) 4* / CUST_ID CUST_NAME __________ ____________ 1 HEMANT 2 LARRY SQL> SQL> select cust_id, count(*) 2 from sales_orders 3 where cust_id in (1,2) 4 group by cust_id 5 order by 1 6* / CUST_ID COUNT(*) __________ ___________ 1 20502 SQL> SQL> select /*+ gather_plan_statistics FULL (s) */ count(*) 2 from sales_orders s 3* / COUNT(*) ___________ 1979580 SQL> select * from table(dbms_xplan.display_cursor(format=>'ALLSTATS LAST')); PLAN_TABLE_OUTPUT ________________________________________________________________________________________________ SQL_ID 042jxgv8y5xny, child number 0 ------------------------------------- select /*+ gather_plan_statistics FULL (s) */ count(*) from sales_orders s Plan hash value: 1409771706 --------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | --------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.03 | 8715 | | 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.03 | 8715 | | 2 | TABLE ACCESS FULL| SALES_ORDERS | 1 | 2082K| 1979K|00:00:00.02 | 8715 | --------------------------------------------------------------------------------------------- Note ----- - dynamic statistics used: dynamic sampling (level=2) 19 rows selected. SQL>
While "HEMANT" has 20,502 sales orders, "LARRY", despite being a registered customer, has not issued any order.
The EXISTS query for "HEMANT" had to read only 185 blocks from the SALES_ORDERS table until it was *stopped* by the FILTER operation because the FILTER here stops the subquery as soon as the first row is returned (which you can also see as actual rows being "1" from the SALES_ORDERS table) -- as soon as the subquery returns TRUE.
"LARRY" has no orders so the EXISTS query ran through the whole SALES_ORDERS table to read 8,715 blocks.
Side note on the use of the Bind Variable here : The fact that the developer used Bind Variables means that the Estimated Rows from the SALES_ORDERS table in the Execution Plan was always going to be the same -- no matter which customer name was queried for -- because it prevents the use of Histograms (if they existed) on the tale.