b¬äd†ødd

Dedicated to design and performance of databases and audio systems.

Set Operators and the ALL Option

While on a recent trip, I was asked about the ALL option on the MINUS set operator. The scenario was along these lines: the top table of a MINUS contains two duplicate records, and the bottom table contains one matching record. With MINUS ALL, would one of the duplicate records from the top table still be present in the result?

First, let's back up and refresh on SET operators and the ALL option. SET operators allow us to join data records in a vertical fashion, whereas traditional joins (e.g. INNER JOIN, LEFT OUTER JOIN) join data horizontally. A UNION combines two sets of data. An EXCEPT (in Teradata, more commonly written as MINUS) subtracts the records in the bottom set from the top set. Lastly, INTERSECT keeps only the records that exist in both the top and bottom sets.
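As a rough analogy (Python here, not Teradata SQL, and the row values are made up for illustration), the three operators map to set union, difference, and intersection:

```python
# Distinct (set) semantics of the three operators, shown with Python
# sets; in SQL, each element would be a row of column values.
top = {(1, 'A'), (2, 'B'), (3, 'C')}
bot = {(2, 'B'), (4, 'D')}

print(sorted(top | bot))  # UNION: rows in either set
print(sorted(top - bot))  # MINUS/EXCEPT: top rows not in the bottom
print(sorted(top & bot))  # INTERSECT: rows present in both
```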

Second, Teradata is strictly algebraic in its principles. You have probably heard me say that before. Here, what we mean by that is that a SET is unique: there are no duplicate records. Much like a SET table does not allow duplicate records in it, neither does a SET operator. So, to ensure uniqueness, a DISTINCT process must be invoked, and as we have experienced with DISTINCT functionality, that can be expensive.

The way to circumvent the uniqueness check is to use the ALL option (e.g. UNION ALL). ALL allows duplicates, much like a MULTISET table does. You do want to be sure your process would not produce duplicate records where they were not intended; but if your sets are mutually exclusive, then by all means use the ALL option to reduce overhead.

Let's demonstrate with some tables and data, starting with UNION:

-- top table
create multiset volatile table top_t, no log (
  id  integer  	not null,
  dte date     	not null,
  cde character(1) not null)
primary index (id)
on commit preserve rows
;
 
-- bottom table
create multiset volatile table bot_t, no log (
  id  integer  	not null,
  dte date     	not null,
  cde character(1) not null)
primary index (id)
on commit preserve rows
;
 
-- data load
insert into top_t (1,date '2013-04-24','J');
insert into top_t (2,date '2013-04-24','E');
insert into top_t (2,date '2013-04-24','F');
insert into top_t (3,date '2013-04-24','F');
 
insert into bot_t (2,date '2013-04-24','Y');
insert into bot_t (3,date '2013-04-24','B');
 
-- statistics
collect statistics on top_t index  (id);
collect statistics on top_t column (partition);
collect statistics on top_t column (dte);
collect statistics on bot_t index  (id);
collect statistics on bot_t column (partition);
collect statistics on bot_t column (dte);
 
-- UNION
explain
select id,dte
from top_t
union
select id,dte
from bot_t
;
 
-- UNION ALL
explain
select id,dte
from top_t
union all
select id,dte
from bot_t
;

The resulting EXPLAIN plans, for UNION and UNION ALL respectively:

  1) First, we do an all-AMPs RETRIEVE step from top_t by way
 	of an all-rows scan with no residual conditions into Spool 1
 	(group_amps), which is redistributed by the hash code of (
 	top_t.dte, top_t.id) to all AMPs.  The size of
 	Spool 1 is estimated with high confidence to be 5 rows (185 bytes).
 	The estimated time for this step is 0.01 seconds.
  2) Next, we do an all-AMPs RETRIEVE step from bot_t by way
 	of an all-rows scan with no residual conditions into Spool 1
 	(group_amps), which is redistributed by the hash code of (
 	bot_t.dte, bot_t.id) to all AMPs.  Then we do a
 	SORT to order Spool 1 by the sort key in spool field1 eliminating
 	duplicate rows.  The size of Spool 1 is estimated with low
 	confidence to be 5 rows (185 bytes).  The estimated time for this
 	step is 0.01 seconds.
  3) Finally, we send out an END TRANSACTION step to all AMPs involved
 	in processing the request.
  -> The contents of Spool 1 are sent back to the user as the result of
 	statement 1.  The total estimated time is 0.02 seconds.
 
  1) First, we do an all-AMPs RETRIEVE step from top_t by way
 	of an all-rows scan with no residual conditions into Spool 1
 	(group_amps), which is built locally on the AMPs.  The size of
 	Spool 1 is estimated with high confidence to be 5 rows (145 bytes).
 	The estimated time for this step is 0.01 seconds.
  2) Next, we do an all-AMPs RETRIEVE step from bot_t by way
 	of an all-rows scan with no residual conditions into Spool 1
 	(group_amps), which is built locally on the AMPs.  The size of
 	Spool 1 is estimated with high confidence to be 8 rows (232 bytes).
 	The estimated time for this step is 0.01 seconds.
  3) Finally, we send out an END TRANSACTION step to all AMPs involved
 	in processing the request.
  -> The contents of Spool 1 are sent back to the user as the result of
 	statement 1.  The total estimated time is 0.02 seconds.

Here, the UNION and the UNION ALL each have two RETRIEVE steps, but the UNION adds a SORT and a duplicate-row elimination after the merge. And, prior to the merge, it rehashes and redistributes the records for the duplicate elimination. This is exactly like a DISTINCT process. The UNION ALL does no duplicate elimination, no SORT, and no rehashing and redistribution. Now, our test uses very small data, but extrapolate the process to much larger volumes and you can see how much more work the UNION must do relative to the UNION ALL.
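To make the duplicate handling concrete with the same data (again a Python sketch of the semantics, not of the Teradata internals): top_t projects (id, dte) as ids 1, 2, 2, 3 and bot_t as ids 2, 3, so UNION ALL returns all six rows while UNION returns three distinct rows.

```python
# UNION vs UNION ALL semantics on the article's data, as (id, dte) pairs.
d = '2013-04-24'
top = [(1, d), (2, d), (2, d), (3, d)]   # top_t projected on (id, dte)
bot = [(2, d), (3, d)]                   # bot_t projected on (id, dte)

union_all = top + bot                    # UNION ALL: simple concatenation
union = sorted(set(union_all))           # UNION: DISTINCT over combined rows

print(len(union_all))  # 6 rows, duplicates kept
print(len(union))      # 3 rows: ids 1, 2, 3
```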

However, our original question was about MINUS versus MINUS ALL. So let's run a MINUS and a MINUS ALL on the same data and compare the processes, with the plans shown in that order below.

  1) First, we execute the following steps in parallel.
   	1) We do an all-AMPs RETRIEVE step from top_t by way of
      	an all-rows scan with no residual conditions into Spool 1
      	(all_amps), which is redistributed by the hash code of (
      	top_t.dte, top_t.id) to all AMPs.  Then we
      	do a SORT to order Spool 1 by row hash and the sort key in
      	spool field1 eliminating duplicate rows.  The size of Spool 1
      	is estimated with low confidence to be 3 rows (111 bytes).
      	The estimated time for this step is 0.01 seconds.
   	2) We do an all-AMPs RETRIEVE step from bot_t by way of
      	an all-rows scan with no residual conditions into Spool 2
      	(all_amps), which is redistributed by the hash code of (
      	bot_t.dte, bot_t.id) to all AMPs.  Then we
      	do a SORT to order Spool 2 by row hash and the sort key in
      	spool field1 eliminating duplicate rows.  The size of Spool 2
      	is estimated with high confidence to be 2 rows (74 bytes).
      	The estimated time for this step is 0.01 seconds.
  2) Next, we do an all-AMPs JOIN step from Spool 1 (Last Use) by way
 	of an all-rows scan, which is joined to Spool 2 (Last Use) by way
 	of an all-rows scan.  Spool 1 and Spool 2 are joined using an
 	exclusion merge join, with a join condition of ("Field_1 = Field_1").
 	The result goes into Spool 3 (group_amps), which is built locally
 	on the AMPs.  The size of Spool 3 is estimated with low confidence
 	to be 2 rows (74 bytes).  The estimated time for this step is 0.01
 	seconds.
  3) Finally, we send out an END TRANSACTION step to all AMPs involved
 	in processing the request.
  -> The contents of Spool 3 are sent back to the user as the result of
 	statement 1.  The total estimated time is 0.02 seconds.
 
  1) First, we execute the following steps in parallel.
   	1) We do an all-AMPs RETRIEVE step from top_t by way of
      	an all-rows scan with no residual conditions into Spool 1
      	(all_amps), which is redistributed by the hash code of (
      	top_t.dte, top_t.id) to all AMPs.  Then we
      	do a SORT to order Spool 1 by the sort key in spool field1.
      	The size of Spool 1 is estimated with high confidence to be 5
      	rows (185 bytes).  The estimated time for this step is 0.01
      	seconds.
   	2) We do an all-AMPs RETRIEVE step from bot_t by way of
      	an all-rows scan with no residual conditions into Spool 2
      	(all_amps), which is redistributed by the hash code of (
      	bot_t.dte, bot_t.id) to all AMPs.  Then we
      	do a SORT to order Spool 2 by the sort key in spool field1.
      	The size of Spool 2 is estimated with high confidence to be 3
      	rows (111 bytes).  The estimated time for this step is 0.01
      	seconds.
  2) Next, we do an all-AMPs JOIN step from Spool 1 (Last Use) by way
 	of an all-rows scan, which is joined to Spool 2 (Last Use) by way
 	of an all-rows scan.  Spool 1 and Spool 2 are joined using a MINUS
 	ALL join, with a join condition of ("Field_1 = Field_1").  The
 	result goes into Spool 3 (group_amps), which is built locally on
 	the AMPs.  Then we do a SORT to order Spool 3 by the sort key in
 	spool field1 (top_t.id, top_t.dte).  The size of
 	Spool 3 is estimated with low confidence to be 4 rows (148 bytes).
 	The estimated time for this step is 0.01 seconds.
  3) Finally, we send out an END TRANSACTION step to all AMPs involved
 	in processing the request.
  -> The contents of Spool 3 are sent back to the user as the result of
 	statement 1.  The total estimated time is 0.02 seconds.

Here the MINUS query performs the SORT, the rehash and redistribution of records, and the duplicate elimination process. The MINUS ALL performs the SORT and the rehash and redistribution as well. In step 2, the MINUS does an AMP-local exclusion merge join, while the MINUS ALL performs another SORT. The MINUS ALL, however, does no duplicate elimination. For efficiency's sake, the two solve different problems, and it would be interesting to see how they compare on a much larger scale of data.

One last thing: how do the results differ?

MINUS:
1 	4/24/2013
 
MINUS ALL:
1 	4/24/2013
2 	4/24/2013

The MINUS ALL eliminated one of the ID 2 records; but since there were two ID 2 records in the top table, one of them still made it through to the final result.
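In multiset terms, EXCEPT ALL keeps max(top_count - bottom_count, 0) copies of each row. A small Python sketch of that counting rule on the same data, using collections.Counter (whose subtraction drops non-positive counts, matching the rule):

```python
from collections import Counter

d = '2013-04-24'
top = Counter([(1, d), (2, d), (2, d), (3, d)])  # two copies of id 2
bot = Counter([(2, d), (3, d)])

# MINUS ALL: per-row result count is max(top - bottom, 0).
minus_all = top - bot
print(sorted(minus_all.elements()))  # one id-2 row survives

# MINUS (distinct): set difference after deduplication.
minus = sorted(set(top) - set(bot))
print(minus)  # only the id-1 row
```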

I hope this provided some more material for thinking through how Teradata processes data, and that it helps you out in a future need.

Cheers
-Brad