Learn performance optimization skills from TPCH tests - Q20
I. Query Requirement
Q20 is to identify suppliers in a given nation that can provide a more competitive price of a particular part in a certain year. The so-called more competitive suppliers refer to those suppliers who have an excess of a given part available; an excess is defined to be more than 50% of the certain parts that the supplier shipped for a given nation in a given year.
Q20 is characterized as two tables join operations with sorting, aggregation, IN sub-query and ordinary sub-query.
II. Oracle Execution
The query SQL written in Oracle is as follows:
    select  /*+ parallel(n) */
	s_name,s_address
    from
	supplier,nation
    where
	s_suppkey in (
	    select
		ps_suppkey
	    from
		partsupp
	    where
		ps_partkey in (
		    select
			p_partkey
		    from
			part
		    where
			p_name like 'bisque%'
		)
		and ps_availqty > (
		    select
			0.5 * sum(l_quantity)
		    from
			lineitem
		    where
			l_partkey = ps_partkey
			and l_suppkey = ps_suppkey
			and l_shipdate >= date '1995-01-01'
			and l_shipdate < date '1995-01-01' + interval '1' year
		)
	)
	and s_nationkey = n_nationkey
	and n_name = 'CHINA'
    order by
	s_name;
Where /*+ parallel(n) */ is the parallel query syntax of Oracle, and n is the parallel number.
Script execution time, Unit: seconds
| Number of parallel | 1 | 2 | 4 | 8 | 12 | 
| Oracle | 437 | 307 | 242 | 201 | 175 | 
III. SPL Optimization
This query looks complex with many layers nested, mainly because SQL does not advocate step-by-step operation. If we divide it into several steps, it will be much clearer:
  1. Filter the NATION table according to conditions
  2. Filter the PART table according to conditions
  3. Match and filter the SUPPLIER table with the result of 1 as a foreign key table
  4. Match and filter the PARTSUPP table with the results of 2 and 3 as a foreign key table 
   5. Use the result of 4 as a foreign key table to be joined on the LINEITEM table, and select the PS_AVAILQTY field, then group and aggregate by L_PARTKEY and L_SUPPKEY. As mentioned earlier, after grouping, the sub-query associated with the primary table with equivalent conditions can be rewritten as a joint statement with the primary table. Select L_SUPPKEY that satisfies the conditions after grouping and aggregation.
  6. Use the result of 5 as a foreign key table to filter the SUPPLIER table
The whole process is to generate intermediate foreign key tables to do matching and filtering repeatedly.
The SPL script is as follows:
| A | |
| 1 | =now() | 
| 2 | 1995-1-1 | 
| 3 | =elapse@y(A2,1) | 
| 4 | >partname="bisque" | 
| 5 | >nationname="CHINA" | 
| 6 | =file("nation.btx").import@b().select(N_NAME==nationname).derive@o().keys@i(N_NATIONKEY) | 
| 7 | =file("part.ctx").open().cursor@m(P_PARTKEY;pos@h(P_NAME,partname)).fetch().keys@im(P_PARTKEY) | 
| 8 | =file("supplier.ctx").open().cursor@m(S_SUPPKEY,S_NAME,S_ADDRESS;A6.find(S_NATIONKEY)).fetch().keys@im(S_SUPPKEY) | 
| 9 | =file("partsupp.ctx").open().cursor@m(PS_PARTKEY,PS_SUPPKEY,PS_AVAILQTY;A7.find(PS_PARTKEY),A8.find(PS_SUPPKEY)).fetch().keys@im(PS_PARTKEY,PS_SUPPKEY) | 
| 10 | =file("lineitem.ctx").open().cursor@m(L_PARTKEY,L_SUPPKEY,L_QUANTITY;L_SHIPDATE>=A2 && L_SHIPDATE<A3) | 
| 11 | =A10.join@i(L_PARTKEY:L_SUPPKEY,A9:PS_PARTKEY:PS_SUPPKEY,PS_AVAILQTY) | 
| 12 | =A11.groups@u(L_PARTKEY,L_SUPPKEY,PS_AVAILQTY;sum(L_QUANTITY):quantity) | 
| 13 | =A12.select(PS_AVAILQTY*2>quantity).id(L_SUPPKEY) | 
| 14 | =A8.switch@i(S_SUPPKEY,A13) | 
| 15 | =A14.new(S_NAME,S_ADDRESS).sort@o(S_NAME) | 
| 16 | return interval@ms(A1,now()) | 
Script execution time, Unit: seconds
| Number of parallel | 1 | 2 | 4 | 8 | 12 | 
| Oracle | 437 | 307 | 242 | 201 | 175 | 
| SPL composite table | 128 | 65 | 36 | 21 | 15 | 
SPL Official Website 👉 https://www.esproc.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.esproc.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/sxd59A8F2W
Youtube 👉 https://www.youtube.com/@esProc_SPL
 
            
        