Learn performance optimization skills from TPCH tests - Q11
I Query Requirement
Q11 queries the value of parts supplied by a certain country in the inventory.
Q11 is characterized by multi-table join query operation with grouping, sorting, aggregation and sub-query operation. Subqueries are located in the HAVING condition of grouping operations.
II Oracle Execution
The query SQL written in Oracle is as follows:
select /*+ parallel(n) */
ps_partkey,
sum(ps_supplycost * ps_availqty) as value
from
partsupp,
supplier,
nation
where
ps_suppkey = s_suppkey
and s_nationkey = n_nationkey
and n_name = 'CHINA'
group by
ps_partkey
having
sum(ps_supplycost * ps_availqty) > (
select
sum(ps_supplycost * ps_availqty) * 0.000001
from
partsupp,
supplier,
nation
where
ps_suppkey = s_suppkey
and s_nationkey = n_nationkey
and n_name = 'CHINA'
)
order by
value desc;
Where /*+ parallel(n) */ is the parallel query syntax of Oracle, and n is the parallel number.
Script execution time, Unit: seconds
Number of parallel | 1 | 2 | 4 | 8 | 12 |
Oracle | 71 | 52 | 43 | 36 | 33 |
III SPL Optimization
Let's regard the following sub-query as view V:
select
ps_partkey,
sum(ps_supplycost * ps_availqty) as value
from
partsupp,
supplier,
nation
where
ps_suppkey = s_suppkey
and s_nationkey = n_nationkey
and n_name = 'CHINA'
Then the original primary query equals to
select
ps_partkey,
value
from V
where value>0.000001*(select sum(value) from V)
This V is already a grouped result with a relative small amount, so the calculation of traversing V is much less than that of traversing partsuppand directly.
And we also know that the partsupp table is ordered by the primary key ps_partkey and ps_suppkey, that is, ordered to ps_partkey. The ordered grouping method can be used here to group by this field to improve the performance of computing V.
The SPL script is as follows:
A | |
1 | =now() |
2 | >name="CHINA" |
3 | >percent=0.000001 |
4 | =file("nation.btx").import@b().select(N_NAME== name).derive@o().keys@i(N_NATIONKEY) |
5 | =file("supplier.ctx").open().cursor@m(S_SUPPKEY;A4.find(S_NATIONKEY)).fetch().keys@i(S_SUPPKEY) |
6 | =file("partsupp.ctx").open().cursor@m(PS_PARTKEY,PS_AVAILQTY,PS_SUPPLYCOST;A5.find(PS_SUPPKEY)) |
7 | =A6.groups@o(PS_PARTKEY;sum(PS_SUPPLYCOST*PS_AVAILQTY):value) |
8 | =A7.sum(value)*percent |
9 | =A7.select(value>A8).sort@z(value) |
10 | return interval@ms(A1,now()) |
A7 performs ordered grouping with groups@o, which is equivalent to calculating view V. Then A8 and A9 traverse A7 twice to calculate the result.
Script execution time, Unit: seconds
Number of parallel | 1 | 2 | 4 | 8 | 12 |
Oracle | 71 | 52 | 43 | 36 | 33 |
SPL composite table | 24 | 15 | 9 | 6 | 5 |
SPL Official Website 👉 https://www.esproc.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.esproc.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/sxd59A8F2W
Youtube 👉 https://www.youtube.com/@esProc_SPL
Chinese Version