From SQL to SPL: Getting the available ranges from two tables
There are two tables in a certain database. The original inventory table data_add stores multiple batches of inventory for multiple items. Each batch of inventory has a starting number START_NUM and an ending number END_NUM, representing the range of consecutive numbers.
ID |
ITEM_ID |
START_NUM |
END_NUM |
1 |
337 |
101 |
400 |
2 |
337 |
500 |
800 |
3 |
337 |
801 |
1200 |
4 |
337 |
1500 |
1600 |
5 |
337 |
15000 |
16000 |
6 |
337 |
20000 |
30000 |
7 |
444 |
20 |
30 |
The consumption table data_cons stores multiple batches of consumption for multiple items, each batch of consumption also has a continuous number range.
ID |
ITEM_ID |
START_NUM |
END_NUM |
1 |
337 |
240 |
300 |
2 |
337 |
301 |
400 |
3 |
337 |
850 |
1100 |
4 |
337 |
1500 |
1510 |
5 |
337 |
15000 |
16000 |
Now we need to calculate the current inventory of each item, which is the original inventory minus consumption. Note that each batch is calculated separately, and the two consecutive batches are also calculated separately, such as the second and third records; The continuous numbering range of the original inventory may be consumed into discontinuous multi segment numbering, in which case multiple records need to be generated, with each record corresponding to a continuous numbering segment.
ITEM_ID |
START_NUM |
END_NUM |
337 |
101 |
239 |
337 |
500 |
800 |
337 |
801 |
849 |
337 |
1101 |
1200 |
337 |
1511 |
1600 |
337 |
20000 |
30000 |
444 |
20 |
30 |
with merged_cons(item_id, start_num, end_num) AS (
SELECT * FROM data_cons
MATCH_RECOGNIZE (
PARTITION BY item_id
ORDER BY start_num, end_num
MEASURES FIRST(start_num) AS start_num, LAST(end_num) AS end_num
PATTERN( merged* strt )
DEFINE
merged AS MAX(end_num) + 1 >= NEXT(start_num)
)
)
, intersections(id, item_id, start_before, end_before, start_after, end_after) AS (
SELECT a.id, a.item_id, /*a.start_num AS add_start, a.end_num AS add_end,
b.start_num AS cons_start, b.end_num AS cons_end, */
CASE WHEN a.start_num < b.start_num - 1 THEN a.start_num END AS start_before,
CASE WHEN a.start_num < b.start_num - 1 THEN b.start_num - 1 END AS end_before,
CASE WHEN b.end_num + 1 < a.end_num THEN b.end_num + 1 END AS start_after,
CASE WHEN b.end_num + 1 < a.end_num THEN a.end_num END AS end_after
FROM data_add a
JOIN merged_cons b
ON a.item_id = b.item_id AND LEAST(a.end_num, b.end_num) >= GREATEST(a.start_num, b.start_num)
)
SELECT item_id, start_before as start_num, end_before as end_num
FROM intersections WHERE start_before IS NOT NULL
UNION ALL
SELECT item_id, start_after as start_num, end_after as end_num
FROM intersections WHERE start_after IS NOT NULL
UNION ALL
SELECT item_id, start_num, end_num
FROM data_add d
WHERE NOT EXISTS(SELECT 1 FROM intersections i WHERE i.id = d.id);
SQL requires the use of MATCH_RECOGNIZE statements and multiple subqueries to indirectly implement set calculation, which is complex in code and difficult to understand. SPL can use variables to represent sets and directly perform set related calculations:
https://try.esproc.com/splx?3xM
A |
B |
|
1 |
$select * from data_add.txt |
$select * from data_cons.txt |
2 |
=A1.group(ITEM_ID;~.(to(START_NUM,END_NUM)):a) |
=B1.group(ITEM_ID;~.conj(to(START_NUM,END_NUM)):b) |
3 |
=A2.join(ITEM_ID,B2,b) |
|
4 |
=A3.derive(a.([~, b].merge@d()).(~.group@i(~!=~[-1]+1)).conj():diff) |
|
5 |
=A4.news(diff; ITEM_ID, ~1:START_NUM, ~.m(-1):END_NUM) |
A1-B1: Load data.
A2: Group the original inventory by item and convert each batch within the group into a continuous sequence small set, without aggregation. B2: Perform similar calculation to the consumption table and merge multiple small sets within the group into one large set. ~ indicates the current group or current member.
A3: Left join.
A4: Add calculated column diff: Calculate the difference set between each small set of the original inventory and the large set of the consumption table, group each difference set by condition, and assign the consecutive sequence to the same group. The function group is used for grouping, and by default is equivalence grouping, @i represents grouping by condition, and ~[-1] represents the previous member. Function merge@d merges ordered sets and calculate the difference set.
A5: Generate a record for each group in the diff field of each A4 record. The function news can expand a record into multiple records. ~.m(-1) represents the last member in the sequence ~.
Question source:https://stackoverflow.com/questions/78070892/getting-the-available-ranges-from-two-tables
SPL Official Website 👉 https://www.esproc.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.esproc.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/2bkGwqTj
Youtube 👉 https://www.youtube.com/@esProc_SPL