From SQL to SPL: Aggregate according to time interval
A certain database table stores time-series data, with intervals of several seconds between each record.
dt | payload |
2024-01-01 12:00:13 | 0 |
2024-01-01 12:00:28 | 9 |
2024-01-01 12:00:39 | 2 |
2024-01-01 12:00:44 | 3 |
2024-01-01 12:00:53 | 0 |
2024-01-01 12:01:00 | 6 |
2024-01-01 12:01:19 | 0 |
2024-01-01 12:01:26 | 4 |
2024-01-01 12:01:27 | 2 |
2024-01-01 12:01:49 | 6 |
2024-01-01 12:02:09 | 5 |
2024-01-01 12:02:32 | 6 |
2024-01-01 12:02:59 | 7 |
2024-01-01 12:03:18 | 5 |
2024-01-01 12:03:25 | 4 |
2024-01-01 12:03:28 | 5 |
2024-01-01 12:03:30 | 3 |
2024-01-01 12:03:41 | 1 |
2024-01-01 12:04:23 | 9 |
2024-01-01 12:04:32 | 7 |
2024-01-01 12:04:34 | 5 |
2024-01-01 12:04:43 | 4 |
2024-01-01 12:04:56 | 7 |
2024-01-01 12:05:18 | 7 |
2024-01-01 12:05:33 | 3 |
2024-01-01 12:05:39 | 5 |
2024-01-01 12:05:54 | 1 |
2024-01-01 12:05:56 | 8 |
2024-01-01 12:06:11 | 7 |
2024-01-01 12:06:40 | 2 |
2024-01-01 12:06:47 | 5 |
2024-01-01 12:06:51 | 6 |
2024-01-01 12:06:58 | 5 |
2024-01-01 12:07:06 | 3 |
2024-01-01 12:07:10 | 5 |
2024-01-01 12:07:27 | 0 |
2024-01-01 12:07:33 | 1 |
2024-01-01 12:07:38 | 3 |
2024-01-01 12:08:04 | 2 |
2024-01-01 12:08:11 | 4 |
2024-01-01 12:08:21 | 1 |
2024-01-01 12:08:44 | 1 |
2024-01-01 12:08:55 | 3 |
2024-01-01 12:09:01 | 4 |
2024-01-01 12:09:05 | 0 |
2024-01-01 12:09:12 | 2 |
2024-01-01 12:09:28 | 8 |
2024-01-01 12:09:38 | 5 |
2024-01-01 12:10:14 | 3 |
2024-01-01 12:10:25 | 2 |
2024-01-01 12:10:30 | 7 |
2024-01-01 12:10:39 | 1 |
2024-01-01 12:10:53 | 6 |
2024-01-01 12:11:15 | 5 |
2024-01-01 12:11:25 | 6 |
2024-01-01 12:11:41 | 1 |
2024-01-01 12:11:53 | 8 |
2024-01-01 12:11:59 | 5 |
2024-01-01 12:12:03 | 9 |
2024-01-01 12:12:09 | 5 |
2024-01-01 12:12:28 | 8 |
2024-01-01 12:12:34 | 2 |
2024-01-01 12:12:53 | 1 |
2024-01-01 12:13:13 | 3 |
2024-01-01 12:13:14 | 8 |
2024-01-01 12:13:19 | 8 |
2024-01-01 12:13:48 | 3 |
2024-01-01 12:14:06 | 5 |
2024-01-01 12:14:18 | 1 |
2024-01-01 12:14:36 | 3 |
2024-01-01 12:14:40 | 4 |
2024-01-01 12:14:55 | 3 |
Now we need to do a group and aggregation every minute, summarizing data for 5 minutes each time. For example, generate three records in minutes 1, 2, and 3, and summarize the data for minutes 1-5, 2-6, and 3-7 respectively.
From | To | payload |
2024-01-01 12:00:00 | 2024-01-01 12:05:00 | 100 |
2024-01-01 12:01:00 | 2024-01-01 12:06:00 | 110 |
2024-01-01 12:02:00 | 2024-01-01 12:07:00 | 117 |
2024-01-01 12:03:00 | 2024-01-01 12:08:00 | 111 |
2024-01-01 12:04:00 | 2024-01-01 12:09:00 | 104 |
2024-01-01 12:05:00 | 2024-01-01 12:10:00 | 91 |
2024-01-01 12:06:00 | 2024-01-01 12:11:00 | 86 |
2024-01-01 12:07:00 | 2024-01-01 12:12:00 | 86 |
2024-01-01 12:08:00 | 2024-01-01 12:13:00 | 99 |
2024-01-01 12:09:00 | 2024-01-01 12:14:00 | 110 |
2024-01-01 12:10:00 | 2024-01-01 12:15:00 | 107 |
SQL:
SELECT
[From], DATEADD(MINUTE, 1, [To]) [To], payload
FROM (
SELECT
dt, MIN(dt) OVER(ORDER BY dt ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) [From],
dt [To], SUM(payload) OVER(ORDER BY dt ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) payload
FROM (
SELECT
DATEADD(MINUTE, DATEDIFF(MINUTE, 0, dt), 0) dt,
SUM(payload) payload
FROM #tmstmp
GROUP BY DATEADD(MINUTE, DATEDIFF(MINUTE, 0, dt), 0)
) q
) q
WHERE DATEDIFF(MINUTE, [From], [To]) > 3
SQL needs to implement it using nested subqueries and multiple window functions, which makes the code cumbersome. SPL provides syntax for directly accessing positions.
https://try.esproc.com/splx?4ql
A | |
1 | $select * from tmstmp.txt |
2 | =A1.groups(datetime@m(dt):dt;sum(payload):payload) |
3 | =A2.new(dt:From, elapse@s(From,300):To, payload[0:4].sum(): payload) |
4 | =A3.m(:-5) |
A1: Load data.
A2: Group and aggregate by minute.
A3: Generate a new two-dimensional table, where To is taken from the current record, From is taken from 5 minutes after the current record, and the payload summarizes the interval from the current record to the 4th record.
A4: Take from the first item backward to the fifth item.
Question source:https://stackoverflow.com/questions/78152707/get-sum-for-each-5-minute-time-interval
SPL Official Website 👉 https://www.esproc.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.esproc.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/sxd59A8F2W
Youtube 👉 https://www.youtube.com/@esProc_SPL