From SQL to SPL: Aggregate according to time interval

A certain database table stores time-series data, with intervals of several seconds between each record.

dt

payload

2024-01-01 12:00:13

0

2024-01-01 12:00:28

9

2024-01-01 12:00:39

2

2024-01-01 12:00:44

3

2024-01-01 12:00:53

0

2024-01-01 12:01:00

6

2024-01-01 12:01:19

0

2024-01-01 12:01:26

4

2024-01-01 12:01:27

2

2024-01-01 12:01:49

6

2024-01-01 12:02:09

5

2024-01-01 12:02:32

6

2024-01-01 12:02:59

7

2024-01-01 12:03:18

5

2024-01-01 12:03:25

4

2024-01-01 12:03:28

5

2024-01-01 12:03:30

3

2024-01-01 12:03:41

1

2024-01-01 12:04:23

9

2024-01-01 12:04:32

7

2024-01-01 12:04:34

5

2024-01-01 12:04:43

4

2024-01-01 12:04:56

7

2024-01-01 12:05:18

7

2024-01-01 12:05:33

3

2024-01-01 12:05:39

5

2024-01-01 12:05:54

1

2024-01-01 12:05:56

8

2024-01-01 12:06:11

7

2024-01-01 12:06:40

2

2024-01-01 12:06:47

5

2024-01-01 12:06:51

6

2024-01-01 12:06:58

5

2024-01-01 12:07:06

3

2024-01-01 12:07:10

5

2024-01-01 12:07:27

0

2024-01-01 12:07:33

1

2024-01-01 12:07:38

3

2024-01-01 12:08:04

2

2024-01-01 12:08:11

4

2024-01-01 12:08:21

1

2024-01-01 12:08:44

1

2024-01-01 12:08:55

3

2024-01-01 12:09:01

4

2024-01-01 12:09:05

0

2024-01-01 12:09:12

2

2024-01-01 12:09:28

8

2024-01-01 12:09:38

5

2024-01-01 12:10:14

3

2024-01-01 12:10:25

2

2024-01-01 12:10:30

7

2024-01-01 12:10:39

1

2024-01-01 12:10:53

6

2024-01-01 12:11:15

5

2024-01-01 12:11:25

6

2024-01-01 12:11:41

1

2024-01-01 12:11:53

8

2024-01-01 12:11:59

5

2024-01-01 12:12:03

9

2024-01-01 12:12:09

5

2024-01-01 12:12:28

8

2024-01-01 12:12:34

2

2024-01-01 12:12:53

1

2024-01-01 12:13:13

3

2024-01-01 12:13:14

8

2024-01-01 12:13:19

8

2024-01-01 12:13:48

3

2024-01-01 12:14:06

5

2024-01-01 12:14:18

1

2024-01-01 12:14:36

3

2024-01-01 12:14:40

4

2024-01-01 12:14:55

3

Now we need to do a group and aggregation every minute, summarizing data for 5 minutes each time. For example, generate three records in minutes 1, 2, and 3, and summarize the data for minutes 1-5, 2-6, and 3-7 respectively.

From

To

payload

2024-01-01 12:00:00

2024-01-01 12:05:00

100

2024-01-01 12:01:00

2024-01-01 12:06:00

110

2024-01-01 12:02:00

2024-01-01 12:07:00

117

2024-01-01 12:03:00

2024-01-01 12:08:00

111

2024-01-01 12:04:00

2024-01-01 12:09:00

104

2024-01-01 12:05:00

2024-01-01 12:10:00

91

2024-01-01 12:06:00

2024-01-01 12:11:00

86

2024-01-01 12:07:00

2024-01-01 12:12:00

86

2024-01-01 12:08:00

2024-01-01 12:13:00

99

2024-01-01 12:09:00

2024-01-01 12:14:00

110

2024-01-01 12:10:00

2024-01-01 12:15:00

107

SQL

SELECT
    [From], DATEADD(MINUTE, 1, [To]) [To], payload
FROM (
    SELECT
        dt, MIN(dt) OVER(ORDER BY dt ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) [From],
        dt [To], SUM(payload) OVER(ORDER BY dt ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) payload
    FROM (
        SELECT
            DATEADD(MINUTE, DATEDIFF(MINUTE, 0, dt), 0) dt, 
SUM(payload) payload
        FROM #tmstmp
        GROUP BY DATEADD(MINUTE, DATEDIFF(MINUTE, 0, dt), 0)
    ) q
) q
WHERE DATEDIFF(MINUTE, [From], [To]) > 3

SQL needs to implement it using nested subqueries and multiple window functions, which makes the code cumbersome. SPL provides syntax for directly accessing positions.

https://try.esproc.com/splx?4ql


 A

1

$select * from tmstmp.txt

2

=A1.groups(datetime@m(dt):dt;sum(payload):payload)

3

=A2.new(dt:From, elapse@s(From,300):To, payload[0:4].sum(): payload)

4

=A3.m(:-5)

A1 Load data.

A2 Group and aggregate by minute.

A3 Generate a new two-dimensional table, where To is taken from the current record, From is taken from 5 minutes after the current record, and the payload summarizes the interval from the current record to the 4th record.

A4 Take from the first item backward to the fifth item.

Question sourcehttps://stackoverflow.com/questions/78152707/get-sum-for-each-5-minute-time-interval