How to retrieve the start time of the next group from the event table with esProc

After sorting the event table of a certain database table by timestamp, adjacent value fields may sometimes be continuously identical.

id

value

timestamp

1

1

2023-11-10 13:00:00

2

2

2023-11-11 13:00:00

3

2

2023-11-12 13:00:00

4

1

2023-11-13 13:00:00

5

1

2023-11-14 13:00:00

6

1

2023-11-15 13:00:00

7

2

2023-11-16 13:00:00

8

2

2023-11-17 13:00:00

9

1

2023-11-18 13:00:00

Now we need to group adjacent records with the same value, extract the start time of this group and the start time of the next group as the start and end time of this group, and form a new two-dimensional table. The starting time for the next group of the last group is agreed to be "9999-12-31 00:00:00".

id

value

effective_from

effective_to

1

1

2023-11-10 13:00:00

2023-11-11 13:00:00

2

2

2023-11-11 13:00:00

2023-11-13 13:00:00

4

1

2023-11-13 13:00:00

2023-11-16 13:00:00

7

2

2023-11-16 13:00:00

2023-11-18 13:00:00

9

1

2023-11-18 13:00:00

9999-12-31 00:00:00

SQL does not directly support grouping same adjacent values into one group, and is difficult to retain the groups for further calculation, and the indirectly implemented code is very complex. SPL supports grouping adjacent data, and can retain the grouped subsets for further calculation:

https://try.esproc.com/splx?3st


 A

1

$select * from equipments_staging.csv order by timestamp

2

=A1.group@o(value)

3

=A2.new(id,value, timestamp:effective_from, ifn(~[1].timestamp,datetime("9999-12-31 00:00:00")):effective_to)

A1: Load data and sort by timestamp.

A2: Divide adjacent records with the same value into one group, with each group being a set. The function group is used for grouping, but not aggregate. By default, it compares the values of the entire column, that is, equivalence grouping. @o represents comparing adjacent values and belongs to ordered grouping. The first three groups are shown in the figure:

Picture1png

A3: Create a new two-dimensional table and process each group of data in A2 into one new record. The id, value, and efficient_from are taken from the first record in the current group, and efficient_to is taken from the first record in the next group. When efficient_to is null, the value is 9999-12-31 00:00:00.

The complete code for effective_to is ~[1](1).timestamp, abbreviated as ~[1].timestamp, where ~ represents the current group and can be omitted when used alone, but cannot be omitted when expressing relative position and other meanings; [1] indicate the next group in relative position; (1) represents the first member and can be omitted.

The function ifn takes the first non-null member from the parameter.

Picture2png