"Problem description A table stores multiple time intervals for each account, and the intervals m .."

lisongbo RaqForum 82 No.
1 Reply • 5 View • 1 Days ago

SQLazy: Merging Overlapping Time Intervals

Problem description

A table stores multiple time intervals for each account, and the intervals may overlap.

Source data:

account_id	start_date	end_date
A	2019-06-20	2019-06-29
A	2019-06-25	2019-07-25
A	2019-07-20	2019-08-26
A	2019-12-25	2020-01-25
A	2021-04-27	2021-07-27
A	2021-06-25	2021-07-14
A	2021-07-10	2021-08-14
A	2021-09-10	2021-11-12
B	2019-07-13	2020-07-14
B	2019-06-25	2019-08-26

Expected output (merge overlapping intervals for each account):

account_id	start_date	end_date
A	2019-06-20	2019-08-26
A	2019-12-25	2020-01-25
A	2021-04-27	2021-08-14
A	2021-09-10	2021-11-12
B	2019-06-25	2020-07-14

Take account A as an example:

The first three intervals (6/20-6/29, 6/25-7/25, 7/20-8/26) overlap and can be merged into 6/20 - 8/26;
The nonoverlapping interval 12/25 - 1/25;
The overlapping intervals (4/27-7/27, 6/25-7/14, 7/10-8/14) can be merged into 4/27-8/14;
The nonoverlapping interval 9/10-11/12.

Apply the same merging operations to account B: merge any intervals that overlap.

Step-by-step implementation with SQLazy

Key approach: Check whether the current interval’s start date is later than the maximum end date among all previous intervals. If it is, the current interval does not overlap with any of the previous intervals and a new group should be started; otherwise, merge it into the current group.

Name	Anchor	Statement
t1	acc	sort account_id, start_date
t2		compute end_date[:-1] max as prev_max; partition account_id
t3		segment condition start_date > prev_max as gid; partition account_id
t4		summarize start_date min as start_date, end_date max as end_date; group account_id, gid
		derive delete gid

[Run this example online]

Step 1: Sort rows by account and the start date

sort account_id, start_date

Sort rows by account_id and start_date in ascending order, ensuring intervals within each account are processed in chronological order.

Step 2: Calculate the maximum end date among all rows before the current row

compute end_date[:-1] max as prev_max; partition account_id

Within each account partition, calculate the maximum end_date value among all rows from the first row up to, but excluding, the current row, and record it as prev_max.

Step 3: Segment rows based on the condition and assign group numbers

segment condition start_date > prev_max as gid; partition account_id

Check each row in sequence: if start_date > prev_max, the current interval does not overlap with any of the previous intervals and a new group is started (gid+1); otherwise, the row is assigned to the current group.

Step 4: Aggregate by account and group number

summarize start_date min as start_date, end_date max as end_date; group account_id, gid

For each group, extract the earliest start date and the latest end date to form a merged interval.

Finally, remove the helper column gid.

Compile the steps into SQL

Once the above steps are complete and verified, SQLazy’s compiler can automatically generate the equivalent native SQL (using MySQL as an example):

WITH t2 AS (
    SELECT
        account_id,
        start_date,
        end_date,
        MAX(end_date) OVER (
            PARTITION BY account_id
            ORDER BY
                CASE WHEN account_id IS NULL THEN 1 ELSE 0 END,
                account_id ASC,
                CASE WHEN start_date IS NULL THEN 1 ELSE 0 END,
                start_date ASC
            ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
        ) AS prev_max
    FROM
        acc
),
t3 AS (
    SELECT
        account_id,
        start_date,
        end_date,
        prev_max,
        1 + SUM(
            CASE
                WHEN start_date > prev_max THEN 1
                ELSE 0
            END
        ) OVER (
            PARTITION BY account_id
            ORDER BY
                CASE WHEN account_id IS NULL THEN 1 ELSE 0 END,
                account_id ASC,
                CASE WHEN start_date IS NULL THEN 1 ELSE 0 END,
                start_date ASC
        ) AS gid
    FROM
        t2
)
SELECT
    account_id,
    MIN(start_date) AS start_date,
    MAX(end_date) AS end_date
FROM
    t3
GROUP BY
    account_id,
    gid
ORDER BY
    account_id,
    start_date;

You only need to verify the logic of each of the four steps – no need to understand or debug the SQL – and the compiler will generate the production-ready code.

Why SQLazy is more efficient

	Conventional SQL (Manual coding)	SQLazy
Steps	Multi-level CTE + window functions	Four steps
Core technique	Need to understand “ROWS BETWEEN UNBOUNDED PRECEDING AND CASE WHEN CUMULATION”	Use [:-1] directly to represent “all preceding rows”, and use “segment” to represent the grouping condition
Readability	Low – nested window functions are difficult to understand	High – each step corresponds to a well-defined business action
Debug approach	Manually break down CTEs and run them repeatedly	Step-by-step execution with visible intermediate results at each step
Cross-database portability	Manually adapt code for different SQL dialects	The compiler automatically generates code in the target SQL dialect

SQLazy lets you describe logic using business language, rather than writing nested queries in SQL syntax. This example of “merge overlapping intervals” is expressed clearly in just 4 steps – sort, calculate the maximum end date among the preceding rows, segment rows by condition, and aggregate. The compiler generates the final SQL for you, and you only need to verify the business meaning of each step.

Try SQLazy online: sqlazy.com (Free to use, signup not required)
SQLazy project repository: github.com/SPLWare/SQLazy

SPL Official Website 👉 https://www.esproc.com

SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL

SPL Learning Material 👉 https://c.esproc.com

SPL Source Code and Package 👉 https://github.com/SPLWare/esProc

Discord 👉 https://discord.gg/sxd59A8F2W

Youtube 👉 https://www.youtube.com/@esProc_SPL

Promote

lisongbo • 5 View • 1 Days ago

SQLazy: Merging Overlapping Time Intervals

Problem description

Step-by-step implementation with SQLazy

ToC