A Field Guide to Querying and Report Computations with SPL - 3 Order-based neighboring member reference

 

This type of tasks involves accessing a member at another position or interval relative to the current member during a traversal operation or a loop operation. The action is the more complicated order-based computation. To do this, SQL uses a window function (lag/head) to get the member at a relative position, usually involving a subquery and resulting in complex code. When the relative position is in a grouped subset, the code becomes more difficult to write.

Example 1: Compute the maximum MoM increase

Source data: Order amounts of the twelve months of one year sorted in chronological order.

Target: Compute MoM increase for each month and get the maximum one.

Picture1png

SPL code:


A

1

$select month(OrderDate) as month,sum(Amount) as subTotal from Orders.txt where year(OrderDate)=2022 group by month(OrderDate)

2

=A1.(if(#>1,subTotal - subTotal[-1],0)).max()

A1: Load data.

Picture2png

A2=A1.(if(#>1,subTotal - subTotal[-1],0))… loops through each record of A1. During the process, if the current record isn’t first one, the current MoM increase = amount of the current record - amount of the preceding record; if the current is the first one, its MoM increase is 0. ~[-1].subTotal uses the syntax for referencing a member at a relative position to get amount of the preceding record, and is abbreviated as subTotal[-1].

Picture3png
 Learning point: Relative position
SPL uses the brackets to represent a relative position in the form of ~[n] or F[n].

Picture4png
~: The current member/record in a loop function.
F: A field of the current record in a loop function.
n: The sequence number relative to that of the current member. When n>0, it represents the nth member after the current one; when n<0, it represents the nth member before the current one; and when n=0, it is the current member and can be omitted.

…max() finds the maximum value and, in this example, returns 6094.9.

Picture5png

Example 2: Compute the cumulative total of large orders in each department

Source data: Orders table, Employee table

Target: Orders whose amounts are above 2000 are regarded as large ones, and those below the number are small orders. Here the cumulative total of large orders refers to, after records are grouped by department and each group is sorted by date, the cumulative value of the amounts starting from the first order or the small order directly after the preceding large order to the current large order. The specific task is to find every large order and their cumulative amount.

Picture6png

SPL code:


A

1

$select o.OrderId,o.Amount,o.OrderDate,e.Dept,null as SubTotal from Orders.txt o,Employee.txt e where o.SellerId=e.EId order by e.Dept,o.OrderDate

2

=A1.run(SubTotal+=if(Dept==Dept[-1] && Amount[-1]<=2000, SubTotal[-1]+Amount, Amount))

3

=A2.select(Amount>2000)

A1: Load data by creating association between Orders table and Department table, sorting records by department and date, and adding a new computed column named SubTotal that stores cumulative values and whose initial value is null.

Picture7png
A2: The run()function modifies each record of A1 by loop and returns the modified table. If department is the same and the preceding order is a small one, use SubTotal to compute the cumulative amount, which is SubTotal + the current order amount; otherwise (if the department is different or the preceding order is a large one), reset the SubTotal as the current order amount. [-1] represents the preceding record.

Picture8png

A3: Select records of large orders.

Example 3: Get orders whose total amount is above 5000 within an interval from the current record’s preceding one to its next one

Data source: Orders table, Department table

Target: Group records by department, sort each group by order date, based on each order get a set of records consisting of the current record’s preceding one, the current record, and the current’s next one, compute total amount in these records, and find records of orders based on which the total amount within the set is above 5000.

Picture9png

SPL code:


A

1

$select o.OrderId,o.Amount,o.OrderDate,e.Dept from Orders.txt o,Employee.txt e where o.SellerId=e.EId order by e.Dept,o.OrderDate

2

=A1.group(Dept)

3

=A2.(~.select(Amount[-1:1].sum()>5000))

4

=A3.conj()

A1: Load data, during which association between Orders table and Department table is established and records are sorted by department and order date.

A2: Group records without aggregation. Each group is a set. Below shows data in the first group:

Picture10png
A3: Process each group of data – perform filtering on the current group to get orders where the total amount exceeds 5000 within a set of consecutive records, which is defined by intervals of length 1 before and after the current record. Amount[-1:1] uses the syntax representing an interval of values at relative positions. Here it means a set of Amount field values from that of the current record’s preceding one to that of its next record.

Picture11png
 Learning point: Relative intervals
SPL uses ~[a:b] or F[a:b] to represent a set of members in a relative interval.

Picture12png
~: The current member/record in a loop function.
F: A field of the current record in a loop function.
a,b: The starting position and ending position of an interval. a is an integer; when it is a negative number, it represents a position counted backward. The rule also applies to b, and a<=b. When a is absent, the default starting position is the first record; when b is absent, the default is ending position is the last record. Note that though both ~[0:0] and ~[0] have one record, they have different data types; the former is a set of record, while the latter is a record.

A4: Concatenate members of all groups.

Picture13png

Extended reading

From SQL to SPL: Find the closest date match for each record from two tables

From SQL to SPL: Search for the closest matching record within the group

From SQL to SPL: Change duplicate contents to NULL

How to copy the field values within a subgroup in order to other subgroups with esProc

Distinctive Positional Operations

From SQL to SPL: Calculate a pair of minimum values that meet the criteria within the group

From SQL to SPL: Calculate based on the records within the group and fill the result into the first record