"2 Order- and position-based operations This type of tasks involves sequence number- and position .."

jinxing RaqForum 41 No.
4 View • 1 Months ago

A Field Guide to Querying and Report Computations with SPL - 2 Order- and position-based operations

2 Order- and position-based operations

This type of tasks involves sequence number- and position-based operations. SQL lacks inherent sequence numbers, making it generally necessary to artificially create sequence numbers before performing computations. This makes the code more complex to write, and difficulty further increases when dealing with order-based computations after grouping.

Example 1: Select one sample record every N orders

Source data: Orders table

Target: Sort Orders table by amount and find the 5^th, the 10^th, the 15^th… records.

Picture1png

SPL code:

	A
1	$select * from Orders.txt order by OrderDate
2	=A1.select(# % 5==0)

A2: Select records whose sequence numbers divisible by 5. The percent sign % gets the remainder of a division.

Example 2: Find the most recent client contact for each order before it was successfully signed

Source data: Order status table (2\OrderStatus22.txt)stores multiple statuses of each order on different dates, including client contact (contact), paused ordering (pause), signing the order (sign), and close of the order (close), among which the “sign” status only appears once.

Target: Among each order’s all contact status records, find the one closest to the signing of the order (sign) before the event.

Picture2png

SPL code:

	A
1	$select * from 2\OrderStatus22.txt order by EventDate
2	=A1.group(OrderID)
3	=A2.(~.select@c(Status!="sign").select@z1(Status=="contact"))

A1: Load data while sorting it by date.

A2: Group records by order without aggregation. Each group consist of a set.

Picture3png
A3=A2.(~.select@c(Status!=“sign”)…) processes each group of data: perform filtering operation on the current group, which retrieves members from the first continuously until the member where the status is “sign”.

Picture4png

@c is an option of select() function. It is used to retrieve members starting from the first member based on their positions, continuing until a member that does not satisfy the condition appears.

l Learning point: Function options and position-based operations

Many functions in SPL work with options to extend their functionalities. Position-based member selection is a commonly used functionality for, such as, retrieving the first member, reversing the order of members, getting the first continuous range of members, and so on. Take select() function as an example:

Picture5png
@1: Retrieve the first member from the result set.
@z: Retrieve the last member from the result set.
@c: Retrieve members from the first continuously until the first ineligible member (one that does not satisfy the condition) appears.

…. select@z(Status==“contact”) proceeds to process each group of data: perform filtering to get records where Status value is “close”, and reverse their order.

Picture6png
…@1 continues to process each group of data: retrieve the first member.

Picture7png

Example 3: Apply trial product distribution plan to all clients of each salesperson

Source data: The trial product plan table (TrialProductPlan.txt) is used for reporting, where salesperson and client act as the grouping fields while trial product and distribution time are details. For one salesperson, they need to send each trial product they are responsible for to every one of their clients. Therefore, the number of detail records under each salesperson is the same, though it may vary for different salespeople. Sort records by the first three fields, fill in the scheduled distribution dates of the trial products in the details of the last client under each salesperson, and the plan is finalized.

Target: To apply trial product distribution plan to all clients of each salesperson means to update or copy the distribution dates for each salesperson’s last client to the all its other clients in order.

Picture8png

SPL code:

	A
1	$select * from TrialProductPlan.txt order by SellerId,Client,TrialProduct
2	=A1.group(SellerId).(~.group(Client))
3	=A2.(last=~.m(-1).(PlanDate),~.(~.(PlanDate=last(#))))
4	return A1

A1: Load data while perform sorting by the first three fields.

Picture9png
A2: Group records by salesperson, and perform the second grouping on each group by client without aggregation. Each group or subgroup is a set. The group() function performs grouping operation, where ~ represents the current group. Click record of the first salesperson and the details are as follows:

Picture10png
A3=A2.(last=~.m(-1).(PlanDate), …) processes each group by loop: retrieve the sequence of PlanDate field values from the current group’s last subgroup, and name it “last”. Below is sequence “last” of the first group:

Picture11png

The m() function gets members according to their sequence numbers. -1 represents the last member.

l Learning point: m() function

The m() function offers multiple functionalities for retrieving members at the specified positions in a set, and forms a new set. The function’s complete syntax is A.m(a:b,c,d:e), where A is a set of length n and where -n<=a<=n and a isn’t equal to 0. When 1<=a<=n, a is the a^th member; and when -n<=a<=-1, a is the a^th member from the bottom. The rule is same for interpreting the value of b, c, d, and e. The syntax is often used to retrieve members of a sequence from back to front. In the sequence, a and c must be placed to the left of b and e respectively. When the parameter to the left of the colon is absent, the corresponding value is by default 1; and when the parameter to the right of the colon is absent, the corresponding value is by default -1.

Picture12png
(…, _.(.(PlanDate =last(#)))) proceeds to process each record of every subgroup in the current group by updating each PlanDate value to the value at the corresponding position in sequence “last”. “~.(PlanDate)” retrieves a sequence of field values according to client; “last(#)”, the abbreviation of “last.m(#)”, retrieves a member by the sequence number; and the number sign # represents the current sequence number. Once executed, A1 is updated, and its data is shown below:

Picture13png

The above code modifies each subgroup in each group. Actually there is no need to update the last subgroup, which means the m()function just needs to get subgroups starting from the first to the second to last. The code is ~.m(1:-2), which can also be written as ~.m(:-2).

A4: Return the updated A1.