How to set duplicate content in a dataset to null with esProc
The first two fields of a certain database table may have duplicate values, such as the first three records below.
Column_A |
Column_B |
Column_C |
1 |
AB |
amount1 |
1 |
AB |
amount2 |
1 |
AB |
amount3 |
2 |
OA |
amount4 |
3 |
OE |
amount5 |
3 |
OE |
amount6 |
4 |
DB |
amount7 |
Now we need to change all duplicate values to null. In other words, after grouping by the first two fields (equivalent to grouping by one of the fields), only the first record in the group remains unchanged, and the first two fields of other records are changed to null.
The calculation result is as follows:
Column_A |
Column_B |
Column_C |
1 |
AB |
amount1 |
amount2 |
||
amount3 |
||
2 |
OA |
amount4 |
3 |
OE |
amount5 |
amount6 |
||
4 |
DB |
amount7 |
After SQL grouping, it must aggregate immediately, and the grouped subsets cannot be kept for further calculation. SQL also does not have natural row numbers within the group, making the code difficult to write.
esProc provides a rich set of calculation functions that can keep grouped subsets for further calculation, with natural row numbers, including row numbers within the group:
https://try.esproc.com/splx?3Wp
A |
|
1 |
$select * from table_name.txt |
2 |
=A1.group(Column_A) |
3 |
=A2.run(~.(if(#!=1,Column_A=Column_B=null))) |
4 |
=A3.conj() |
A1: Load data.
A2: Group by the first field using the group function, but do not aggregate.
Each group is a set that can be expanded by clicking, as shown in the first two groups.
A3: Use the run function to modify the data of each group. When the member’s index in the group is greater than 1, change the first two fields to null. ~ represents the current group, # represents the sequence number within the group.
A4: Merge the groups.
The step-by-step code above is beneficial for learning and debugging. Once proficient, A2-A4 can be combined into one statement:
=A1.group(Column_A).run(~.(if(#!=1,Column_A=Column_B=null))).conj()
SPL Official Website 👉 https://www.esproc.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.esproc.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/2bkGwqTj
Youtube 👉 https://www.youtube.com/@esProc_SPL