From SQL to SPL: Summarize row counts by dynamic columns into json

Each column of database table _temp_data can be viewed as an array, with duplicate members:

hobbies_1

hobbies_2

A

E

B

F

A

A

A

F

Now we need to group and count each column separately, in JSON format:

hobbies_1

hobbies_2

{"A":3,"B":1}

{"A":1,"E":1,"F":2}

SQL:
With Grouped AS (
    SELECT
      td.hobbies_1,
      td.hobbies_2,
      COUNT(*) AS count
    FROM _temp_data td
    GROUP BY GROUPING SETS (
      (td.hobbies_1),
      (td.hobbies_2)
    )
)
SELECT
  jsonb_object_agg(g.hobbies_1, g.count) FILTER (WHERE g.hobbies_1 IS NOT NULL) AS hobbies_1,
  jsonb_object_agg(g.hobbies_2, g.count) FILTER (WHERE g.hobbies_2 IS NOT NULL) AS hobbies_2
FROM Grouped g;

First uses GROUPING SETS to group and count different columns simultaneously, and then uses jsonb_object_agg to convert each record set into JSON separately. The code is quite cumbersome; And the column names must be written, which is less flexible; If you want to support dynamic column names, you need to use stored procedures, and the structure will become complex. SPL does not need to write column names: https://try.esproc.com/splx?3mF


 A

1

$select * from _temp_data.txt

2

=E@bp(A1).(json(E@p(E@b(~.groups(~;count(~))))(1)))

3

=A1.create().record(A2)

A1 Load data.

A2 Transpose the two-dimensional table A1 into a sequence of sequences, group and count each small sequence (corresponding to each column), and then convert it into Json. Function E is used to convert between sequences and table sequences. @p represents the transpose of the two-layer sequence, @b represents removing the column name. When converting a set of records to JSON, there is an extra [] symbol compared with converting a record. Here, (1) is used to take the unique record and then convert it.

A3 Create a new empty two-dimensional table according to the A1 structure and fill in the Json sequence A2.

Question sourcehttps://stackoverflow.com/questions/78184303/how-to-transform-and-summarize-row-counts