SPL assists MongoDB: Only keep the running total for the last item in the partition

The data of a collection (named grp_core) in the Mongdb database is as follows:

[
  { "grp": "A", "seq": 1, "score": 1, x: 0 },
  { "grp": "A", "seq": 2, "score": 3, x: 0 },
  { "grp": "A", "seq": 3, "score": 2, x: 0 },
  { "grp": "A", "seq": 4, "score": 4, x: 0 }
  { "grp": "B", "seq": 1, "score": 5, x: 0 },
  { "grp": "B", "seq": 2, "score": 2, x: 0 },
  { "grp": "B", "seq": 3, "score": 4, x: 0 },
  { "grp": "B", "seq": 4, "score": 3, x: 0 }

]

Requirement: Group by GRP, sort by seq within the group, sum the score fields of each group, and write the sum result into the x field in the last row of the group.

Expected result:

[
  { "grp": "A", "seq": 1, "score": 1, x: 0 },
  { "grp": "A", "seq": 2, "score": 3, x: 0 },
  { "grp": "A", "seq": 3, "score": 2, x: 0 },
  { "grp": "A", "seq": 4, "score": 4, x: 10 }
  { "grp": "B", "seq": 1, "score": 5, x: 0 },
  { "grp": "B", "seq": 2, "score": 2, x: 0 },
  { "grp": "B", "seq": 3, "score": 4, x: 0 },
  { "grp": "B", "seq": 4, "score": 3, x: 14 }
]

When performing group summation in MongoDB queries, the values of each row need to be written, and then the values of the preceding rows can be assigned 0 row by row. Since we don’t know exactly how many rows each group has, we can only sort them in reverse order and keep the value of the first row. To implement the results exactly as what we expected, it is necessary to reverse the order again after completing these tasks. Due to the length of the code, we won’t do the last reversing here. The reference code is as follows:

db.collection.aggregate([
  {
    "$setWindowFields": {
      partitionBy: "$grp",
      sortBy: {
        seq: 1
      },
      output: {
        x: {
          $sum: "$score",
          window: {
            documents: [
              "unbounded",
              "current"
            ]
          }
        },
        ordering: {
          $documentNumber: {}
        }
      }
    }
  },
  {
    "$setWindowFields": {
      "partitionBy": "$grp",
      "sortBy": {
        "ordering": -1
      },
      "output": {
        "rank": {
          "$rank": {}
        }
      }
    }
  },
  {
    "$set": {
      "ordering": "$$REMOVE",
      "rank": "$$REMOVE",
      "x": {
        "$cond": {
          "if": {
            $eq: [
              1,
              "$rank"
            ]
          },
          "then": "$x",
          "else": 0
        }
      }
    }
  }
])

SPL not only supports retaining the grouping sets during grouping, but also the grouping sets are ordered, and each row can be accessed by sequence number or in reverse order. Therefore, it is easy to write the sum of each group into the x field of the last row with just one line of code:



A

1

=mongo_open@d("mongodb://127.0.0.1:27017/local")

2

=mongo_shell@d(A1, "{'find':'grp_score','projection':{'_id':0}}")

3

=mongo_close(A1)

4

=A2.group(grp;(a=~.sort(seq),a.m(-1).x=a.sum(score)))

5

=json(A2)

The first three lines are readings, the fourth line performs calculation(where m(-1) represents the last row), and the fifth line converts the result back to JSON format.


Question source: https://stackoverflow.com/questions/78552887/mongodb-aggregation-only-keep-the-running-total-for-the-last-item-in-the-partit