MongoDB Group aggregation in Java

In this post, we will learn a simple way to use the MongoDB $group aggregation stage in Java. First, we will write the aggregation pipeline in MongoDB itself, and then we will build the same pipeline in Java.

Let’s look at a sample document from a reports collection:

{
  "_id": {
    "$oid": "6315b348d2d8a2744558f444"
  },
  "headerMessage": "This is testing Header Message.",
  "job": {
    "jobId": "Z0-008",
    "jobName": "2020-RC Sick"
  },
  "reportId": 1,
  "lastUpdatedDateTime": {
    "$date": "2022-09-05T08:35:56.983Z"
  },
  "formDetail": {
    "formId": "6315ad6bd2d8a2744558f441",
    "formTitle": "New Test 1"
  }
}

There are many documents in the reports collection. We need to aggregate them so that we can count the total number of reports per job and also display the job information. The expected output is given below:

{
  "_id": "Z0-008",
  "totalReports": 10,
  "job": {
    "jobId": "Z0-008",
    "jobName": "2020-RC Sick or Vacation"
  }
}

MongoDB $group aggregation example

The syntax of $group aggregation is:


{
  $group:
    {
      _id: <expression>, // Group key
      <field1>: { <accumulator1> : <expression1> },
      ...
    }
}

Example:

[
  {
    "$group": {
      "_id": "$job.jobId",
      "totalReports": {
        "$sum": 1
      }
    }
  }
]

In the above example:

The value $job.jobId is the group key, that is, the value by which we want to group the reports. The $ sign tells MongoDB to read the actual value of that field from each original document.
The totalReports field is the output field that holds the accumulated count. It increases by 1 for every report found in the group.
The $sum operator adds the value we have defined, here 1, for each document in the group.

The output of the above query will be like below:

{
  "_id": "Z0-008",
  "totalReports": 10
}
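To make the counting logic concrete, here is a minimal plain-Java sketch of what the $group stage with $sum computes. It does not use MongoDB at all. The class name GroupCountSketch, the method countPerJob, and the sample job IDs are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of {"_id": "$job.jobId", "totalReports": {"$sum": 1}}.
// GroupCountSketch and countPerJob are hypothetical names; no MongoDB driver is involved.
class GroupCountSketch {

    // Each element of jobIds stands for the job.jobId value of one report document.
    static Map<String, Integer> countPerJob(List<String> jobIds) {
        Map<String, Integer> totals = new LinkedHashMap<>();
        for (String jobId : jobIds) {
            // Equivalent of {"$sum": 1}: add 1 to the running total of this group.
            totals.merge(jobId, 1, Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Illustrative sample values only.
        List<String> jobIds = List.of("Z0-008", "B-001A", "Z0-008");
        System.out.println(countPerJob(jobIds)); // prints {Z0-008=2, B-001A=1}
    }
}
```

Each key of the returned map plays the role of _id, and each value plays the role of totalReports.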

The $group stage did its job, but this is not the output that we expected because the job information is missing.

To get the expected output, we need to use the $first operator.

The $first operator is an accumulator operator in MongoDB aggregation. It can also be used as an array operator, but we are not going to discuss that here. We will use the $first operator as an accumulator in the $group stage of the aggregation pipeline.

To use the $first operator along with $group, we can take the following query as a reference:

[
  {
    "$group": {
      "_id": "$job.jobId",
      "totalReports": {
        "$sum": 1
      },
      "job": {
        "$first": "$job"
      }
    }
  }
]

Here, $first is the accumulator operator and $job refers to the job object of the original document. If we leave out the $ sign, MongoDB treats job as a plain string, so the output contains only the text job instead of the whole object. With the $ sign in place, the result of the above query will be the following:

{
  "_id": "Z0-008",
  "totalReports": 10,
  "job": {
    "jobId": "Z0-008",
    "jobName": "2020-RC Sick or Vacation"
  }
}
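The combination of $sum and $first can also be sketched in plain Java. This is only an illustration of the semantics, not how MongoDB implements it. GroupFirstSketch, JobSummary, and summarize are hypothetical names. Note that without a preceding sort stage, which document counts as "first" depends on the order in which documents reach the $group stage.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of $group with {"$sum": 1} and {"$first": "$job"}.
// All names here are hypothetical; no MongoDB driver is involved.
class GroupFirstSketch {

    // Accumulated state of one group.
    static final class JobSummary {
        int totalReports;        // accumulated like {"$sum": 1}
        Map<String, String> job; // captured like {"$first": "$job"}
    }

    // Each element of jobs stands for the job object of one report document.
    static Map<String, JobSummary> summarize(List<Map<String, String>> jobs) {
        Map<String, JobSummary> groups = new LinkedHashMap<>();
        for (Map<String, String> job : jobs) {
            JobSummary summary = groups.computeIfAbsent(job.get("jobId"), key -> {
                JobSummary created = new JobSummary();
                created.job = job; // $first: keep the job of the first document seen in this group
                return created;
            });
            summary.totalReports++; // $sum: 1
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Map<String, String>> jobs = List.of(
                Map.of("jobId", "Z0-008", "jobName", "2020-RC Sick or Vacation"),
                Map.of("jobId", "Z0-008", "jobName", "2020-RC Sick or Vacation"));
        JobSummary summary = summarize(jobs).get("Z0-008");
        System.out.println(summary.totalReports + " " + summary.job.get("jobName"));
        // prints 2 2020-RC Sick or Vacation
    }
}
```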

Let’s see the output of the query when the $ sign is left out of the job reference.

Query:

[
  {
    "$group": {
      "_id": "$job.jobId",
      "totalReports": {
        "$sum": 1
      },
      "job": {
        "$first": "job"
      }
    }
  }
]

Output:

{
  "_id": "Z0-008",
  "totalReports": 10,
  "job": "job"
}

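That is why it is very important to understand the use of the $ sign: with it, the string is a reference to a field, and without it, the string is just a literal value.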
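The difference between a field reference and a literal string can be sketched in plain Java as follows. This is a heavily simplified model that only handles dotted field paths. FieldPathSketch and evaluate are hypothetical names, and real MongoDB expressions support much more than this.

```java
import java.util.Map;

// Simplified model of how a string expression is interpreted:
// "$job.jobId" is a field path, "job" is a literal string.
// FieldPathSketch and evaluate are hypothetical names; no MongoDB driver is involved.
class FieldPathSketch {

    static Object evaluate(String expression, Map<String, Object> document) {
        if (!expression.startsWith("$")) {
            return expression; // no $ prefix: the string itself is the value
        }
        Object current = document;
        // Walk the dotted path, one nested object at a time.
        for (String part : expression.substring(1).split("\\.")) {
            if (!(current instanceof Map)) {
                return null; // simplification: a missing path yields null here
            }
            current = ((Map<?, ?>) current).get(part);
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> report = Map.of("job", Map.of("jobId", "Z0-008"));
        System.out.println(evaluate("$job.jobId", report)); // prints Z0-008
        System.out.println(evaluate("job", report));        // prints job
    }
}
```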

Implement $group aggregation in Java

To implement $group aggregation in Java, the MongoDB Java driver must be on the classpath. The driver provides the classes and interfaces that are required to connect to the MongoDB database and execute queries.

Let’s create an aggregation pipeline and add multiple stages one by one into the pipeline.

First, create the list that will hold the stages of the aggregation pipeline.

List<Bson> aggregationPipelines = new ArrayList<>();

Create a group stage of the aggregation pipeline by implementing the query we discussed earlier.

Bson groupStage = Aggregates.group("$job.jobId", Accumulators.sum("totalReports", 1),
				Accumulators.first("job", "$job"));

Now, add this group stage to our aggregationPipelines list.

aggregationPipelines.add(groupStage);

We can use this list of aggregationPipelines to execute aggregation using MongoCollection.

If you don’t know how to create a MongoCollection object, you can read our earlier post MongoCollection in Spring Boot. It is based on Spring Boot, but it will also help if you are building a standalone application.

The final code will look like the below:

// Import packages
import java.util.ArrayList;
import java.util.List;

import org.bson.conversions.Bson;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Accumulators;
import com.mongodb.client.model.Aggregates;
import com.mongodb.client.model.Filters; // only needed if you add a match stage

// Create stages such as match, group, and sort, then add them to aggregationPipelines, which is simply a list of Bson objects

List<Bson> aggregationPipelines = new ArrayList<>();

Bson groupStage = Aggregates.group("$job.jobId", Accumulators.sum("totalReports", 1),
				Accumulators.first("job", "$job"));

// aggregationPipelines.add(matchStage); // add a match stage if you have one
aggregationPipelines.add(groupStage);
// aggregationPipelines.add(sortStage); // add a sort stage if you have one

// Execute the aggregation pipeline with MongoCollection and collect the results.
// reportDataCollection is the MongoCollection for the reports collection and MyOutput is your own result class.
List<MyOutput> outputList = new ArrayList<>();
reportDataCollection.aggregate(aggregationPipelines, MyOutput.class).into(outputList);

The aggregate method of the MongoCollection interface executes the pipeline, and the into method adds the results to the outputList object. We can then use that list for further processing.
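The MyOutput class is not shown in this post, so here is one possible shape for it, written as a sketch. It assumes that a POJO codec is registered for the collection and that a property named id is mapped to the _id field of each result, which is worth verifying against your driver version. In a real project the class would normally be declared public in its own file. The nested Job class and all property names are assumptions derived from the expected output.

```java
// Hypothetical result type for the grouped output; the names mirror the fields of the expected output.
class MyOutput {

    // Hypothetical nested type for the job object produced by $first.
    static class Job {
        private String jobId;
        private String jobName;

        public String getJobId() { return jobId; }
        public void setJobId(String jobId) { this.jobId = jobId; }
        public String getJobName() { return jobName; }
        public void setJobName(String jobName) { this.jobName = jobName; }
    }

    private String id;        // the group key (assumed to map to _id)
    private int totalReports; // the value produced by $sum
    private Job job;          // the object produced by $first

    public String getId() { return id; }
    public void setId(String id) { this.id = id; }
    public int getTotalReports() { return totalReports; }
    public void setTotalReports(int totalReports) { this.totalReports = totalReports; }
    public Job getJob() { return job; }
    public void setJob(Job job) { this.job = job; }

    public static void main(String[] args) {
        // Build one result by hand to show how the fields fit together.
        Job job = new Job();
        job.setJobId("Z0-008");
        job.setJobName("2020-RC Sick or Vacation");

        MyOutput output = new MyOutput();
        output.setId("Z0-008");
        output.setTotalReports(10);
        output.setJob(job);

        System.out.println(output.getId() + ": " + output.getTotalReports() + " reports");
        // prints Z0-008: 10 reports
    }
}
```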

Conclusion

In this post, we learned to implement the MongoDB aggregation pipeline along with $group in Java.
