$group is the aggregation pipeline's equivalent of SQL GROUP BY. It consumes all input documents and collapses them into one output document per unique _id value. Accumulators compute values across all documents in each group.
Basic Syntax
{
$group: {
_id: <expression>, // REQUIRED — the grouping key
<fieldName>: { <accumulator>: <expr> } // one or more accumulated fields
}
}
// Group orders by status, count and sum
db.orders.aggregate([
{
$group: {
_id: "$status",
orderCount: { $sum: 1 },
totalAmount: { $sum: "$amount" },
avgAmount: { $avg: "$amount" }
}
}
])
// Output — one document per distinct status value:
// { _id: "completed", orderCount: 142, totalAmount: 58240, avgAmount: 410.1 }
// { _id: "pending", orderCount: 31, totalAmount: 9800, avgAmount: 316.1 }
What Survives After $group
$group is destructive — only the _id field and explicitly accumulated fields survive in the output. All other fields from the input documents are gone.
// Input documents have: _id, name, status, amount, region, createdAt // After $group on status — only _id and declared accumulators remain { $group: { _id: "$status", count: { $sum: 1 } } } // Output: { _id: "completed", count: 142 } // Gone: name, amount, region, createdAt — no reference to originals
$group, the only way to reference original document fields is through accumulator outputs. If you need a scalar field from the source (like the document's name), use $first or $last to capture it during grouping. You cannot reach back to original fields after the stage is past.