← back

$lookup
Collection Joins

FILE  20_aggregation_lookup
TOPIC  Standard Syntax · Correlated Subquery · let/$$vars · Index · Edge Cases
LEVEL  Foundation
01
What is $lookup
Left outer join between collections in the same DB
join

$lookup performs a left outer join from the current collection to another collection in the same database, attaching matched documents as an array field on each input document. It is the aggregation pipeline's equivalent of SQL's LEFT OUTER JOIN.

SQLMongoDB $lookup
LEFT OUTER JOIN orders ON users._id = orders.userId$lookup: { from:"orders", localField:"_id", foreignField:"userId", as:"orders" }
Inline columns from joined rowJoined docs added as an array field
NULL row if no matchEmpty array [] if no match
NOTE
The result of $lookup is always an array, regardless of how many documents match. Zero matches = []. One match = [{ ... }]. Follow with $unwind + preserveNullAndEmptyArrays: true to flatten a one-to-one join into an embedded object.
02
Standard Syntax
Simple equality join on a single field
syntax
{
  $lookup: {
    from:         "foreignCollection",  // target collection to join
    localField:   "localFieldName",     // field in the current collection
    foreignField: "foreignFieldName",   // matching field in the foreign collection
    as:           "outputArrayField"    // name of the new array field to add
  }
}

Basic Example — Orders with Customer Names

// orders: { _id, customerId, amount, status }
// customers: { _id, name, email }

db.orders.aggregate([
  { $match: { status: "completed" } },
  {
    $lookup: {
      from:         "customers",
      localField:   "customerId",    // orders.customerId
      foreignField: "_id",           // matches customers._id
      as:           "customerInfo"
    }
  }
])

// Result shape per order:
// {
//   _id: ..., customerId: "C001", amount: 150,
//   customerInfo: [ { _id:"C001", name:"Alice", email:"..." } ]
// }

When localField is an Array

// If localField is an array of IDs, $lookup matches each element
// orders: { tagIds: ["T1", "T2"] }
// tags: { _id, label }

db.orders.aggregate([
  { $lookup: { from:"tags", localField:"tagIds", foreignField:"_id", as:"tags" } }
])
// tags field = array of all tag docs whose _id is in tagIds
03
Advanced Syntax
Correlated subquery with let, pipeline, $$variables
advanced

The advanced form joins on multiple conditions and applies additional pipeline stages (filter, sort, limit) to the foreign collection before results are embedded. Requires MongoDB 3.6+.

{
  $lookup: {
    from: "foreignCollection",

    // Map local fields to variables for use inside the pipeline
    let: { varName: "$localField" },

    // Mini-pipeline that runs on the foreign collection
    pipeline: [
      {
        $match: {
          $expr: {
            $and: [
              { $eq: ["$foreignField", "$$varName"] }, // $$var = local var
              { $eq: ["$status", "active"] }           // additional condition
            ]
          }
        }
      },
      { $sort: { createdAt: -1 } },
      { $limit: 5 },
      { $project: { amount: 1, createdAt: 1, _id: 0 } }
    ],

    as: "outputArrayField"
  }
}

Variable Syntax Reference

NotationRefers To
"$field"Field in the foreign collection (inside the subquery pipeline)
"$$varName"Variable defined in the let block (from the local document)
"$$ROOT"The entire current document at this pipeline stage
"$$NOW"Current datetime (built-in system variable)
TIP
Always prefer the advanced pipeline form when you need to filter the joined data — it limits the array size before embedding, preventing 16MB BSON document limit violations. The standard form returns all matches with no size control.

Uncorrelated Subquery (no let)

// Runs once, embeds same result into every document (e.g., global config)
db.reports.aggregate([
  {
    $lookup: {
      from: "config",
      pipeline: [ { $match: { type: "global-settings" } } ],
      as: "settings"
    }
  }
])
04
Flattening with $unwind
Convert single-match array into embedded object
pattern

Since $lookup always returns an array, a one-to-one join results in a single-element array. Use $unwind + preserveNullAndEmptyArrays: true to flatten it into a direct embedded object.

// Without $unwind — customer is wrapped in an array
db.orders.aggregate([
  { $lookup: { from:"users", localField:"userId", foreignField:"_id", as:"user" } }
])
// { ..., user: [ { name: "Alice", ... } ] }  ← always an array

// With $unwind — direct embedded object
db.orders.aggregate([
  { $lookup: { from:"users", localField:"userId", foreignField:"_id", as:"user" } },
  {
    $unwind: {
      path: "$user",
      preserveNullAndEmptyArrays: true   // keep orders with no user match
    }
  }
])
// { ..., user: { name: "Alice", ... } }  ← flat object now
WARN
Without preserveNullAndEmptyArrays: true, any document with an empty as array (no foreign match) is silently dropped — converting your LEFT JOIN into an INNER JOIN. Always include it when unwinding a $lookup result.
05
Indexing Rules
Mandatory for performance — one full scan per document without index
performance

$lookup performs one lookup per document entering the stage. Without an index on the foreign field, MongoDB performs a full collection scan of the foreign collection for every local document — catastrophic at scale.

// Create index on the foreignField in the target collection BEFORE running $lookup
db.orders.createIndex({ customerId: 1 })

// Verify index usage with explain
db.users.aggregate([
  { $lookup: { from:"orders", localField:"_id", foreignField:"customerId", as:"orders" } }
]).explain("executionStats")
// Look for IXSCAN on orders collection, not COLLSCAN
ScenarioPerformance
Index on foreignFieldIXSCAN — O(log n) per lookup
No index on foreignFieldCOLLSCAN — O(n) per document = O(n²) total
_id as foreignFieldAlways indexed automatically
DANGER
10,000 documents passing through a $lookup against a 1M-document collection with no index on foreignField = 10 billion document scans. Always index the foreign field first.
06
Edge Cases
Null localField · 16MB limit · sharding · cross-database
gotchas
ScenarioBehaviorFix
No index on foreignField Full scan per document — O(n²) at scale Create index on foreignField in foreign collection
localField is null or missing Matches documents where foreignField is null — unexpected joins $match: { localField: { $exists:true, $ne:null } } before $lookup
Result document exceeds 16MB Pipeline fails with BSON size error Use advanced pipeline syntax with $limit to cap joined array
from collection is sharded Error — not supported Source collection can be sharded; from target cannot be
Cross-database join Not supported natively Both collections must be in the same database
Zero matches in foreign collection as = empty array [] Use $unwind with preserveNullAndEmptyArrays:true to keep these docs

Null localField Guard Pattern

// Pre-filter null/missing localField before $lookup to avoid ghost matches
db.orders.aggregate([
  { $match: { customerId: { $exists: true, $ne: null } } },
  {
    $lookup: {
      from: "customers",
      localField: "customerId",
      foreignField: "_id",
      as: "customer"
    }
  }
])
07
Full Example
Monthly report — group by month, join customer name
example
// Monthly sales report: group by month, attach top customer name
// orders: { _id, customerId, amount, status, createdAt }
// customers: { _id, name, email }

db.orders.aggregate([

  // 1. Filter: 2024 completed orders only — uses index on status+createdAt
  {
    $match: {
      status: "completed",
      createdAt: { $gte: ISODate("2024-01-01"), $lt: ISODate("2025-01-01") }
    }
  },

  // 2. Sort so $first picks the highest-amount order per month
  { $sort: { amount: -1 } },

  // 3. Group by year-month
  {
    $group: {
      _id:           { year: { $year: "$createdAt" }, month: { $month: "$createdAt" } },
      totalRevenue:  { $sum: "$amount" },
      orderCount:    { $sum: 1 },
      topCustomerId: { $first: "$customerId" },
      topAmount:     { $max: "$amount" }
    }
  },

  // 4. Join to get top customer's name — customers._id must be indexed
  {
    $lookup: {
      from:         "customers",
      localField:   "topCustomerId",
      foreignField: "_id",
      as:           "topCustomer"
    }
  },

  // 5. Flatten 1-to-1 array — preserveNullAndEmptyArrays keeps months with no customer match
  {
    $unwind: {
      path: "$topCustomer",
      preserveNullAndEmptyArrays: true
    }
  },

  // 6. Sort report chronologically
  { $sort: { "_id.year": 1, "_id.month": 1 } },

  // 7. Shape final output
  {
    $project: {
      _id: 0,
      year:  "$_id.year",
      month: "$_id.month",
      totalRevenue: 1,
      orderCount:   1,
      topOrder: {
        amount:   "$topAmount",
        customer: "$topCustomer.name"
      }
    }
  }

], { allowDiskUse: true })