Best Practices

Good Rules

  • Favor embedding unless there is a compelling reason not to.
  • Needing to access an object on its own is a compelling reason not to embed it.
  • Avoid joins and lookups if possible, but don't be afraid to use them if they provide a better schema design.
  • Arrays should not grow without bound.
    • If there are more than a couple of hundred documents on the "many" side, don't embed them.
    • If there are more than a few thousand documents on the "many" side, don't use an array of ObjectId references.
    • High-cardinality arrays are a compelling reason not to embed.
  • How you model your data depends entirely on your particular application's data access patterns. You want to structure your data to match the ways that your application queries and updates it.
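
The cardinality rules above can be read as a small decision procedure. The following is a minimal sketch, not an official rule set: the function name, the numeric cutoffs (200 and 2000), and the "parent-reference" fallback (storing a reference to the parent on each "many" document) are assumptions chosen for illustration, since the rules only say "a couple of hundred" and "a few thousand".

```python
# Hypothetical cutoffs standing in for "a couple of hundred" and "a few thousand".
EMBED_LIMIT = 200
REFERENCE_ARRAY_LIMIT = 2000


def choose_strategy(many_side_count: int, accessed_on_its_own: bool = False) -> str:
    """Suggest how to model the "many" side of a relationship.

    Returns one of:
      "embed"            - store the related documents inside the parent
      "reference-array"  - store an array of ids on the parent
      "parent-reference" - store the parent's id on each "many" document (assumed fallback)
    """
    if many_side_count > REFERENCE_ARRAY_LIMIT:
        # Too many items even for an array of references on the parent.
        return "parent-reference"
    if many_side_count > EMBED_LIMIT or accessed_on_its_own:
        # Either the array would grow too large to embed, or the object
        # needs to be read on its own, so keep it in a separate collection.
        return "reference-array"
    return "embed"


print(choose_strategy(3))                            # small, only read via the parent
print(choose_strategy(3, accessed_on_its_own=True))  # small, but queried on its own
print(choose_strategy(1_000_000))                    # unbounded growth
```

Treat the result as a starting point only; the last rule still applies, and the application's actual query and update patterns should have the final say.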

Guidelines based on Relationships

  • One-to-One - Prefer key-value pairs within the document
  • One-to-Few - Prefer embedding
  • One-to-Many - Prefer embedding
  • One-to-Squillions - Prefer referencing
  • Many-to-Many - Prefer referencing
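
To make the five cases concrete, here is a sketch of what the document shapes might look like, written as plain Python dicts so it runs without a database. All collection names, field names, and values are hypothetical, and string ids stand in for ObjectId values.

```python
# One-to-One: plain key-value pairs inside the document.
user = {"_id": "u1", "name": "Ada", "email": "ada@example.com"}

# One-to-Few: embed the small, bounded list in the parent.
user_with_addresses = {
    "_id": "u1",
    "name": "Ada",
    "addresses": [
        {"city": "London", "country": "UK"},
        {"city": "Paris", "country": "FR"},
    ],
}

# One-to-Many: still embedded, as long as the array stays bounded.
product = {
    "_id": "p1",
    "name": "Bicycle",
    "parts": [{"sku": "wheel-26", "qty": 2}, {"sku": "saddle-01", "qty": 1}],
}

# One-to-Squillions: each "many" document holds a reference to its parent,
# so the parent document never grows.
host = {"_id": "h1", "hostname": "web-01"}
log_messages = [
    {"_id": "m1", "host_id": "h1", "text": "started"},
    {"_id": "m2", "host_id": "h1", "text": "ready"},
]

# Many-to-Many: each side keeps an array of references to the other side.
users = [{"_id": "u1", "name": "Ada", "task_ids": ["t1", "t2"]}]
tasks = [
    {"_id": "t1", "title": "Write docs", "owner_ids": ["u1"]},
    {"_id": "t2", "title": "Review PR", "owner_ids": ["u1"]},
]
```

The embedded shapes can be read in a single fetch of the parent, while the referenced shapes need a second query (or a lookup) to resolve the ids.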

Relational vs Document-Based DB

There's an inherent difference between the two approaches. Relational database schemas encourage you to normalize your data to reduce duplication. As a result, you create multiple tables and use joins to query data across related tables.

Document-based schema design encourages a fluid schema, one that works for your specific app requirements. If you query some related data regularly, you are encouraged to embed that related data in a single document instead of relying on a lookup.
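
The contrast can be sketched with in-memory data, assuming a hypothetical blog with authors and posts. The normalized version stores the author once and resolves it with an application-level join; the embedded version carries the author data inside each post. Names and fields are illustrative only.

```python
# Normalized (relational style): two "tables" linked by a foreign key.
authors = {"a1": {"name": "Ada"}}
posts = [
    {"id": "p1", "author_id": "a1", "title": "Schema design"},
    {"id": "p2", "author_id": "a1", "title": "Indexes"},
]


def titles_with_author_joined(posts, authors):
    """Resolve each post's author through a second lookup (a join)."""
    return [
        {"title": p["title"], "author": authors[p["author_id"]]["name"]}
        for p in posts
    ]


# Embedded (document style): the author data travels with each post.
embedded_posts = [
    {"_id": "p1", "title": "Schema design", "author": {"name": "Ada"}},
    {"_id": "p2", "title": "Indexes", "author": {"name": "Ada"}},
]


def titles_with_author_embedded(posts):
    """Read everything from the post itself; no second lookup needed."""
    return [{"title": p["title"], "author": p["author"]["name"]} for p in posts]


joined = titles_with_author_joined(posts, authors)
embedded = titles_with_author_embedded(embedded_posts)
```

Both functions produce the same result. The embedded form avoids the extra lookup on every read, at the cost of duplicating the author's name in each post, which is the duplication that normalization is designed to remove.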