<p>I think you are struggling with the separation of domain/object modeling from database schema modeling. I too struggled with this when trying out MongoDB.</p>
<p>For the sake of semantics and clarity, I'm going to use the word <code>Categories</code> in place of <code>Groups</code>.</p>
<p>Essentially your theoretical model is a "many to many" relationship, in that each <code>Item</code> can belong to many <code>Categories</code>, and each <code>Category</code> can in turn possess many <code>Items</code>.</p>
<p>This is best handled in your domain object modeling, not in the DB schema, especially when implementing a document database (NoSQL). In your MongoDB schema you "fake" a "many to many" relationship by using a combination of top-level document models and embedding.</p>
<p>Embedding is hard to swallow for folks coming from SQL persistence back-ends, but it <strong><em>is</em></strong> an essential part of the answer. The trick is deciding whether it is shallow or deep, one-way or two-way, etc.</p>
<hr>
<p><strong>Top Level Document Models</strong></p>
<p>Because your <code>Category</code> documents contain some data of their own and are heavily referenced by a vast number of <code>Items</code>, I agree with you that fully embedding them inside each <code>Item</code> is unwise.</p>
<p>Instead, treat both <code>Item</code> and <code>Category</code> objects as top-level documents. Ensure that your MongoDB schema allots a separate collection (MongoDB's counterpart to a table) for each one, so that each document has its own <code>ObjectId</code>.</p>
<p>The next step is to decide where and how much to embed. There is no right answer, as it all depends on how you use the data and what your scaling ambitions are.</p>
<p><strong>Embedding Decisions</strong></p>
<p><em>1. Items</em></p>
<p>At minimum, your <code>Item</code> objects should have a collection (array) property for their categories. 
At the very least this collection should contain the <code>ObjectId</code> for each <code>Category</code>.</p>
<p>My suggestion would be to also add to this collection the data you use most often when interacting with the <code>Item</code>.</p>
<p>For example, suppose I want to list a bunch of items in a grid on my web page and show the names of the categories they are part of. Obviously I don't need to know everything about the <code>Category</code>, but if I only have the <code>ObjectId</code> embedded, a second query would be necessary to get any detail about it at all.</p>
<p>Instead, what would make the most sense is to embed the Category's <code>Name</code> property in the collection along with the <code>ObjectId</code>, so that after pulling back an <code>Item</code> I can display its category names without another query.</p>
<p>The biggest thing to remember is that the key/value objects embedded in your <code>Item</code> that "represent" a <code>Category</code> do not have to match the real <code>Category</code> document model. This is not OOP or relational database modeling.</p>
<p><em>2. Categories</em></p>
<p>In the reverse direction, you might choose to leave embedding one-way and not have any <code>Item</code> info in your <code>Category</code> documents, or you might choose to add a collection for Item data much like above (<code>ObjectId</code>, or <code>ObjectId</code> + <code>Name</code>).</p>
<p>In this direction, I would personally lean toward having nothing embedded. More than likely, if I want <code>Item</code> information for my category, I want lots of it, more than just a name, and deep-embedding a top-level document (Item) makes no sense. I would simply resign myself to querying the database for the Items whose collection of categories contains the <code>ObjectId</code> of my Category.</p>
<p>Phew... confusing for sure. 
The point is, you <em>will</em> have some data duplication and you <em>will</em> have to tweak your models to fit your usage for best performance. The good news is that this is exactly what MongoDB and other document databases are good at.</p>
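<p>To make the shapes above concrete, here is a minimal sketch in Python. Plain dictionaries stand in for the documents and plain strings stand in for real <code>ObjectId</code> values. Every field name, sample value, and helper function is made up for illustration and is not taken from your schema.</p>

```python
# Plain dicts stand in for MongoDB documents; string ids stand in for ObjectIds.
# All field names and sample values here are hypothetical.

# Top-level Category documents keep their own full data.
categories = [
    {"_id": "cat-1", "name": "Tools", "description": "Hand and power tools"},
    {"_id": "cat-2", "name": "Garden", "description": "Outdoor and garden supplies"},
]

# Top-level Item documents embed a shallow, one-way reference to each category:
# just the id plus the name that is needed for display.
items = [
    {
        "_id": "item-1",
        "name": "Cordless drill",
        "categories": [{"_id": "cat-1", "name": "Tools"}],
    },
    {
        "_id": "item-2",
        "name": "Hedge trimmer",
        "categories": [
            {"_id": "cat-1", "name": "Tools"},
            {"_id": "cat-2", "name": "Garden"},
        ],
    },
]


def category_names(item):
    """Return display names straight from the embedded data (no second query)."""
    return [ref["name"] for ref in item["categories"]]


def items_in_category(all_items, category_id):
    """In-memory stand-in for finding items by an embedded category id.

    Against a real database, a pymongo query along the lines of
        db.items.find({"categories._id": category_id})
    would do the same job, where `db.items` is an assumed collection name.
    """
    return [
        item
        for item in all_items
        if any(ref["_id"] == category_id for ref in item["categories"])
    ]
```

<p>Note that the embedded name is a duplicate of the one in the <code>Category</code> document, so under this layout renaming a category would also mean updating the copies stored inside the affected items. That is the data duplication trade-off mentioned above.</p>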
 
