MongoDB - Use aggregation framework or mapreduce for matching array of strings within documents (profile matching)
<p>I'm building an application that could be likened to a dating application.</p>
<p>I've got some documents with a structure like this:</p>
<blockquote> <p>$ db.profiles.find().pretty()</p> </blockquote>
<pre><code>[
  {
    "_id": 1,
    "firstName": "John",
    "lastName": "Smith",
    "fieldValues": [ "favouriteColour|red", "food|pizza", "food|chinese" ]
  },
  {
    "_id": 2,
    "firstName": "Sarah",
    "lastName": "Jane",
    "fieldValues": [ "favouriteColour|blue", "food|pizza", "food|mexican", "pets|yes" ]
  },
  {
    "_id": 3,
    "firstName": "Rachel",
    "lastName": "Jones",
    "fieldValues": [ "food|pizza" ]
  }
]
</code></pre>
<p>What I'm trying to do is identify profiles that match each other on one or more <code>fieldValues</code>.</p>
<p>So, in the example above, my ideal result would look something like:</p>
<pre><code>&lt;some query&gt;

result:
[
  {
    "_id": "507f1f77bcf86cd799439011",
    "dateCreated": "2013-12-01",
    "profiles": [
      {
        "_id": 1,
        "firstName": "John",
        "lastName": "Smith",
        "fieldValues": [ "favouriteColour|red", "food|pizza", "food|chinese" ]
      },
      {
        "_id": 2,
        "firstName": "Sarah",
        "lastName": "Jane",
        "fieldValues": [ "favouriteColour|blue", "food|pizza", "food|mexican", "pets|yes" ]
      }
    ]
  },
  {
    "_id": "356g1dgk5cf86cd737858595",
    "dateCreated": "2013-12-02",
    "profiles": [
      {
        "_id": 1,
        "firstName": "John",
        "lastName": "Smith",
        "fieldValues": [ "favouriteColour|red", "food|pizza", "food|chinese" ]
      },
      {
        "_id": 3,
        "firstName": "Rachel",
        "lastName": "Jones",
        "fieldValues": [ "food|pizza" ]
      }
    ]
  }
]
</code></pre>
<p>I've thought about doing this either as a map reduce, or with the aggregation framework.</p>
<p>Either way, the result would be persisted to a collection (as per the 'result' above).</p>
<p>My question is: which of the two is better suited? And where would I start to implement this?</p>
<p><strong>Edit</strong></p>
<p>In a nutshell, the model can't easily be changed.<br> This isn't like a 'profile' in the traditional sense.</p>
<p>What I'm basically looking to do (in pseudocode) is along the lines of:</p>
<pre><code>foreach profile in db.profiles.find()
    foreach otherProfile in db.profiles.find({"_id": {$ne: profile._id}})
        if profile.fieldValues matches any otherProfile.fieldValues
            // it's a match!
</code></pre>
<p>Obviously that kind of nested loop is very slow, since it compares every profile with every other profile.</p>
<p>It may also be worth mentioning that this data is never displayed; it's just a string value that's used for 'matching'.</p>
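One way to avoid the nested loop in the pseudocode above is to index profiles by each <code>fieldValues</code> entry first and only then pair up the profiles that share an entry. The following is a minimal in-memory sketch of that idea, not a definitive implementation: <code>findMatchingPairs</code>, <code>sharedValuePipeline</code>, <code>sampleProfiles</code> and the output field names <code>profiles</code>, <code>shared</code> and <code>profileIds</code> are hypothetical names chosen for illustration. The pipeline constant shows how the same grouping step might be expressed with the aggregation stages <code>$unwind</code>, <code>$group</code> and <code>$match</code>; it is not executed here and assumes the <code>profiles</code> collection from the question.

```javascript
// Hypothetical helper (not part of MongoDB): finds every pair of profiles
// that share at least one fieldValues entry. Assumes _id values are unique.
function findMatchingPairs(profiles) {
  // Inverted index: fieldValue -> list of profile _ids containing it.
  const idsByValue = new Map();
  for (const profile of profiles) {
    // A Set guards against a value being listed twice in one profile.
    for (const value of new Set(profile.fieldValues)) {
      if (!idsByValue.has(value)) idsByValue.set(value, []);
      idsByValue.get(value).push(profile._id);
    }
  }

  // Pair up ids that share a value. Ids are stored in input order, so the
  // key [earlier, later] is the same for every value the pair shares.
  const pairs = new Map();
  for (const [value, ids] of idsByValue) {
    for (let i = 0; i < ids.length; i++) {
      for (let j = i + 1; j < ids.length; j++) {
        const key = JSON.stringify([ids[i], ids[j]]);
        if (!pairs.has(key)) {
          pairs.set(key, { profiles: [ids[i], ids[j]], shared: [] });
        }
        pairs.get(key).shared.push(value);
      }
    }
  }
  return [...pairs.values()];
}

// Sketch of the equivalent grouping step as an aggregation pipeline.
// It would be passed to db.profiles.aggregate(...); it is not run here.
const sharedValuePipeline = [
  { $unwind: "$fieldValues" },
  { $group: { _id: "$fieldValues", profileIds: { $addToSet: "$_id" } } },
  // Keep only values held by at least two profiles.
  { $match: { "profileIds.1": { $exists: true } } }
];

// Usage with the sample documents from the question.
const sampleProfiles = [
  { _id: 1, fieldValues: ["favouriteColour|red", "food|pizza", "food|chinese"] },
  { _id: 2, fieldValues: ["favouriteColour|blue", "food|pizza", "food|mexican", "pets|yes"] },
  { _id: 3, fieldValues: ["food|pizza"] }
];
const matches = findMatchingPairs(sampleProfiles);
console.log(JSON.stringify(matches, null, 2));
```

The index is built in a single pass over the data, so the work is driven by how many profiles share each value rather than by comparing every profile with every other one. A very common value (such as <code>food|pizza</code> here) still produces many pairs, so the cost of the pairing step depends on how the values are distributed.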
 
