Optimizing Pymongo read queries
<p>I have created a huge MongoDB database. Its stats are as follows:</p>
<pre><code>&gt; db.stats()
{
    "db" : "test-sample-db",
    "collections" : 3,
    "objects" : 30700242,
    "avgObjSize" : 607.1807849592847,
    "dataSize" : 18640597036,
    "storageSize" : 19531558816,
    "numExtents" : 31,
    "indexes" : 2,
    "indexSize" : 2692111520,
    "fileSize" : 25691160576,
    "nsSizeMB" : 16,
    "dataFileVersion" : {
        "major" : 4,
        "minor" : 5
    },
    "ok" : 1
}
</code></pre>
<p>I am using pymongo with Python 2.7 on Windows 7 (64-bit, 8 GB RAM) to query this database. Note that an index is created on a field called "key". The query is simple: I just want all the documents having the specified keys. I used $in for that, as follows:</p>
<pre><code>result = testdb.find({"key": {"$in": lt}})
for doc in result:
    pass
</code></pre>
<p>The size of "lt" above is approximately 1000, so this query will return at most 1000 documents. What I noticed is that this query is quite slow: it takes about 5-6 seconds to run the query and iterate over the cursor as shown in the code above. Is there any way I can optimize this to make the operation much faster?</p>
<p>Sample document:</p>
<pre><code>{"key" : "abcd12xx", "data" : {"w1" : 1, "w3": 1, "w4" : 3}}
</code></pre>
<p>Output of query.explain() (note that this is for just 10 keys, to save space here; normally I will have 1000 keys, so the output will be larger):</p>
<pre><code>{
    u'nYields': 0,
    u'nscannedAllPlans': 19,
    u'allPlans': [{
        u'cursor': u'BtreeCursor feature_1 multi',
        u'indexBounds': {
            u'feature': [
                [u'1000', u'1000'], [u'1001', u'1001'], [u'1002', u'1002'],
                [u'1003', u'1003'], [u'1004', u'1004'], [u'1005', u'1005'],
                [u'1006', u'1006'], [u'1007', u'1007'], [u'1008', u'1008'],
                [u'1009', u'1009']
            ]
        },
        u'nscannedObjects': 10,
        u'nscanned': 19,
        u'n': 10
    }],
    u'millis': 0,
    u'nChunkSkips': 0,
    u'server': u'server:27017',
    u'n': 10,
    u'cursor': u'BtreeCursor feature_1 multi'
}
</code></pre>
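To make the 5-6 second figure above reproducible, the measurement could be wrapped in a small helper. This is only a minimal sketch, not part of the original post: `time_in_query` and `FakeCollection` are hypothetical names, the code is written for Python 3 rather than the Python 2.7 mentioned above, and the in-memory stand-in exists only so the snippet runs without a MongoDB server. With a real pymongo collection, the same `find({"key": {"$in": ...}})` call from the question would be timed instead.

```python
import time


def time_in_query(collection, keys):
    """Run the $in query from the question and time the full cursor iteration.

    `collection` is anything with a pymongo-style find(filter) method.
    Returns (number of documents iterated, elapsed seconds).
    """
    start = time.perf_counter()
    count = 0
    # Same filter as in the question: match documents whose "key" is in `keys`.
    for _doc in collection.find({"key": {"$in": list(keys)}}):
        count += 1
    elapsed = time.perf_counter() - start
    return count, elapsed


class FakeCollection:
    """Hypothetical in-memory stand-in for a pymongo collection (illustration only)."""

    def __init__(self, docs):
        self._docs = docs

    def find(self, query):
        # Supports only the {"key": {"$in": [...]}} filter shape used above.
        wanted = set(query["key"]["$in"])
        return iter([d for d in self._docs if d["key"] in wanted])


if __name__ == "__main__":
    docs = [{"key": str(i), "data": {"w1": 1}} for i in range(5000)]
    lt = [str(i) for i in range(1000, 2000)]
    n, seconds = time_in_query(FakeCollection(docs), lt)
    print("iterated %d documents in %.4f s" % (n, seconds))
```

Timing the whole loop, as the question does, includes both the server-side lookup and the transfer and decoding of documents, so comparing this figure with the `millis` field from `explain()` may help show where the time goes.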
 
