    <p>Do you actually have the following three indexes defined on your collection?</p>
<pre><code>db.markers.ensureIndex({ latlng: '2d', _id: -1 })
db.markers.ensureIndex({ latlng: '2d' })
db.markers.ensureIndex({ _id: -1 })
</code></pre>
<p>The <a href="http://www.mongodb.org/display/DOCS/Geospatial+Indexing#GeospatialIndexing-CreatingtheIndex">geospatial indexing</a> docs advise against creating multiple geo indexes on the same collection. Although MongoDB will allow it, the behavior may be undesirable. My guess for your case is that the non-compound <code>{latlng: '2d'}</code> index may have been selected for use instead of the compound index. The <code>explain()</code> output doesn't really help us here, since it simply reports <code>GeoBrowse-box</code> instead of the index name; however, I would suggest manually <a href="http://www.mongodb.org/display/DOCS/Optimization#Optimization-Hint">hinting</a> that the cursor use the compound index and seeing if the results improve. Alternatively, simply get rid of the non-compound index, so that <code>{latlng: '2d', _id:-1}</code> becomes the obvious and only choice for the query optimizer.</p>
<p>Lastly, the <code>{_id: -1}</code> index is redundant and can be removed. Per the <a href="http://www.mongodb.org/display/DOCS/Indexes#Indexes-CompoundKeys">compound index</a> documentation, direction is only relevant when dealing with indexes composed of multiple fields. A single-key index can be walked backwards or forwards easily enough. Since MongoDB already creates an <code>{_id: 1}</code> index for us by default, it's more efficient to simply rely on that.</p>
<p>Now, with indexing out of the way: one caveat with your query is that limits are applied to the geospatial query component before sorting by non-geo criteria (<code>_id</code> in your case). I believe this means that, while your results will indeed be sorted by <code>_id</code>, that sort may not be considering all documents within the matched bounds. 
This is mentioned in the <a href="http://www.mongodb.org/display/DOCS/Geospatial+Indexing#GeospatialIndexing-CompoundIndexes">compound index</a> bit of the documentation, which references <a href="https://jira.mongodb.org/browse/SERVER-4247">SERVER-4247</a> as a pending solution.</p>
<hr>
<p><strong>Edit</strong>: Following up with your benchmark</p>
<p>I populated the example data, which consists of 260k random points with coordinates within ±90 and ±180. I then ran your query:</p>
<pre><code>db.markers.find(
    { latlng: { $within: { $box: [[-90, -180], [90, 180]] }}},
    { latlng: 1, _id: 1 }
).sort({_id: -1}).limit(1000).explain()
</code></pre>
<p>That took 1713ms (I'll use that as the baseline for comparison instead of your time of 2351ms). I'll also note that the query matched all 260k documents and scanned the same number of index entries. It appears the limit didn't factor in until the <code>_id</code> sort, which is not what I would have expected based on the note <a href="http://www.mongodb.org/display/DOCS/Geospatial+Indexing#GeospatialIndexing-CompoundIndexes">here</a>. I then tweaked the query a bit to examine some other cases:</p>
<ul>
<li>Original query without the <code>_id</code> sort and limit: <code>nscanned</code> is 260k and time is 1470ms.</li>
<li>Original query without the <code>_id</code> sort: <code>nscanned</code> is 1000 and time is 9ms.</li>
<li>Original query without the limit: <code>nscanned</code> is 260k and time is 2567ms.</li>
</ul>
<p>I also wanted to test sorting on an unindexed field alone to simulate what might happen for the <code>_id</code> sort after a geo match; however, I couldn't use <code>_id</code> since the default index will always exist. To do this, I deleted the compound geo index and then sorted by the <code>latlng</code> object. This resulted in <code>nscanned</code> of 260k and a time of 1039ms. 
When I added a limit of 1000, the time was 461ms.</p>
<p>If we add the 1039ms of the unlimited sort to the 1470ms above (geo query without a sort and limit), the total is very close to the original query without a limit, which was 2567ms. Likewise, if we add 461ms (<a href="http://en.wikipedia.org/wiki/Selection_algorithm#Selecting_k_smallest_or_largest_elements">limited sort</a>) to 1470ms, it's near the original benchmark result of 1713ms. Based on that correlation, I'd wager that the <code>_id</code> sort in your benchmark isn't taking advantage of the compound index at all.</p>
<p>In any event, another reason the benchmark is slow is the very wide geo match. Tighter bounds would definitely result in less data to sort, even with that sort being unindexed. That said, I do think <a href="https://jira.mongodb.org/browse/SERVER-4247">SERVER-4247</a> would help you, since it would likely process the non-geo sort first before performing the geo match.</p>
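<p>To make the limit-versus-sort caveat concrete, here is a minimal plain-JavaScript sketch. It does not talk to MongoDB at all: the in-memory <code>markers</code> array, the <code>withinBox</code> helper, and the tiny data set are all assumptions made up for illustration. It only shows why applying the limit to the geo match before sorting by <code>_id</code> can return different documents than sorting every match first.</p>

```javascript
// Hypothetical in-memory stand-in for the markers collection (not real MongoDB).
const markers = [];
for (let i = 0; i < 20; i++) {
  // Scatter the ids so the scan order differs from _id order.
  markers.push({ _id: (i * 7) % 20, latlng: [i - 10, i * 2 - 20] });
}

// Hypothetical helper: true if doc.latlng lies inside [[minLat, minLng], [maxLat, maxLng]].
function withinBox(doc, [[minLat, minLng], [maxLat, maxLng]]) {
  const [lat, lng] = doc.latlng;
  return lat >= minLat && lat <= maxLat && lng >= minLng && lng <= maxLng;
}

const byIdDesc = (a, b) => b._id - a._id;
const box = [[-90, -180], [90, 180]];
const limit = 5;

// Case 1: the limit is applied to the geo match, and only those documents are sorted.
const limitThenSort = markers
  .filter((d) => withinBox(d, box))
  .slice(0, limit)
  .sort(byIdDesc);

// Case 2: every geo match is sorted by _id, and the limit is applied afterwards.
const sortThenLimit = markers
  .filter((d) => withinBox(d, box))
  .sort(byIdDesc)
  .slice(0, limit);

const ids = (docs) => docs.map((d) => d._id);
console.log("limit then sort:", ids(limitThenSort));
console.log("sort then limit:", ids(sortThenLimit));
```

<p>Both result sets come back in descending <code>_id</code> order, but only the second is guaranteed to contain the highest ids among all matched documents, which is the behavior the query in the question presumably wants.</p>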