How to prevent ndb from batching a put_async() call and make it issue the RPC immediately?
<p>I have a request handler that updates an entity, saves it to the datastore, then needs to perform some additional work before returning (like queuing a background task and JSON-serializing some results). I want to parallelize this code, so that the additional work is done while the entity is being saved.</p>
<p>Here's what my handler code boils down to:</p>
<pre class="lang-py prettyprint-override"><code>import webapp2
from google.appengine.api import taskqueue
from google.appengine.ext import ndb


class FooHandler(webapp2.RequestHandler):
    @ndb.toplevel
    def post(self):
        foo = yield Foo.get_by_id_async(some_id)
        # Do some work with foo

        # Don't yield, as I want to perform the code that follows
        # while foo is being saved to the datastore.
        # I'm in a toplevel, so the handler will not exit as long as
        # this async request is not finished.
        foo.put_async()

        taskqueue.add(...)
        json_result = generate_result()
        self.response.headers["Content-Type"] = "application/json; charset=UTF-8"
        self.response.write(json_result)
</code></pre>
<p>However, Appstats shows that the <code>datastore.Put</code> RPC is being done serially, after <code>taskqueue.Add</code>:</p>
<p><img src="https://i.stack.imgur.com/vbvX4.png" alt="Appstats screenshot"></p>
<p>A little digging around in <code>ndb/context.py</code> shows that a <code>put_async()</code> call ends up being added to an <code>AutoBatcher</code> instead of the RPC being issued immediately.</p>
<p>So I presume that the <code>_put_batcher</code> ends up being flushed when the <code>toplevel</code> waits for all async calls to complete.</p>
<p>I understand that batching puts has real benefits in certain scenarios, but in my case I really want the put RPC to be sent immediately, so I can perform other work while the entity is being saved.</p>
<p>If I do <code>yield foo.put_async()</code>, then I get the same waterfall in Appstats, but with <code>datastore.Put</code> being done before the rest:</p>
<p><img src="https://i.stack.imgur.com/1MpaY.png" alt="2nd Appstats screenshot"></p>
<p>This is to be expected, as <code>yield</code> makes my handler wait for the <code>put_async()</code> call to complete before executing the rest of the code.</p>
<p>I have also tried adding a call to <code>ndb.get_context().flush()</code> right after <code>foo.put_async()</code>, but the <code>datastore.Put</code> and <code>taskqueue.BulkAdd</code> calls are still not being made in parallel according to Appstats.</p>
<p>So my question is: how can I force the call to <code>put_async()</code> to bypass the auto batcher and issue the RPC immediately?</p>
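<p>To make the behaviour I think I'm seeing concrete, here is a toy model of it. This is only a sketch of my understanding, not ndb's actual code: <code>ToyBatcher</code> and <code>handler</code> are hypothetical names, and the "RPCs" are just strings appended to a list. It only models the ordering of calls, not real concurrency.</p>

```python
# Toy model only: this is NOT ndb's AutoBatcher, just an illustration of
# deferred batching. All names here are hypothetical.


class ToyBatcher(object):
    """Queues work items and only 'sends' them when flush() is called."""

    def __init__(self, log):
        self.log = log    # shared list recording the order of "RPCs"
        self.queue = []   # items waiting to be sent

    def put_async(self, item):
        # Nothing is sent here; the item is only queued.
        self.queue.append(item)

    def flush(self):
        # One batched "RPC" for everything queued so far.
        if self.queue:
            self.log.append("datastore.Put(%s)" % ", ".join(self.queue))
            self.queue = []


def handler():
    log = []
    batcher = ToyBatcher(log)
    batcher.put_async("foo")     # queued, not sent
    log.append("taskqueue.Add")  # the later work is issued first
    batcher.flush()              # stands in for the toplevel waiting at the end
    return log


if __name__ == "__main__":
    print(handler())
```

<p>In this model the put only goes out at the final flush, after the task queue call, which matches the order I see in the first Appstats screenshot.</p>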
 
