
1. Long (and failing) bulk data loads to Google App Engine datastore
I'm developing an application on Google App Engine using the current django-nonrel and the now-default high replication datastore. I'm currently trying to bulk load a 180 MB CSV file locally into a dev instance with the following command:

```
appcfg.py upload_data --config_file=bulkloader.yaml --filename=../my_data.csv --kind=Place --num_threads=4 --url=http://localhost:8000/_ah/remote_api --rps_limit=500
```

**bulkloader.yaml**

```yaml
python_preamble:
- import: base64
- import: re
- import: google.appengine.ext.bulkload.transform
- import: google.appengine.ext.bulkload.bulkloader_wizard
- import: google.appengine.ext.db
- import: google.appengine.api.datastore
- import: google.appengine.api.users

transformers:
- kind: Place
  connector: csv
  connector_options:
    encoding: utf-8
    columns: from_header
  property_map:
    - property: __key__
      external_name: appengine_key
      export_transform: transform.key_id_or_name_as_string
    - property: name
      external_name: name
```

The bulk load actually succeeds for a truncated, 1,000-record version of the CSV, but the full set eventually bogs down and starts erroring, "backing off" and waiting longer and longer between retries. The bulkloader log that I tail doesn't reveal anything helpful, and neither does the server's stderr.

Any help in understanding this bulk load process would be appreciated. My plan is to eventually load large data sets into the Google datastore, but this isn't promising so far.
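For what it's worth, the only mitigation I've come up with so far is to throttle the loader much harder and record progress so a failed run can resume. This is a sketch only; the flag values below are guesses I haven't tested against this dataset, though the flags themselves are standard `appcfg.py upload_data` options:

```
# Throttled re-run: smaller batches, lower request rate, and an explicit
# progress database so an interrupted load can resume instead of restarting.
appcfg.py upload_data \
  --config_file=bulkloader.yaml \
  --filename=../my_data.csv \
  --kind=Place \
  --num_threads=1 \
  --batch_size=10 \
  --rps_limit=20 \
  --bandwidth_limit=250000 \
  --db_filename=bulkloader-progress.sql3 \
  --log_file=bulkloader.log \
  --url=http://localhost:8000/_ah/remote_api
```

With `--db_filename` pointing at a fixed file, re-running the same command should pick up from the last committed batch rather than re-uploading the whole 180 MB file, which at least makes the failures cheaper to iterate on.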
 
