Google App Engine: How to write large files to Google Cloud Storage
<p>I am trying to save large files from Google App Engine's Blobstore to Google Cloud Storage to facilitate backup.</p>
<p>It works fine for small files (&lt;10 MB), but for larger files it becomes unstable and GAE throws a FileNotOpenedError.</p>
<p>My code:</p>
<pre><code>PATH = '/gs/backupbucket/'

for df in DocumentFile.all():
    fn = df.blob.filename
    br = blobstore.BlobReader(df.blob)
    write_path = files.gs.create(self.PATH+fn.encode('utf-8'), mime_type='application/zip',acl='project-private')
    with files.open(write_path, 'a') as fp:
        while True:
            buf = br.read(100000)
            if buf=="": break
            fp.write(buf)
    files.finalize(write_path)
</code></pre>
<p>(Runs in a task queue to avoid exceeding the execution time limit.)</p>
<p>Throws a FileNotOpenedError:</p>
<pre>Traceback (most recent call last):
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1511, in __call__
    rv = self.handle_exception(request, response, e)
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1505, in __call__
    rv = self.router.dispatch(request, response)
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1253, in default_dispatcher
    return route.handler_adapter(request, response)
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1077, in __call__
    return handler.dispatch()
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 547, in dispatch
    return self.handle_exception(e, self.app.debug)
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 545, in dispatch
    return method(*args, **kwargs)
  File "/base/data/home/apps/s~simplerepository/1.354754771592783168/processFiles.py", line 249, in post
    fp.write(buf)
  File "/base/python27_runtime/python27_lib/versions/1/google/appengine/api/files/file.py", line 281, in __exit__
    self.close()
  File "/base/python27_runtime/python27_lib/versions/1/google/appengine/api/files/file.py", line 275, in close
    self._make_rpc_call_with_retry('Close', request, response)
  File "/base/python27_runtime/python27_lib/versions/1/google/appengine/api/files/file.py", line 388, in _make_rpc_call_with_retry
    _make_call(method, request, response)
  File "/base/python27_runtime/python27_lib/versions/1/google/appengine/api/files/file.py", line 236, in _make_call
    _raise_app_error(e)
  File "/base/python27_runtime/python27_lib/versions/1/google/appengine/api/files/file.py", line 179, in _raise_app_error
    raise FileNotOpenedError()
</pre>
<p>I have investigated further and, according to a comment on <a href="http://code.google.com/p/googleappengine/issues/detail?id=5731" rel="nofollow">GAE Issue 5371</a>, the Files API closes the file every 30 seconds. I have not seen this documented anywhere else.</p>
<p>I tried to work around this by closing and reopening the file at intervals, with a 0.5-second pause between the close and the open (the code below is edited from the first version of this post). It now throws a WrongOpenModeError.</p>
<p>My code (updated):</p>
<pre><code>PATH = '/gs/backupbucket/'

for df in DocumentFile.all():
    fn = df.blob.filename
    br = blobstore.BlobReader(df.blob)
    write_path = files.gs.create(self.PATH+fn.encode('utf-8'), mime_type='application/zip',acl='project-private')
    fp = files.open(write_path, 'a')
    c = 0
    while True:
        if (c == 5):
            c = 0
            fp.close()
            files.finalize(write_path)
            time.sleep(0.5)
            fp = files.open(write_path, 'a')
        c = c + 1
        buf = br.read(100000)
        if buf=="": break
        fp.write(buf)
    files.finalize(write_path)
</code></pre>
<p>Stacktrace:</p>
<pre>Traceback (most recent call last):
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1511, in __call__
    rv = self.handle_exception(request, response, e)
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1505, in __call__
    rv = self.router.dispatch(request, response)
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1253, in default_dispatcher
    return route.handler_adapter(request, response)
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1077, in __call__
    return handler.dispatch()
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 547, in dispatch
    return self.handle_exception(e, self.app.debug)
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 545, in dispatch
    return method(*args, **kwargs)
  File "/base/data/home/apps/s~simplerepository/1.354894420907462278/processFiles.py", line 267, in get
    fp.write(buf)
  File "/base/python27_runtime/python27_lib/versions/1/google/appengine/api/files/file.py", line 310, in write
    self._make_rpc_call_with_retry('Append', request, response)
  File "/base/python27_runtime/python27_lib/versions/1/google/appengine/api/files/file.py", line 388, in _make_rpc_call_with_retry
    _make_call(method, request, response)
  File "/base/python27_runtime/python27_lib/versions/1/google/appengine/api/files/file.py", line 236, in _make_call
    _raise_app_error(e)
  File "/base/python27_runtime/python27_lib/versions/1/google/appengine/api/files/file.py", line 188, in _raise_app_error
    raise WrongOpenModeError()
</pre>
<p>I have tried to find information about the WrongOpenModeError, but the only place it is mentioned is in appengine.api.files.file.py itself.</p>
<p>Suggestions on how to get around this and save large files to Google Cloud Storage as well would be greatly appreciated. Thanks!</p>
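The close-and-reopen workaround described in the question can be sketched with plain Python stand-ins instead of the real Files API. `copy_in_chunks`, `open_for_append`, and `AppendSink` are hypothetical names introduced only for illustration; `open_for_append` plays the role of `files.open(write_path, 'a')`. The one deliberate difference from the updated snippet is that nothing is finalized between reopens. The assumption here, which the post itself does not confirm, is that a finalized file can no longer be opened for appending, so finalizing would happen once, after the whole copy completes.

```python
def copy_in_chunks(reader, open_for_append, chunk_size=100000, chunks_per_open=5):
    """Copy `reader` to an append-only sink, reopening the sink periodically.

    `open_for_append` is a hypothetical zero-argument callable that returns a
    writable context manager (a stand-in for files.open(write_path, 'a')).
    Returns the total number of bytes copied.
    """
    total = 0
    done = False
    while not done:
        # Each handle is used for a bounded number of writes, then closed
        # by the `with` block before the next one is opened.
        with open_for_append() as fp:
            for _ in range(chunks_per_open):
                buf = reader.read(chunk_size)
                if not buf:
                    done = True
                    break
                fp.write(buf)
                total += len(buf)
    # No finalize here: the caller would finalize exactly once, afterwards.
    return total


class AppendSink(object):
    """In-memory stand-in for an append-only file; counts how often it is opened."""

    def __init__(self):
        self.data = bytearray()
        self.opens = 0

    def __enter__(self):
        self.opens += 1
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        return False  # do not swallow exceptions

    def write(self, buf):
        self.data.extend(buf)
```

With the real API, the intended wiring would be to pass `lambda: files.open(write_path, 'a')` as `open_for_append` and call `files.finalize(write_path)` a single time after `copy_in_chunks` returns. That wiring is untested here and rests on the assumption stated above.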