
Optimizing S3 image file access for image processing, within a Django app
<p>For the django apps I typically build, S3 is a no-brainer for storing any non-trivial static data... most notably images. It makes page loads much faster than they would otherwise be. I use the S3BotoStorage filesystem backend in the <a href="https://bitbucket.org/david/django-storages/wiki/Home" rel="nofollow">django-storages package</a> and I have found it to be fantastically transparent and hassle-free w/r/t implementation.</p> <p>Not so much w/r/t operation, though: now, I'm building out a small family of apps, which all depend on a Django-centric image-processing platform. Most of the processor-bound operations I'm doing can be handled within an HTTP request lifecycle; for the few processes that are more demanding, I use an async signal queue and a RESTful API to defuse potential bottlenecks through timing and UI considerations.</p> <p>That's all great when working with image data local to the processing app. S3 throws a monkey wrench into it by making all file-object operations totally nondeterministic. The problem isn't failures (I get a random IOError or somesuch from inside the django-storages app maybe once a week), but the time it takes to access files, and the total lack of any sort of filesystem cache.</p> <p>I've done a bit of refactoring to support S3 -- scrubbing all absolute paths out of the codebase; implementing retries and workarounds for uncoöperative Boto requests -- my impetus for building out the signal queue, in fact, was to mitigate the S3 file-access overhead (the details of which I will spare you). The point is that if I'm supporting S3, I'd like to support it in the most awesome/productive way possible.</p> <p>Naturally, I don't want to screw things up or complicate things further by putting in the kind of caching layer that will take babysitting -- I'm after a straightforward (and preferably uncomplicated) way to speed up the file-object operations I'm performing on S3 images. For example, if I read from a given file object several times within a reasonable timeframe, it'd be great if the subsequent reads were cached enough so that each read didn't have to fetch the file anew from S3.</p> <p>Does anyone have a module recommendation, a sample implementation, a configuration tactic, or any combo of the above, with which I might address my S3 file-op woes?</p>
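To illustrate the kind of caching the question is asking about, here is a minimal sketch of a TTL read-through cache. The `CachedReader` class and its `fetch` callable are hypothetical names, not part of django-storages; in a real app, `fetch` would be something like `lambda name: default_storage.open(name).read()` against the S3 backend.

```python
import time
from typing import Callable, Dict, Tuple

class CachedReader:
    """Keep file contents in memory for a short TTL so that repeated
    reads of the same key within the window skip the slow backend
    fetch (e.g. an S3 round-trip). Hypothetical helper for illustration."""

    def __init__(self, fetch: Callable[[str], bytes], ttl: float = 60.0):
        self._fetch = fetch    # slow backend read, e.g. an S3 GET
        self._ttl = ttl        # seconds an entry stays fresh
        self._cache: Dict[str, Tuple[float, bytes]] = {}

    def read(self, name: str) -> bytes:
        now = time.monotonic()
        entry = self._cache.get(name)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]            # cache hit: no backend call
        data = self._fetch(name)       # miss or stale: fetch anew
        self._cache[name] = (now, data)
        return data
```

An in-process dict like this only helps within one worker; for a cache shared across processes you would store the bytes via Django's cache framework (`django.core.cache`) or wrap the storage backend itself, but the read-through-with-TTL shape stays the same.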