<p>So as you pointed out in your question, there are two approaches that I can see to your issue:</p> <ol> <li>Use an external service to render and host the images.</li> <li>Render and host the images yourself.</li> </ol> <p>I'm no expert in this field, but my Googling has so far only returned services that generate thumbnails, not full-size screenshots (like the few mentioned <a href="https://stackoverflow.com/questions/947642/create-a-website-screenshot-thumbnail-server-side">here</a>). If there are hosted services out there that will do this for you, I wasn't able to find them easily.</p> <p>So, that leaves #2. For this, my first instinct was to look for a Ruby library that could generate an image from a webpage, which quickly led me to <a href="https://github.com/csquared/IMGKit" rel="nofollow noreferrer">IMGKit</a> (there may be others, but this one looked clean and simple). With this library, you can pass in a URL and it will use the WebKit engine to generate a screenshot of the page for you. From there, I would save it to wherever your assets are stored (like <a href="https://github.com/marcel/aws-s3" rel="nofollow noreferrer">Amazon S3</a>) using a file attachment gem like <a href="https://github.com/thoughtbot/paperclip" rel="nofollow noreferrer">Paperclip</a> or <a href="https://github.com/jnicklas/carrierwave" rel="nofollow noreferrer">CarrierWave</a> (<a href="http://railscasts.com/episodes/253-carrierwave-file-uploads" rel="nofollow noreferrer">railscast</a>). Store your attachment with a field recording the original URL you passed to IMGKit from WSAPI (Web Search API) so that you can compare against it on subsequent searches and use the cached version instead of re-rendering the preview. You can also use the <code>created_at</code> field on your attachment model to add some "if older than x days, refresh the image" logic. 
Lastly, I'd put this all in a background job using something like <a href="https://github.com/defunkt/resque" rel="nofollow noreferrer">resque</a> (<a href="http://railscasts.com/episodes/271-resque" rel="nofollow noreferrer">railscast</a>) so that the user isn't blocked while waiting for screenshots to render. Pass the array of URLs returned from WSAPI to background workers in resque, which generate the images via IMGKit and save them to S3 via paperclip/carrierwave. All of these projects are well-documented, and the Railscasts will walk you through the basics of the resque and carrierwave gems.</p> <p>I haven't crunched the numbers, but you can compare the cost of hosting the images yourself on <a href="http://aws.amazon.com/s3/#pricing" rel="nofollow noreferrer">S3</a> against that of any external provider of web thumbnail generation. Of course, doing it yourself gives you full control over how the image looks (quality, format, etc.), whereas most of the services I've come across only offer a small thumbnail, so there's something to be said for that. If you don't cache the images from previous searches, then your storage costs drop even further, since you'll always be rendering the images on the fly. However, I suspect that this won't scale very well, as you may end up paying a lot more for server power (for IMGKit and image processing) and bandwidth (for external requests to fetch the source HTML for IMGKit). I'd be sure to include some <a href="http://newrelic.com/" rel="nofollow noreferrer">metrics</a> in your project to attach exact numbers to the kind of requests you're dealing with and help determine what the subsequent costs would be.</p> <p>Anywho, that would be my high-level approach. I hope it helps some.</p>
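<p>To make the caching idea a bit more concrete, here is a minimal sketch of the "use the cached version unless it's older than x days" logic in plain Ruby. Everything in it is an assumption for illustration: <code>ScreenshotCache</code>, <code>Entry</code>, <code>max_age_days</code> and the injected <code>renderer</code> are hypothetical names, and the in-memory hash only stands in for a real attachment model with <code>url</code> and <code>created_at</code> columns. In a real app the renderer would wrap whatever IMGKit call you settle on, and <code>fetch</code> would run inside a background worker.</p>

```ruby
# Hypothetical in-memory stand-in for an attachment model that stores
# the source URL, the rendered image, and a created_at timestamp.
class ScreenshotCache
  Entry = Struct.new(:url, :image, :created_at)

  SECONDS_PER_DAY = 24 * 60 * 60

  # renderer: any callable that takes a URL and returns image data.
  #           (Assumption: in a real app this would wrap IMGKit.)
  # clock:    callable returning the current Time; injectable for testing.
  def initialize(renderer:, max_age_days: 7, clock: -> { Time.now })
    @renderer = renderer
    @max_age  = max_age_days * SECONDS_PER_DAY
    @clock    = clock
    @entries  = {}
  end

  # Returns the cached image for this URL if it is still fresh;
  # otherwise renders it again and replaces the stored entry.
  def fetch(url)
    entry = @entries[url]
    return entry.image if entry && fresh?(entry)

    image = @renderer.call(url)
    @entries[url] = Entry.new(url, image, @clock.call)
    image
  end

  private

  # An entry is fresh while its age is below the configured maximum.
  def fresh?(entry)
    @clock.call - entry.created_at < @max_age
  end
end

# Usage with a fake renderer so the sketch runs without any gems.
cache = ScreenshotCache.new(renderer: ->(url) { "fake image bytes for #{url}" })
cache.fetch("http://example.com") # renders and stores
cache.fetch("http://example.com") # served from the cache
```

<p>Passing the renderer in as a lambda keeps the freshness check separate from the rendering itself, so the same logic works whether the image comes from IMGKit, an external service, or a test stub.</p>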