Amazon EC2 On-Demand Workers for Short Tasks
    <p>I am looking to build a web application which needs to run resource-intensive MCMC (<a href="http://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo" rel="noreferrer">Markov chain Monte Carlo</a>) calculations on-demand in R to generate some probability graphs for the user.</p>
    <p>Constraints:</p>
    <ol>
    <li><p>Obviously I don't want to run the resource-intensive calculations on the same server as the web app front-end, so these tasks need to be handed off to a <strong>worker instance</strong>.</p></li>
    <li><p>These calculations take a good amount of CPU to run and I'd like to keep latency as low as possible (hopefully seconds, not minutes), so I would prefer to run the calculations on <strong>beefier hardware</strong>.</p></li>
    <li><p>I cannot afford to run a beefy EC2 instance at ~66¢/hr x 24hrs/day, so <strong>on-demand</strong> or spot request instances are probably necessary.</p></li>
    </ol>
    <p>Here are the options I've come up with:</p>
    <ol>
    <li><p>Run a cheap worker instance 24 hours a day which takes one task at a time, managed by Amazon SWF (or SQS).<br /> <strong>Cons:</strong></p>
    <ul>
    <li><em>high latency</em> - Cheaper hardware, longer wait times.</li>
    </ul></li>
    <li><p>Spawn a beefier worker instance per task (spun up whenever a job is added to the queue) and terminate the instance upon completion.<br /> <strong>Cons:</strong></p>
    <ul>
    <li><em>expensive/wasteful</em> - I'd be paying for an hour on the server each time and only using seconds for my calculation.</li>
    <li><em>startup overhead</em> - Would spinning up a new EC2 instance on-demand introduce non-negligible latency (defeating the whole purpose of using beefier hardware)?</li>
    </ul></li>
    <li><p>Like #2 but with low-bid EC2 spot requests.<br /> <strong>Cons:</strong></p>
    <ul>
    <li><em>startup overhead</em> - See #2.</li>
    <li><em>inconsistency?</em> - I've never worked with spot requests before, so I have no idea how volatile or hands-on such a solution would be. Do I have to continually adjust my bids to make sure I can still get tasks done at peak hours? Also, I suppose I'd have to monitor my processes closely to make sure they aren't interrupted mid-calculation.</li>
    </ul></li>
    <li><p>Some kind of hybrid solution where I actively monitor beefy-hardware worker instances and their loads, and intelligently spin up and terminate instances on the hour to maintain an optimal balance of cost and availability.<br /> <strong>Cons:</strong></p>
    <ul>
    <li><em>complicated and costly setup</em> - Unless there's a good managed service out there to handle stuff like this, I'd have to set all of that infrastructure up myself.</li>
    </ul></li>
    </ol>
    <p>I wish there were some service where I could pay for highly available on-demand hardware on a minute-to-minute basis rather than hourly.</p>
    <p>So my questions are the following:</p>
    <ul>
    <li><p>How would you recommend solving this problem?</p></li>
    <li><p>Is there a good EC2 instance managing solution that could sit on top of Amazon SWF and help me load balance and terminate idle workers?</p></li>
    <li><p>Would spot-request bids solve my problem, or are they more suited to tasks which don't necessarily need to be completed right away?</p></li>
    </ul>
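
To make the cost trade-off between option 1-style always-on workers and option 2 concrete, here is a rough back-of-the-envelope sketch. It uses only the ~66¢/hr figure and the "billed a full hour per launch" premise from the question. The function names and the tasks-per-day values are hypothetical, and it assumes every task launches a fresh instance with no reuse between tasks.

```python
# Rough cost comparison, in whole cents to avoid floating-point noise.
# RATE_CENTS_PER_HOUR is the "~66 cents/hr" figure quoted in the question.
RATE_CENTS_PER_HOUR = 66


def always_on_daily_cost_cents(rate=RATE_CENTS_PER_HOUR):
    """Cost of keeping one beefy instance running 24 hours a day."""
    return rate * 24


def per_task_daily_cost_cents(tasks_per_day, rate=RATE_CENTS_PER_HOUR,
                              billed_hours_per_task=1):
    """Cost of launching a fresh instance per task.

    Assumption: each launch is billed as a full hour (the premise in the
    question), even though the calculation itself takes only seconds.
    """
    return tasks_per_day * billed_hours_per_task * rate


def break_even_tasks_per_day(rate=RATE_CENTS_PER_HOUR, billed_hours_per_task=1):
    """Tasks per day at which per-task launches cost as much as always-on."""
    return always_on_daily_cost_cents(rate) // (rate * billed_hours_per_task)


if __name__ == "__main__":
    for tasks in (5, 24, 100):  # hypothetical daily workloads
        print(tasks, per_task_daily_cost_cents(tasks), always_on_daily_cost_cents())
    print("break-even tasks/day:", break_even_tasks_per_day())
```

Under these assumptions, per-task launches only beat an always-on instance while the daily task count stays below the break-even point, which is why reusing a running instance for the rest of its billed hour (option 4) is attractive.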