<p>A couple of points:</p>
<h3>You MUST MUST MUST back up your Amazon EBS volume.</h3>
<p>Amazon claims "better" reliability for EBS, but not 100%, and it is SEVERAL orders of magnitude short of S3's "11 9's" of durability. S3 durability >> EBS durability. That's a fact. EBS supports a "snapshots" feature, which backs up your storage efficiently and incrementally to S3. Also, with EBS snapshots you pay only for the compressed deltas, which are typically far, far smaller than the allocated volume size. In another life, I sent lost-volume emails to smaller customers like you who "thought" that EBS was "durable" and trusted it with the only copy of a mission-critical database... it's heartbreaking.</p>
<h3>Your Q: automating start-up of a new instance</h3>
<p>The design path you mention is relatively untraveled; here's why. Lots of companies run redundant "hot-spare" instances, where the second instance is already booted and running. This allows rapid failover (seconds) in the event of a "failure" (which could be hardware or software). The issue with a "cold spare" is that it's harder to keep the machine up to date and ready to pick up where the old box left off. More importantly, it's tricky to VALIDATE that the spare is capable of successfully recovering your production service. Hardware is more reliable than untested software systems. TEST TEST TEST. If you haven't tested your failover, it doesn't work.</p>
<p>The simple automation of starting a new EBS-backed instance is easy, bordering on trivial. It's just a one-line bash script calling <a href="http://aws.amazon.com/developertools/351" rel="nofollow">the EC2 command-line tools</a>. What's tricky is everything on top of that. Such a solution pretty much implies a 100% automated deployment process, and that is all specific to your application. Can your app pull down all the data it needs to run (maybe it's stored in S3)? Can you kill your instance today and boot a new instance with 0.000 manual setup/install steps?</p>
<p>Or, you may be talking about a scenario I'll call <strong><em>"re-instancing an EBS volume"</em></strong>:</p>
<ol>
<li>EC2 box dies (root volume is EBS)</li>
<li>Force-detach the EBS volume</li>
<li>Boot a new EC2 instance with the EBS volume</li>
</ol>
<p>... That <strong><em>mostly</em></strong> works. The gotchas:</p>
<ul>
<li>It doesn't protect against EBS failures, whether total volume loss or a loss of availability.</li>
<li>Recovery time is O(minutes), assuming everything works just right.</li>
<li>Your services need to be configured to restart automatically. It does no good to bring the box back if Nginx isn't running.</li>
<li>Your DNS routes and any other dependent services need to be OK with the IP address changing. This can be worked around with an Elastic IP.</li>
<li>How are your host SSH keys handled? The same hostname with a new host key can break SSH-based automation when it hits the strong host-key-changed warning.</li>
<li>I don't have proof of this (other than seeing it happen once), but I believe that EC2/EBS <em>already does this</em> automatically for boot-from-EBS instances.</li>
</ul>
<p>Again, the hard part here is on your plate. Can you stop your production service today and bring it up RELIABLY on a new instance? If so, the EC2 part of the story is <a href="http://aws.amazon.com/developertools/351" rel="nofollow">really really easy</a>.</p>
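<p>To make the "re-instancing" steps above concrete, here is a minimal dry-run sketch. It only builds the command lines and executes nothing. Several things here are assumptions and not part of the answer itself: it uses the current <code>aws ec2</code> CLI where the answer links to the older EC2 command-line tools; it assumes a replacement instance already exists in a stopped state with its own root volume detached; the device name <code>/dev/sda1</code> and all resource IDs are placeholders; and <code>recovery_plan</code> and <code>render</code> are hypothetical helper names. It also takes a snapshot first, in line with the backup advice at the top.</p>

```python
"""Dry-run sketch: build (but do not run) AWS CLI commands for re-instancing an EBS volume."""
import shlex


def recovery_plan(volume_id, spare_instance_id, allocation_id, device="/dev/sda1"):
    """Return the AWS CLI commands, in order, as argv lists. Nothing is executed."""
    ec2 = ["aws", "ec2"]
    return [
        # Start a snapshot first so a botched recovery cannot cost the only copy.
        ec2 + ["create-snapshot", "--volume-id", volume_id,
               "--description", "pre-recovery backup"],
        # Force-detach the volume from the dead instance.
        ec2 + ["detach-volume", "--volume-id", volume_id, "--force"],
        # Block until the volume is free to attach elsewhere.
        ec2 + ["wait", "volume-available", "--volume-ids", volume_id],
        # Attach it to the stopped replacement instance (device name is an assumption).
        ec2 + ["attach-volume", "--volume-id", volume_id,
               "--instance-id", spare_instance_id, "--device", device],
        # Boot the replacement and wait until it is running.
        ec2 + ["start-instances", "--instance-ids", spare_instance_id],
        ec2 + ["wait", "instance-running", "--instance-ids", spare_instance_id],
        # Move the Elastic IP so DNS and clients keep the same address.
        ec2 + ["associate-address", "--instance-id", spare_instance_id,
               "--allocation-id", allocation_id],
    ]


def render(plan):
    """Format the plan as shell-quoted lines for review before anyone runs it."""
    return "\n".join(shlex.join(cmd) for cmd in plan)
```

<p>Calling <code>print(render(recovery_plan("vol-EXAMPLE", "i-EXAMPLE", "eipalloc-EXAMPLE")))</code> prints the seven commands for review. Note that this covers only the EC2 side; the gotchas above (services restarting on boot, SSH host keys, a tested failover) are still yours to solve.</p>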