<p>This idea uses the ELB's ability to detect an unhealthy node and remove it from the pool, BUT it relies on the ELB behaving as described in the assumptions below. This is something I've been meaning to test for myself but haven't had the time yet. I'll update the answer when I do.</p>
<p><strong>Process Overview</strong></p>
<p>The following logic could be wrapped in a script and run when the node needs to be shut down.</p>
<ol>
<li>Block new HTTP connections to nodeX but continue to allow existing connections.</li>
<li>Wait for existing connections to drain, either by monitoring existing connections to your application or by allowing a "safe" amount of time.</li>
<li>Initiate a shutdown of the nodeX EC2 instance using the EC2 API directly or abstracted scripts.</li>
</ol>
<p>Here "safe" is defined by your application, and for some applications it may not be possible to determine.</p>
<p><strong>Assumptions that need to be tested</strong></p>
<p>We know that ELB <a href="http://aws.amazon.com/elasticloadbalancing/" rel="noreferrer">removes unhealthy instances from its pool</a>. I would expect this to be graceful, so that:</p>
<ol>
<li>A new connection to a recently closed port will be gracefully redirected to the next node in the pool.</li>
<li>When a node is marked bad, the already established connections to that node are unaffected.</li>
</ol>
<p><strong>Possible test cases:</strong></p>
<ul>
<li>Fire HTTP connections at the ELB (e.g. from a curl script), logging the results during scripted opening and closing of one of the nodes' HTTP ports. You would need to experiment to find an acceptable amount of time that allows the ELB to always detect a state change.</li>
<li>Maintain a long HTTP session (e.g. a file download) while blocking new HTTP connections. The long session should hopefully continue.</li>
</ul>
<p><strong>1. How to block HTTP Connections</strong></p>
<p>Use a local firewall on nodeX to block new sessions but continue to allow established sessions.</p>
<p>For example, with iptables:</p>
<pre><code>iptables -A INPUT -j DROP -p tcp --syn --destination-port &lt;web service port&gt; </code></pre>
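<p>The three steps from the process overview could be sketched roughly as the shell functions below. This is an untested sketch under several assumptions: the function names, the <code>PORT</code>, <code>DRAIN_TIMEOUT</code>, <code>POLL_INTERVAL</code> and <code>INSTANCE_ID</code> variables and their defaults are all hypothetical, the node is assumed to have <code>ss</code> available for counting connections, and the AWS CLI is assumed to be installed and configured for the final step.</p>

```shell
#!/bin/sh
# Hypothetical settings; adjust them for your application.
PORT="${PORT:-80}"                    # web service port on nodeX
DRAIN_TIMEOUT="${DRAIN_TIMEOUT:-60}"  # the "safe" upper bound, in seconds
POLL_INTERVAL="${POLL_INTERVAL:-5}"   # seconds between connection checks
INSTANCE_ID="${INSTANCE_ID:-}"        # EC2 instance ID of nodeX

# Step 1: drop new TCP handshakes; established sessions keep flowing.
block_new_connections() {
    iptables -A INPUT -p tcp --syn --destination-port "$PORT" -j DROP
}

# Count established connections whose local port is $PORT.
# Assumes ss prints one header line, which tail skips.
count_established() {
    ss -tn state established "( sport = :$PORT )" | tail -n +2 | wc -l
}

# Step 2: poll until no connections remain or the timeout is reached.
# Returns 0 if drained, 1 if the timeout expired first.
wait_for_drain() {
    waited=0
    while [ "$(count_established)" -gt 0 ]; do
        if [ "$waited" -ge "$DRAIN_TIMEOUT" ]; then
            return 1
        fi
        sleep "$POLL_INTERVAL"
        waited=$((waited + POLL_INTERVAL))
    done
    return 0
}

# Step 3: stop the instance through the EC2 API (assumes the AWS CLI).
shutdown_node() {
    aws ec2 stop-instances --instance-ids "$INSTANCE_ID"
}
```

<p>A wrapper would call <code>block_new_connections</code>, then <code>wait_for_drain</code>, then <code>shutdown_node</code>, and decide for itself whether a drain timeout should abort the shutdown or proceed anyway.</p>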
1. Thanks for the ideas! Unfortunately, assumption number 2 seems to be the important one that is missing. As far as I know, a node stays around for about 40-60 seconds after being detected as unhealthy, with no guarantee. But sadly, the ELB then removes it immediately without any warning, and any existing connections are terminated rather than forwarded to another node. This is what I know, but I could try to experiment with it...
2. It's good that it detects the node as down and removes it; that's what we want. But also dropping the existing connections would certainly give us problems. I wouldn't rule this out without a test, because I've seen other load-balancing software work this way... Otherwise, are you able to use subdomains with the load balancer so that it only establishes the initial connection? E.g. balance.domain.com diverts to nodeX.domain.com, where nodeX is the next one in a round-robin pool, etc.
3. ELB itself doesn't support using subdomains, but a machine could know about its own name. I could even have a set of machines mapped to domain names via DNS entries, though I don't know how to do that automatically. Since most of what I pay is for running instances, and paused instances are pretty cheap, this may be an option. So I'd use the ELB for the initial distribution, and from then on maybe use the node a user has been assigned to. This may work! Any idea on how best to use subdomains instead of AWS machine URLs? (I want to use a wildcard SSL certificate for a single domain.)