Erlang: Cannot start slave - {error,timeout}
<p>I'm currently trying to set up a distributed Tsung load testing environment which uses the Erlang slave functionality. However, I have been unable to get the controller node to start a slave node. For example:</p>
<pre><code>(musicglue@load1)1&gt; net:ping(musicglue@load2).
pong
(musicglue@load1)2&gt; slave:start(load2,musicglue,"-setcookie tom").
{error,timeout}
</code></pre>
<h2>BACKGROUND</h2>
<p>My env:</p>
<ul>
<li>Controller - hostname: load1, user: musicglue, Ubuntu 10.04 LTS, Erlang R15B01 compiled from source</li>
<li>Slave - hostname: load2, user: musicglue, Ubuntu 10.04 LTS, Erlang R15B01 compiled from source</li>
<li>Firewall disabled</li>
<li>SELinux not installed</li>
</ul>
<p>Things that are working:</p>
<ul>
<li>I can SSH from load1 onto load2 and vice versa</li>
<li>I can start erl sessions on load1 and load2</li>
<li>I can start an erl session on load2 from load1 with <code>ssh load2 erl</code></li>
<li>I can successfully ping load2 from load1 from an erl session using the same cookie on both nodes.</li>
</ul>
<p>Ping output:</p>
<pre><code>musicglue@load1:~$ erl -rsh ssh -sname musicglue -setcookie tom
Erlang R15B01 (erts-5.9.1) [source] [64-bit] [smp:4:4] [async-threads: 0] [hipe] [kernel-poll:false]

Eshell V5.9.1 (abort with ^G)
(musicglue@load1)1&gt; net:ping(musicglue@load2).
pong
</code></pre>
<h2>THE ISSUE</h2>
<p>My problem occurs when attempting to start a slave session from load1 on load2:</p>
<pre><code>musicglue@load1:~$ erl -rsh ssh -sname musicglue -setcookie tom
Erlang R15B01 (erts-5.9.1) [source] [64-bit] [smp:4:4] [async-threads: 0] [hipe] [kernel-poll:false]

Eshell V5.9.1 (abort with ^G)
(musicglue@load1)1&gt; net:ping(musicglue@load2).
pong
(musicglue@load1)2&gt; slave:start(load2,musicglue,"-setcookie tom").
{error,timeout}
</code></pre>
<p>Here is the output I get from epmd when I run the slave:start command:</p>
<pre><code>epmd: Thu May 24 10:01:57 2012: Non-local peer connected
epmd: Thu May 24 10:01:57 2012: opening connection on file descriptor 4
epmd: Thu May 24 10:01:57 2012: got 12 bytes
***** 00000000 00 0a 7a 6d 75 73 69 63 67 6c 75 65 |..zmusicglue|
epmd: Thu May 24 10:01:57 2012: ** got PORT2_REQ
epmd: Thu May 24 10:01:57 2012: got 2 bytes
***** 00000000 77 01 |w.|
epmd: Thu May 24 10:01:57 2012: ** sent PORT2_RESP (error) for "musicglue"
epmd: Thu May 24 10:01:57 2012: closing connection on file descriptor 4
epmd: Thu May 24 10:01:57 2012: Local peer connected
epmd: Thu May 24 10:01:57 2012: opening connection on file descriptor 4
epmd: Thu May 24 10:01:57 2012: got 24 bytes
***** 00000000 00 16 78 ca d6 4d 00 00 05 00 05 00 09 6d 75 73 |..x..M.......mus|
***** 00000010 69 63 67 6c 75 65 00 00 | icglue..|
epmd: Thu May 24 10:01:57 2012: ** got ALIVE2_REQ
epmd: Thu May 24 10:01:57 2012: registering 'musicglue:1', port 51926
epmd: Thu May 24 10:01:57 2012: type 77 proto 0 highvsn 5 lowvsn 5
epmd: Thu May 24 10:01:57 2012: got 4 bytes
***** 00000000 79 00 00 01 | y...|
epmd: Thu May 24 10:01:57 2012: ** sent ALIVE2_RESP for "musicglue"
epmd: Thu May 24 10:01:57 2012: unregistering 'musicglue:1', port 51926
epmd: Thu May 24 10:01:57 2012: closing connection on file descriptor 4
</code></pre>
<p>Any help or suggestions would be much appreciated.</p>
<p>Many thanks</p>
<h2>EDIT</h2>
<p>I should also mention that I can see the ssh connection being successfully acknowledged by load2 but then immediately disconnecting:</p>
<pre><code>May 30 13:49:27 load2 sshd[16169]: Accepted publickey for musicglue from 173.45.236.182 port 51843 ssh2
May 30 13:49:27 load2 sshd[16171]: Received disconnect from 173.45.236.182: 11: disconnected by user
</code></pre>
<p>In response to the comments below, I have also tried to start the slave using a different node name for the slave:</p>
<pre><code>musicglue@load1:~$ erl -rsh ssh -sname musicglue -setcookie tom
Erlang R15B01 (erts-5.9.1) [source] [64-bit] [smp:4:4] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.9.1 (abort with ^G)
(musicglue@load1)1&gt; slave:start(load2,bar,"-setcookie tom").
{error,timeout}
</code></pre>
<p>and for the controller:</p>
<pre><code>musicglue@load1:~$ erl -rsh ssh -sname foo -setcookie tom
Erlang R15B01 (erts-5.9.1) [source] [64-bit] [smp:4:4] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.9.1 (abort with ^G)
(foo@load1)1&gt; slave:start(load2,musicglue,"-setcookie tom").
{error,timeout}
</code></pre>
<p>and for both:</p>
<pre><code>musicglue@load1:~$ erl -rsh ssh -sname foo -setcookie tom
Erlang R15B01 (erts-5.9.1) [source] [64-bit] [smp:4:4] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.9.1 (abort with ^G)
(foo@load1)1&gt; slave:start(load2,bar,"-setcookie tom").
{error,timeout}
</code></pre>
<p>But to no avail.</p>
<h2>SOLUTION</h2>
<p>It turns out that the slave was unable to SSH onto the controller and therefore could not respond to any commands.</p>
<p>After fixing this part of the communication between the two nodes, everything worked perfectly.</p>
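<p>For anyone hitting the same timeout, a minimal sketch of a check for the situation described in the solution: it tests non-interactive SSH from the controller to the slave, and then from the slave back to the controller through that first session. The function name <code>check_ssh_round_trip</code> and the <code>SSH_CMD</code> variable are hypothetical names made up for this illustration, and the sketch assumes key-based login for the same user on both hosts.</p>

```shell
#!/bin/sh
# Hypothetical helper, not part of Erlang or Tsung.
# SSH_CMD is an assumed override so the ssh binary can be swapped out.
SSH_CMD="${SSH_CMD:-ssh}"

# check_ssh_round_trip CONTROLLER SLAVE
# Run on the controller. Returns 0 only if SSH works in both directions
# without prompting (BatchMode disables password/passphrase prompts).
check_ssh_round_trip() {
    controller="$1"   # e.g. load1
    slave="$2"        # e.g. load2

    # Forward direction: controller -> slave
    if ! $SSH_CMD -o BatchMode=yes -o ConnectTimeout=5 "$slave" true; then
        echo "FAIL: cannot ssh to $slave"
        return 1
    fi

    # Reverse direction: slave -> controller, executed on the slave
    if ! $SSH_CMD -o BatchMode=yes -o ConnectTimeout=5 "$slave" \
        "ssh -o BatchMode=yes -o ConnectTimeout=5 $controller true"; then
        echo "FAIL: $slave cannot ssh back to $controller"
        return 1
    fi

    echo "OK: $controller <-> $slave"
}
```

<p>With the hosts from the question this would be called as <code>check_ssh_round_trip load1 load2</code> on load1. A failure on the second step would match the cause described in the solution; a pass does not rule out other causes of <code>{error,timeout}</code>.</p>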
 
