StackOverflow2013

Note that there are some explanatory texts on larger screens.

plurals

POHow do you programmatically configure hazelcast for the multicast discovery mechanism?
primarykey
Id
20385973
data
AcceptedAnswerId
20716919
AnswerCount
7
ClosedDate
CommentCount
6
CommunityOwnedDate
2013-12-19T12:06:10.750
CreationDate
2013-12-04T21:07:15.807
FavoriteCount
5
LastActivityDate
2014-06-30T12:31:18.473
LastEditDate
2014-06-30T12:31:18.473
LastEditorUserId
1703313
OwnerUserId
750378
ParentId
0
PostTypeId
1
Score
21
ViewCount
31934
LastEditorDisplayName
text
Body
How do you programmatically configure hazelcast for the multicast discovery mechanism? <hr> Details: The <a href="http://www.hazelcast.com/docs/1.9.4/manual/multi_html/ch11.html#ConfigFullTcpIp">documentation</a> only supplies an example for TCP/IP and is out-of-date: it uses Config.setPort(), which no longer exists. My configuration looks like this, but discovery does not work (i.e. I get the output <code>"Members: 1"</code>: <pre><code> Config cfg = new Config(); NetworkConfig network = cfg.getNetworkConfig(); network.setPort(PORT_NUMBER); JoinConfig join = network.getJoin(); join.getTcpIpConfig().setEnabled(false); join.getAwsConfig().setEnabled(false); join.getMulticastConfig().setEnabled(true); join.getMulticastConfig().setMulticastGroup(MULTICAST_ADDRESS); join.getMulticastConfig().setMulticastPort(PORT_NUMBER); join.getMulticastConfig().setMulticastTimeoutSeconds(200); HazelcastInstance instance = Hazelcast.newHazelcastInstance(cfg); System.out.println("Members: "+hazelInst.getCluster().getMembers().size()); </code></pre> <hr> <h3>Update 1, taking asimarslan's answer into account</h3> If I fumbled with the MulticastTimeout, I either get <code>"Members: 1"</code> or <blockquote> Dec 05, 2013 8:50:42 PM com.hazelcast.nio.ReadHandler WARNING: [192.168.0.9]:4446 [dev] hz._hzInstance_1_dev.IO.thread-in-0 Closing socket to endpoint Address[192.168.0.7]:4446, Cause:java.io.EOFException: Remote socket closed! Dec 05, 2013 8:57:24 PM com.hazelcast.instance.Node SEVERE: [192.168.0.9]:4446 [dev] Could not join cluster, shutting down! com.hazelcast.core.HazelcastException: Failed to join in 300 seconds! </blockquote> <hr> <h3>Update 2, taking pveentjer's answer about using tcp/ip into account</h3> If I change the configuration to the following, I still only get 1 member: <pre><code>Config cfg = new Config(); NetworkConfig network = cfg.getNetworkConfig(); network.setPort(PORT_NUMBER); JoinConfig join = network.getJoin(); join.getMulticastConfig().setEnabled(false); join.getTcpIpConfig().addMember("192.168.0.1").addMember("192.168.0.2"). addMember("192.168.0.3").addMember("192.168.0.4"). addMember("192.168.0.5").addMember("192.168.0.6"). addMember("192.168.0.7").addMember("192.168.0.8"). addMember("192.168.0.9").addMember("192.168.0.10"). addMember("192.168.0.11").setRequiredMember(null).setEnabled(true); //this sets the allowed connections to the cluster? necessary for multicast, too? network.getInterfaces().setEnabled(true).addInterface("192.168.0.*"); HazelcastInstance instance = Hazelcast.newHazelcastInstance(cfg); System.out.println("debug: joined via "+join+" with "+hazelInst.getCluster() .getMembers().size()+" members."); </code></pre> More precisely, this run produces the output <blockquote> debug: joined via JoinConfig{multicastConfig=MulticastConfig [enabled=false, multicastGroup=224.2.2.3, multicastPort=54327, multicastTimeToLive=32, multicastTimeoutSeconds=2, trustedInterfaces=[]], tcpIpConfig=TcpIpConfig [enabled=true, connectionTimeoutSeconds=5, members=[192.168.0.1, 192.168.0.2, 192.168.0.3, 192.168.0.4, 192.168.0.5, 192.168.0.6, 192.168.0.7, 192.168.0.8, 192.168.0.9, 192.168.0.10, 192.168.0.11], requiredMember=null], awsConfig=AwsConfig{enabled=false, region='us-east-1', securityGroupName='null', tagKey='null', tagValue='null', hostHeader='ec2.amazonaws.com', connectionTimeoutSeconds=5}} with 1 members. </blockquote> My non-hazelcast-implementation is using UDP multicasts and works fine. So can a firewall really be the problem? <hr> <h3>Update 3, taking pveentjer's answer about checking the network into account</h3> Since I do not have permissions for iptables or to install iperf, I am using <code>com.hazelcast.examples.TestApp</code> to check whether my network is working, as described in <a href="http://books.google.de/books?id=tzAa_MxjNrsC&pg=PT12&lpg=PT12&dq=%22getting+started+with+hazelcast%22">Getting Started With Hazelcast</a> in Chapter 2, Section "Showing Off Straight Away": I call <code>java -cp hazelcast-3.1.2.jar com.hazelcast.examples.TestApp</code> on 192.168.0.1 and get the output <pre><code>...Dec 10, 2013 11:31:21 PM com.hazelcast.instance.DefaultAddressPicker INFO: Prefer IPv4 stack is true. Dec 10, 2013 11:31:21 PM com.hazelcast.instance.DefaultAddressPicker INFO: Picked Address[192.168.0.1]:5701, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5701], bind any local is true Dec 10, 2013 11:31:22 PM com.hazelcast.system INFO: [192.168.0.1]:5701 [dev] Hazelcast Community Edition 3.1.2 (20131120) starting at Address[192.168.0.1]:5701 Dec 10, 2013 11:31:22 PM com.hazelcast.system INFO: [192.168.0.1]:5701 [dev] Copyright (C) 2008-2013 Hazelcast.com Dec 10, 2013 11:31:22 PM com.hazelcast.instance.Node INFO: [192.168.0.1]:5701 [dev] Creating MulticastJoiner Dec 10, 2013 11:31:22 PM com.hazelcast.core.LifecycleService INFO: [192.168.0.1]:5701 [dev] Address[192.168.0.1]:5701 is STARTING Dec 10, 2013 11:31:24 PM com.hazelcast.cluster.MulticastJoiner INFO: [192.168.0.1]:5701 [dev] Members [1] { Member [192.168.0.1]:5701 this } Dec 10, 2013 11:31:24 PM com.hazelcast.core.LifecycleService INFO: [192.168.0.1]:5701 [dev] Address[192.168.0.1]:5701 is STARTED </code></pre> I then call <code>java -cp hazelcast-3.1.2.jar com.hazelcast.examples.TestApp</code> on 192.168.0.2 and get the output <pre><code>...Dec 10, 2013 9:50:22 PM com.hazelcast.instance.DefaultAddressPicker INFO: Prefer IPv4 stack is true. Dec 10, 2013 9:50:22 PM com.hazelcast.instance.DefaultAddressPicker INFO: Picked Address[192.168.0.2]:5701, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5701], bind any local is true Dec 10, 2013 9:50:23 PM com.hazelcast.system INFO: [192.168.0.2]:5701 [dev] Hazelcast Community Edition 3.1.2 (20131120) starting at Address[192.168.0.2]:5701 Dec 10, 2013 9:50:23 PM com.hazelcast.system INFO: [192.168.0.2]:5701 [dev] Copyright (C) 2008-2013 Hazelcast.com Dec 10, 2013 9:50:23 PM com.hazelcast.instance.Node INFO: [192.168.0.2]:5701 [dev] Creating MulticastJoiner Dec 10, 2013 9:50:23 PM com.hazelcast.core.LifecycleService INFO: [192.168.0.2]:5701 [dev] Address[192.168.0.2]:5701 is STARTING Dec 10, 2013 9:50:23 PM com.hazelcast.nio.SocketConnector INFO: [192.168.0.2]:5701 [dev] Connecting to /192.168.0.1:5701, timeout: 0, bind-any: true Dec 10, 2013 9:50:23 PM com.hazelcast.nio.TcpIpConnectionManager INFO: [192.168.0.2]:5701 [dev] 38476 accepted socket connection from /192.168.0.1:5701 Dec 10, 2013 9:50:28 PM com.hazelcast.cluster.ClusterService INFO: [192.168.0.2]:5701 [dev] Members [2] { Member [192.168.0.1]:5701 Member [192.168.0.2]:5701 this } Dec 10, 2013 9:50:30 PM com.hazelcast.core.LifecycleService INFO: [192.168.0.2]:5701 [dev] Address[192.168.0.2]:5701 is STARTED </code></pre> So multicast discovery is generally working on my cluster, right? Is 5701 also the port for discovery? Is <code>38476</code> in the last output an ID or a port? Joining still does not work for my own code with programmatical configuration :( <hr> <h3>Update 4, taking pveentjer's answer about using the default configuration into account</h3> The modified TestApp gives the output <pre><code>joinConfig{multicastConfig=MulticastConfig [enabled=true, multicastGroup=224.2.2.3, multicastPort=54327, multicastTimeToLive=32, multicastTimeoutSeconds=2, trustedInterfaces=[]], tcpIpConfig=TcpIpConfig [enabled=false, connectionTimeoutSeconds=5, members=[], requiredMember=null], awsConfig=AwsConfig{enabled=false, region='us-east-1', securityGroupName='null', tagKey='null', tagValue='null', hostHeader='ec2.amazonaws.com', connectionTimeoutSeconds=5}} </code></pre> and does detect other members after a couple of seconds (after each instance once lists only itself as a member if all are started at the same time), whereas myProgram gives the output <pre><code>joined via JoinConfig{multicastConfig=MulticastConfig [enabled=true, multicastGroup=224.2.2.3, multicastPort=54327, multica\ stTimeToLive=32, multicastTimeoutSeconds=2, trustedInterfaces=[]], tcpIpConfig=TcpIpConfig [enabled=false, connectionTimeoutSecond\ s=5, members=[], requiredMember=null], awsConfig=AwsConfig{enabled=false, region='us-east-1', securityGroupName='null', tagKey='nu\ ll', tagValue='null', hostHeader='ec2.amazonaws.com', connectionTimeoutSeconds=5}} with 1 members. </code></pre> and does not detect members within its runtime of about 1 minute (I am counting the members about every 5 seconds). BUT if at least one instance of TestApp runs concurrently on the cluster, all TestApp instances and all myProgram instances are detected and my program works fine. In case I start TestApp once and then myProgram twice in parallel, TestApp gives the following output: <pre><code>java -cp ~/CaseStudy/jtorx-1.10.0-beta8/lib/hazelcast-3.1.2.jar:. TestApp Dec 12, 2013 12:02:15 PM com.hazelcast.instance.DefaultAddressPicker INFO: Prefer IPv4 stack is true. Dec 12, 2013 12:02:15 PM com.hazelcast.instance.DefaultAddressPicker INFO: Picked Address[192.168.180.240]:5701, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5701], bind any local is true Dec 12, 2013 12:02:15 PM com.hazelcast.system INFO: [192.168.180.240]:5701 [dev] Hazelcast Community Edition 3.1.2 (20131120) starting at Address[192.168.180.240]:5701 Dec 12, 2013 12:02:15 PM com.hazelcast.system INFO: [192.168.180.240]:5701 [dev] Copyright (C) 2008-2013 Hazelcast.com Dec 12, 2013 12:02:15 PM com.hazelcast.instance.Node INFO: [192.168.180.240]:5701 [dev] Creating MulticastJoiner Dec 12, 2013 12:02:15 PM com.hazelcast.core.LifecycleService INFO: [192.168.180.240]:5701 [dev] Address[192.168.180.240]:5701 is STARTING Dec 12, 2013 12:02:21 PM com.hazelcast.cluster.MulticastJoiner INFO: [192.168.180.240]:5701 [dev] Members [1] { Member [192.168.180.240]:5701 this } Dec 12, 2013 12:02:22 PM com.hazelcast.core.LifecycleService INFO: [192.168.180.240]:5701 [dev] Address[192.168.180.240]:5701 is STARTED Dec 12, 2013 12:02:22 PM com.hazelcast.management.ManagementCenterService INFO: [192.168.180.240]:5701 [dev] Hazelcast will connect to Management Center on address: http://localhost:8080/mancenter-3.1.2/ Join: JoinConfig{multicastConfig=MulticastConfig [enabled=true, multicastGroup=224.2.2.3, multicastPort=54327, multicastTimeToLive=32, multicastTimeoutSeconds=2, trustedInterfaces=[]], tcpIpConfig=TcpIpConfig [enabled=false, connectionTimeoutSeconds=5, members=[], requiredMember=null], awsConfig=AwsConfig{enabled=false, region='us-east-1', securityGroupName='null', tagKey='null', tagValue='null', hostHeader='ec2.amazonaws.com', connectionTimeoutSeconds=5}} Dec 12, 2013 12:02:22 PM com.hazelcast.partition.PartitionService INFO: [192.168.180.240]:5701 [dev] Initializing cluster partition table first arrangement... hazelcast[default] > Dec 12, 2013 12:03:27 PM com.hazelcast.nio.SocketAcceptor INFO: [192.168.180.240]:5701 [dev] Accepting socket connection from /192.168.0.8:38764 Dec 12, 2013 12:03:27 PM com.hazelcast.nio.TcpIpConnectionManager INFO: [192.168.180.240]:5701 [dev] 5701 accepted socket connection from /192.168.0.8:38764 Dec 12, 2013 12:03:27 PM com.hazelcast.nio.SocketAcceptor INFO: [192.168.180.240]:5701 [dev] Accepting socket connection from /192.168.0.7:54436 Dec 12, 2013 12:03:27 PM com.hazelcast.nio.TcpIpConnectionManager INFO: [192.168.180.240]:5701 [dev] 5701 accepted socket connection from /192.168.0.7:54436 Dec 12, 2013 12:03:32 PM com.hazelcast.partition.PartitionService INFO: [192.168.180.240]:5701 [dev] Re-partitioning cluster data... Migration queue size: 181 Dec 12, 2013 12:03:32 PM com.hazelcast.cluster.ClusterService INFO: [192.168.180.240]:5701 [dev] Members [3] { Member [192.168.180.240]:5701 this Member [192.168.0.8]:5701 Member [192.168.0.7]:5701 } Dec 12, 2013 12:03:43 PM com.hazelcast.partition.PartitionService INFO: [192.168.180.240]:5701 [dev] Re-partitioning cluster data... Migration queue size: 181 Dec 12, 2013 12:03:45 PM com.hazelcast.partition.PartitionService INFO: [192.168.180.240]:5701 [dev] All migration tasks has been completed, queues are empty. Dec 12, 2013 12:03:46 PM com.hazelcast.nio.TcpIpConnection INFO: [192.168.180.240]:5701 [dev] Connection [Address[192.168.0.8]:5701] lost. Reason: Socket explicitly closed Dec 12, 2013 12:03:46 PM com.hazelcast.cluster.ClusterService INFO: [192.168.180.240]:5701 [dev] Removing Member [192.168.0.8]:5701 Dec 12, 2013 12:03:46 PM com.hazelcast.cluster.ClusterService INFO: [192.168.180.240]:5701 [dev] Members [2] { Member [192.168.180.240]:5701 this Member [192.168.0.7]:5701 } Dec 12, 2013 12:03:48 PM com.hazelcast.partition.PartitionService INFO: [192.168.180.240]:5701 [dev] Partition balance is ok, no need to re-partition cluster data... Dec 12, 2013 12:03:48 PM com.hazelcast.nio.TcpIpConnection INFO: [192.168.180.240]:5701 [dev] Connection [Address[192.168.0.7]:5701] lost. Reason: Socket explicitly closed Dec 12, 2013 12:03:48 PM com.hazelcast.cluster.ClusterService INFO: [192.168.180.240]:5701 [dev] Removing Member [192.168.0.7]:5701 Dec 12, 2013 12:03:48 PM com.hazelcast.cluster.ClusterService INFO: [192.168.180.240]:5701 [dev] Members [1] { Member [192.168.180.240]:5701 this } Dec 12, 2013 12:03:48 PM com.hazelcast.partition.PartitionService INFO: [192.168.180.240]:5701 [dev] Partition balance is ok, no need to re-partition cluster data... </code></pre> The only difference I see in TestApp's configuration is <pre><code>config.getManagementCenterConfig().setEnabled(true); config.getManagementCenterConfig().setUrl("http://localhost:8080/mancenter-"+version); for(int k=1;k<= LOAD_EXECUTORS_COUNT;k++){ config.addExecutorConfig(new ExecutorConfig("e"+k).setPoolSize(k)); } </code></pre> so I added it in a desperate attempt into myProgram, too. But it does not solve the problem - still each instance only detects itself as member during the whole run. <hr> <h3>Update about how long myProgram runs</h3> Could it be that the program is not running long enough (as pveentjer put it)? My experiments seem to confirm this: If the time t between <code>Hazelcast.newHazelcastInstance(cfg);</code> and initializing <code>cleanUp()</code> (i.e. no longer communicating via hazelcast and no longer checking the number of members) is <ul> <li>less than 30 seconds, no communication and <code>members: 1</code></li> <li>more than 30 seconds: all members are found and communication happens (which weirdly seems to be happening for much longer than t - 30 seconds). </li> </ul> Is 30 seconds a realistic time span that a hazelcast cluster needs, or is there something strange going on? Here is a log from 4 myPrograms running concurrently (looking for hazelcast-members overlaps 30 seconds for instance 1 and instance 3): <pre><code>instance 1: 2013-12-19T12:39:16.553+0100 LOG 0 (START) engine started looking for members between 2013-12-19T12:39:21.973+0100 and 2013-12-19T12:40:27.863+0100 2013-12-19T12:40:28.205+0100 LOG 35 (Torx-Explorer) Model SymToSim is about to\ exit instance 2: 2013-12-19T12:39:16.592+0100 LOG 0 (START) engine started looking for members between 2013-12-19T12:39:22.192+0100 and 2013-12-19T12:39:28.429+0100 2013-12-19T12:39:28.711+0100 LOG 52 (Torx-Explorer) Model SymToSim is about to\ exit instance 3: 2013-12-19T12:39:16.593+0100 LOG 0 (START) engine started looking for members between 2013-12-19T12:39:22.145+0100 and 2013-12-19T12:39:52.425+0100 2013-12-19T12:39:52.639+0100 LOG 54 (Torx-Explorer) Model SymToSim is about to\ exit INSTANCE 4: 2013-12-19T12:39:16.885+0100 LOG 0 (START) engine started looking for members between 2013-12-19T12:39:21.478+0100 and 2013-12-19T12:39:35.980+0100 2013-12-19T12:39:36.024+0100 LOG 34 (Torx-Explorer) Model SymToSim is about to\ exit </code></pre> How do I best start my actual distributed algorithm only after enough members are present in the hazelcast cluster? Can I set <code>hazelcast.initial.min.cluster.size</code> programmatically? <a href="https://groups.google.com/forum/#!topic/hazelcast/sa-lmpEDa6A">https://groups.google.com/forum/#!topic/hazelcast/sa-lmpEDa6A</a> sounds like this would block <code>Hazelcast.newHazelcastInstance(cfg);</code> until the initial.min.cluster.size is reached. Correct? How synchronously (within which time span) will the different instances unblock?
Tags
<java><configuration><multicast><hazelcast>
Title
How do you programmatically configure hazelcast for the multicast discovery mechanism?
singulars
PostAcceptedAnswerId
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
PostParentId
1. This table or related slice is empty.
PostTypePostTypeId
1. PTQuestion
UserLastEditorUserId
1. USFrancois Bourgeois
UserOwnerUserId
1. USDaveFar
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. PL
 singulars
 LinkTypeLinkTypeId
 LTLinked
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
2. PO
 singulars
 PostTypePostTypeId
 PTAnswer
3. PO
 singulars
 PostTypePostTypeId
 PTAnswer
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 POHow do you programmatically configure hazelcast for the multicast discovery mechanism?
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
2. VO
 singulars
 PostPostId
 POHow do you programmatically configure hazelcast for the multicast discovery mechanism?
 UserUserId
 USDaveFar
 VoteTypeVoteTypeId
 VTBountyStart
3. VO
 singulars
 PostPostId
 POHow do you programmatically configure hazelcast for the multicast discovery mechanism?
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTDownMod
CommentsPostId

Querying!

Guidance

A row detail

Detail views are divided into sections. All the information in the data section comes from columns in the selected row. The other sections display data from other, related rows.

Related data can be related in a to-one or a to-many fashion. Captions of data related in a to-many fashion link to a list view showing a filtered view of the table.

Try moving around until you find a non-empty to-many entry and click on the label to get to one. You can move back to the root by clicking on the database name in the header.