<p>Some thoughts, based on my experience so far with a similar experiment (worked through in a spike during a sprint):</p>
<ul>
<li>In my experience (I could be wrong), you don't spin up more bolts as demand increases. Topologies are scaled by <em>increasing the parallelism</em> of whichever bolt is the bottleneck, not by adding more bolts. Take the word count example:</li>
</ul>
<blockquote>
<pre><code>builder.setBolt(4, new MyBolt(), 12)
       .shuffleGrouping(1)
       .shuffleGrouping(2)
       .fieldsGrouping(3, new Fields("id1", "id2"));
</code></pre>
</blockquote>
<p>The last parameter (the "12") is the parallelism of that bolt. If the bolt is a bottleneck in the topology and you need to scale up to meet demand, you increase this value. A parallelism of 12 means that 12 threads execute the bolt in parallel across the Storm cluster.</p>
<ul>
<li>As of 0.8.0 you can use "executors", which also allow adjustments "on the fly" to scale a bolt (or spout) up or down. Example:</li>
</ul>
<blockquote>
<pre><code>builder.setBolt("myBolt", new MyBolt(), 3)
       .setNumTasks(64)
       .shuffleGrouping("someSpout");
</code></pre>
</blockquote>
<p>Here, the number of executors (threads) for <code>MyBolt()</code> is 3, and you can change that number dynamically without redeploying the topology. The task count (64) stays fixed, so it caps how far the executor count can be raised. <code>storm rebalance</code> is used for this:</p>
<pre><code>$ storm rebalance someTopology -n 6 -e mySpout=4 -e myBolt=6
</code></pre>
<p>This changes the number of workers for the "someTopology" topology to 6, the number of executors/threads for mySpout to 4, and the number of executors/threads for myBolt to 6.</p>
<ul>
<li>It sounds like your Storm topology would process the streaming data. Data that requires batch processing would be handled by a job kicked off after the data has been persisted to whatever datastore (HDFS, for example) you are using. In that case, you would add a bolt that persists whatever data is needed to the datastore.</li>
<li>If, on the other hand, you want to do some sort of incremental, stateful processing on top of a datastore you already have, use Trident (<a href="https://github.com/nathanmarz/storm/wiki/Trident-tutorial" rel="nofollow">https://github.com/nathanmarz/storm/wiki/Trident-tutorial</a>). Trident might actually answer a lot of the questions you have.</li>
</ul>
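<p>To make the relationship between tasks and executors more concrete, here is a minimal sketch in plain Java. It is not Storm code: the class <code>ExecutorModel</code> and the method <code>tasksPerExecutor</code> are hypothetical names, and the "as even as possible" split is an assumption made for illustration. It only shows how a fixed number of tasks (64) could be spread over 3 executors and then over 6 after a rebalance.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; none of these names come from the Storm API.
public class ExecutorModel {

    /**
     * Splits a fixed number of tasks across executors as evenly as possible.
     * Assumption: an executor runs at least one task, so the executor count
     * cannot exceed the task count.
     */
    static List<Integer> tasksPerExecutor(int numTasks, int numExecutors) {
        if (numExecutors < 1 || numExecutors > numTasks) {
            throw new IllegalArgumentException(
                "executor count must be between 1 and the task count");
        }
        List<Integer> counts = new ArrayList<>();
        int base = numTasks / numExecutors;   // tasks every executor gets
        int extra = numTasks % numExecutors;  // leftover tasks, one each to the first executors
        for (int i = 0; i < numExecutors; i++) {
            counts.add(base + (i < extra ? 1 : 0));
        }
        return counts;
    }

    public static void main(String[] args) {
        // Initial deployment: 64 tasks on 3 executors.
        System.out.println(tasksPerExecutor(64, 3));
        // After a rebalance to 6 executors the same 64 tasks are redistributed.
        System.out.println(tasksPerExecutor(64, 6));
    }
}
```

<p>In this model the task count never changes; only the number of threads sharing those tasks does, which is why choosing a generous task count up front leaves room to scale the executor count later.</p>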