<p>The cloud concept of map/reduce is very similar, but adapted to work in parallel. First, each data object is passed through a function that <code>map</code>s it to a new object (usually some sort of dictionary). Then a <code>reduce</code> function is called on pairs of the objects returned by <code>map</code> until only one is left. That object is the result of the map/reduce operation.</p>
<p>One important consideration is that, because of the parallelization, the <code>reduce</code> function must be able to take in objects from the <code>map</code> function as well as objects returned by prior <code>reduce</code> calls. This makes more sense when you think about how the work is split up: many machines each reduce their own data to a single object, and those objects are then reduced to a final output. Of course, this may happen in more than one layer if there is a lot of data.</p>
<p>Here's a simple example of how you might use the map/reduce framework to count words in a list:</p>
<pre><code># Named list1 rather than list so the built-in list type is not shadowed.
list1 = ['a', 'foo', 'bar', 'foobar', 'foo', 'a', 'bar', 'bar', 'bar', 'bar', 'foo']
list2 = ['b', 'foo', 'foo', 'b', 'a', 'bar']
</code></pre>
<p>The map function would look like this:</p>
<pre><code>def wordToDict(word):
    # Each word becomes a one-entry count dictionary.
    return {word: 1}
</code></pre>
<p>And the reduce function would look like this:</p>
<pre><code>def countReduce(d1, d2):
    # Merge two count dictionaries without modifying either input.
    out = d1.copy()
    for key in d2:
        if key in out:
            out[key] += d2[key]
        else:
            out[key] = d2[key]
    return out
</code></pre>
<p>Then you can map/reduce like this:</p>
<pre><code>from functools import reduce  # reduce is not a built-in in Python 3

reduce(countReduce, map(wordToDict, list1 + list2))
&gt;&gt;&gt; {'a': 3, 'foobar': 1, 'b': 2, 'bar': 6, 'foo': 5}
</code></pre>
<p>But you can also do it like this (which is what parallelization would do):</p>
<pre><code>reduce(countReduce, [reduce(countReduce, map(wordToDict, list1)),
                     reduce(countReduce, map(wordToDict, list2))])
&gt;&gt;&gt; {'a': 3, 'foobar': 1, 'b': 2, 'foo': 5, 'bar': 6}
</code></pre>
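<p>The "more than one layer" case can be sketched in a single process. This is only an illustration under stated assumptions, not how any particular framework schedules work: the helpers <code>chunked</code> and <code>layered_reduce</code> are hypothetical names, each chunk stands in for one machine's share of the data, and the input is assumed to be non-empty. The point it shows is that the same reduce function is applied first to map output and then to the results of earlier reduces.</p>

```python
from functools import reduce


def wordToDict(word):
    # Map step: each word becomes a one-entry count dictionary.
    return {word: 1}


def countReduce(d1, d2):
    # Reduce step: merge two count dictionaries without modifying either.
    out = d1.copy()
    for key in d2:
        out[key] = out.get(key, 0) + d2[key]
    return out


def chunked(items, size):
    # Hypothetical helper: split items into consecutive chunks of at most `size`.
    return [items[i:i + size] for i in range(0, len(items), size)]


def layered_reduce(words, chunk_size=3, fan_in=2):
    # Hypothetical helper; assumes words is non-empty, chunk_size >= 1, fan_in >= 2.
    # Layer 0: each simulated machine maps and reduces its own chunk.
    partials = [reduce(countReduce, map(wordToDict, chunk))
                for chunk in chunked(words, chunk_size)]
    # Later layers: reduce groups of earlier reduce results until one remains.
    while len(partials) > 1:
        partials = [reduce(countReduce, group)
                    for group in chunked(partials, fan_in)]
    return partials[0]


words = ['a', 'foo', 'bar', 'foobar', 'foo', 'a', 'bar', 'bar', 'bar', 'bar', 'foo',
         'b', 'foo', 'foo', 'b', 'a', 'bar']
result = layered_reduce(words)
flat = reduce(countReduce, map(wordToDict, words))
print(result == flat)
```

<p>The layered result equals the single-pass result because merging counts is associative, so the way the partial dictionaries are grouped does not change the totals.</p>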
 
