Calculate linear regression on a data set in MapReduce
<p>Say I have an input as follows:</p>
<pre><code>60,3.1
61,3.6
62,3.8
63,4
65,4.1
</code></pre>
<p>The output is expected as follows:</p>
<p>Expected output: y = -8.098 + 0.19x.</p>
<p>I know how to do this in Java, but I don't know how it works with the MapReduce model. Can anyone give an idea or sample MapReduce code for this problem? I would appreciate it.</p>
<p>Here is a simple worked example:</p>
<pre><code>Regression Formula:
Regression Equation(y) = a + bx
Slope(b) = (NΣXY - (ΣX)(ΣY)) / (NΣX² - (ΣX)²)
Intercept(a) = (ΣY - b(ΣX)) / N

where
  x and y are the variables.
  b   = The slope of the regression line
  a   = The intercept point of the regression line and the y axis.
  N   = Number of values or elements
  X   = First Score
  Y   = Second Score
  ΣXY = Sum of the product of First and Second Scores
  ΣX  = Sum of First Scores
  ΣY  = Sum of Second Scores
  ΣX² = Sum of squared First Scores
</code></pre>
<p>e.g.</p>
<pre><code>X Values    Y Values
60          3.1
61          3.6
62          3.8
63          4
65          4.1
</code></pre>
<p>To find the regression equation, we first find the slope and the intercept, and then use them to form the equation.</p>
<pre><code>Step 1: Count the number of values.
  N = 5

Step 2: Find XY and X². See the table below.

  X Value   Y Value   X*Y                 X*X
  60        3.1       60 * 3.1 = 186      60 * 60 = 3600
  61        3.6       61 * 3.6 = 219.6    61 * 61 = 3721
  62        3.8       62 * 3.8 = 235.6    62 * 62 = 3844
  63        4         63 * 4   = 252      63 * 63 = 3969
  65        4.1       65 * 4.1 = 266.5    65 * 65 = 4225

Step 3: Find ΣX, ΣY, ΣXY, ΣX².
  ΣX  = 311
  ΣY  = 18.6
  ΣXY = 1159.7
  ΣX² = 19359

Step 4: Substitute into the slope formula above.
  Slope(b) = (NΣXY - (ΣX)(ΣY)) / (NΣX² - (ΣX)²)
           = ((5)*(1159.7) - (311)*(18.6)) / ((5)*(19359) - (311)²)
           = (5798.5 - 5784.6) / (96795 - 96721)
           = 13.9 / 74
           = 0.19

Step 5: Substitute into the intercept formula above.
  Intercept(a) = (ΣY - b(ΣX)) / N
               = (18.6 - 0.19(311)) / 5
               = (18.6 - 59.09) / 5
               = -40.49 / 5
               = -8.098

Step 6: Substitute these values into the regression equation.
  Regression Equation(y) = a + bx
                         = -8.098 + 0.19x
</code></pre>
<p>Suppose we want to know the approximate y value for x = 64. We can substitute that value into the equation above.</p>
<pre><code>Regression Equation(y) = a + bx
                       = -8.098 + 0.19(64)
                       = -8.098 + 12.16
                       = 4.06
</code></pre>
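<p>The formulas above only need five running totals (N, ΣX, ΣY, ΣXY, ΣX²), and each of them is a plain sum over the records, so one way this might fit the MapReduce model is sketched below. It is a minimal single-process simulation in Python rather than real Hadoop code: the names <code>mapper</code>, <code>combine</code> and <code>fit</code> are hypothetical, and sending every record to one reducer key is an assumption. In a real job the mapper would emit the tuple under a constant key and the reducer (and optionally a combiner) would do the element-wise addition.</p>

```python
from functools import reduce


def mapper(line):
    """Map step: turn one 'x,y' record into partial sums (n, Σx, Σy, Σxy, Σx²)."""
    x_str, y_str = line.strip().split(",")
    x, y = float(x_str), float(y_str)
    return (1, x, y, x * y, x * x)


def combine(left, right):
    """Reduce step: add two partial-sum tuples element-wise.

    Addition is associative and commutative, so the same function
    could also serve as a combiner on each mapper's local output.
    """
    return tuple(p + q for p, q in zip(left, right))


def fit(sums):
    """Final step: apply the slope and intercept formulas to the totals."""
    n, sx, sy, sxy, sxx = sums
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)  # slope
    a = (sy - b * sx) / n                          # intercept
    return a, b


# The sample input from the question, one record per line.
records = ["60,3.1", "61,3.6", "62,3.8", "63,4", "65,4.1"]

# Simulate the job: map every record, then fold all partial sums together.
totals = reduce(combine, map(mapper, records))
a, b = fit(totals)
print(f"y = {a:.3f} + {b:.3f}x")
```

<p>This sketch keeps the slope unrounded, so it gives b ≈ 0.188 and a ≈ -7.96. The hand calculation above rounds b to 0.19 before computing the intercept, which is why it arrives at -8.098 instead.</p>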