  1. Polynomial Regression nonsense Predictions
    <p>Suppose I want to fit a linear regression model with a degree-two (orthogonal) polynomial and then predict the response. Here is the code for the first model (m1):</p>
<pre><code># Simulated data: a quadratic trend plus standard normal noise
x=1:100
y=-2+3*x-5*x^2+rnorm(100)
# Fit with the orthogonal polynomial basis as a single term
m1=lm(y~poly(x,2))
# Predict at x values outside the training range
prd.1=predict(m1,newdata=data.frame(x=105:110))
</code></pre>
<p>Now let's fit the same model, but instead of using <code>poly(x,2)</code> as a single term, I will use its columns separately:</p>
<pre><code># Same basis, but each column is entered as its own term
m2=lm(y~poly(x,2)[,1]+poly(x,2)[,2])
prd.2=predict(m2,newdata=data.frame(x=105:110))
</code></pre>
<p>Let's look at the summaries of m1 and m2.</p>
<pre><code>&gt; summary(m1)

Call:
lm(formula = y ~ poly(x, 2))

Residuals:
     Min       1Q   Median       3Q      Max 
-2.50347 -0.48752 -0.07085  0.53624  2.96516 

Coefficients:
              Estimate Std. Error t value Pr(&gt;|t|)    
(Intercept) -1.677e+04  9.912e-02 -169168   &lt;2e-16 ***
poly(x, 2)1 -1.449e+05  9.912e-01 -146195   &lt;2e-16 ***
poly(x, 2)2 -3.726e+04  9.912e-01  -37588   &lt;2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.9912 on 97 degrees of freedom
Multiple R-squared:      1,	Adjusted R-squared:      1 
F-statistic: 1.139e+10 on 2 and 97 DF,  p-value: &lt; 2.2e-16

&gt; summary(m2)

Call:
lm(formula = y ~ poly(x, 2)[, 1] + poly(x, 2)[, 2])

Residuals:
     Min       1Q   Median       3Q      Max 
-2.50347 -0.48752 -0.07085  0.53624  2.96516 

Coefficients:
                  Estimate Std. Error t value Pr(&gt;|t|)    
(Intercept)     -1.677e+04  9.912e-02 -169168   &lt;2e-16 ***
poly(x, 2)[, 1] -1.449e+05  9.912e-01 -146195   &lt;2e-16 ***
poly(x, 2)[, 2] -3.726e+04  9.912e-01  -37588   &lt;2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.9912 on 97 degrees of freedom
Multiple R-squared:      1,	Adjusted R-squared:      1 
F-statistic: 1.139e+10 on 2 and 97 DF,  p-value: &lt; 2.2e-16
</code></pre>
<p>So the fitted models m1 and m2 are the same: the coefficients, standard errors, and residuals are identical. Now let's look at the predictions prd.1 and prd.2:</p>
<pre><code>&gt; prd.1
        1         2         3         4         5         6 
-54811.60 -55863.58 -56925.56 -57997.54 -59079.52 -60171.50 
&gt; prd.2
          1           2           3           4           5           6 
   49505.92    39256.72    16812.28   -17827.42   -64662.35  -123692.53 
</code></pre>
<p>Q1: Why is prd.2 so different from prd.1?</p>
<p>Q2: How can I obtain prd.1 using the model m2?</p>
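<p>One way to look at both questions, offered as a sketch rather than a definitive answer: when <code>poly(x,2)</code> is a whole term, R stores the fitted basis coefficients so that <code>predict</code> can evaluate the same basis at new points. The subsetted term <code>poly(x,2)[,1]</code> is a plain numeric vector, so that information is not kept, and the basis appears to be recomputed from the six new x values alone. The code below rebuilds the training basis at the new points with <code>predict()</code> on the <code>poly</code> object and applies m2's coefficients by hand. The names <code>new_x</code>, <code>basis</code>, <code>new_basis</code>, and <code>prd.2.fixed</code>, as well as the <code>set.seed(1)</code> call, are assumptions added for illustration and are not part of the original post.</p>

```r
set.seed(1)  # assumption: seed added only so the sketch is reproducible
x <- 1:100
y <- -2 + 3 * x - 5 * x^2 + rnorm(100)
new_x <- 105:110

m1 <- lm(y ~ poly(x, 2))
m2 <- lm(y ~ poly(x, 2)[, 1] + poly(x, 2)[, 2])
prd.1 <- predict(m1, newdata = data.frame(x = new_x))

# Orthogonal basis fitted on the training x. predict() on a "poly" object
# evaluates that same basis (same centering and scaling) at new points.
basis <- poly(x, 2)
new_basis <- predict(basis, newdata = new_x)

# m2 has the same coefficients as m1, so apply them to the training-based
# basis instead of letting predict(m2, ...) rebuild poly() on the new data.
prd.2.fixed <- drop(cbind(1, new_basis) %*% coef(m2))

# Compare with the predictions from m1
print(all.equal(unname(prd.1), unname(prd.2.fixed)))
```

<p>If the goal is only to get correct predictions, keeping <code>poly(x,2)</code> as a single term (as in m1) avoids the manual step entirely.</p>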
 
