<p>In the R code, you are (well, I was when I showed the example) fitting an additive model to the power and speed data, where the relationship between the variables is determined from the data themselves. These models use splines to estimate the response function. In particular, here I used an adaptive smoother, with <code>k = 20</code> setting the complexity of the smoother being fitted. The more complex the smoother, the more wiggly the fitted function <em>can</em> be. An adaptive smoother is one where the degree of smoothness varies across the fitted function.</p>
<p>Why is this important? Well, in your data there are stretches where the response does not vary with the speed variable, and stretches where the response changes rapidly with a change in speed. We have an "allowance" of wiggliness to use up over the curve. With ordinary splines the wiggliness (or smoothness) is the same across the entire function. With an adaptive smooth we can spend more of our wiggliness allowance in the parts of the function where the response is changing most, and spend none of it in the parts where the response isn't changing.</p>
<p>Below I annotate the code to explain what is being done at each step:</p>
<pre><code>## here we create a data frame with the pwr and spd variables
df <- data.frame(pwr = pwr, spd = spd)

## we load the package containing the code to fit the additive model
library(mgcv)

## This is the model itself, saying pwr is modelled as a smooth function of spd
## and the smooth function of spd is generated using an adaptive smoother with
## an "allowance" of 20. This allowance is a starting point and the actual
## smoothness of the curve will be estimated as part of the model fitting,
## here using a REML criterion
mod <- gam(pwr ~ s(spd, bs = "ad", k = 20), data = df, method = "REML")

## This just summarises the model fit
summary(mod)

## In this line we are creating a new spd vector (in a data frame) that contains
## 100 equally spaced spd values over the entire range of the observed spd
x_grid <- with(df, data.frame(spd = seq(min(spd), max(spd), length.out = 100)))

## we will use those data to get predictions of the response pwr at each
## of the 100 values of spd we just created
## I did this so we had enough data to plot a nice smooth curve, but without
## having to predict for all the observed values of spd
pred <- predict(mod, x_grid, se.fit = TRUE)

## This line stores the 100 predicted values in the prediction data object
x_grid <- within(x_grid, fit <- pred$fit)

## This line draws the fitted smooth on to a plot of the data
## this assumes there is already a plot on the active device.
lines(fit ~ spd, data = x_grid, col = "red", lwd = 2)
</code></pre>
<p>If you are not familiar with additive models and smoothers/splines then I recommend Ruppert, Wand and Carroll (2003) <a href="http://rads.stackoverflow.com/amzn/click/0521785162" rel="nofollow">Semiparametric Regression</a>. Cambridge University Press.</p>
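<p>To see what the adaptive basis can buy you, here is a minimal sketch on simulated data. The logistic-shaped curve, the sample size and the noise level are assumptions of mine for illustration, not your data. It fits the same model once with <code>bs = "ad"</code> and once with an ordinary thin plate regression spline (<code>bs = "tp"</code>), overlays both fits, and adds an approximate 95% pointwise band built from the <code>se.fit</code> values that <code>predict()</code> returns.</p>

```r
## Minimal sketch on SIMULATED data (hypothetical; not the pwr/spd data above)
library(mgcv)

set.seed(1)
n   <- 400
spd <- sort(runif(n, 0, 25))
## flat, then a steep rise, then flat again (assumed shape, for illustration)
pwr <- 100 / (1 + exp(-1.5 * (spd - 10))) + rnorm(n, sd = 3)
df  <- data.frame(pwr = pwr, spd = spd)

## adaptive smoother: smoothness is allowed to vary along spd
mod_ad <- gam(pwr ~ s(spd, bs = "ad", k = 20), data = df, method = "REML")
## ordinary thin plate regression spline: one smoothness for the whole curve
mod_tp <- gam(pwr ~ s(spd, bs = "tp", k = 20), data = df, method = "REML")

## 100 equally spaced spd values over the observed range
x_grid  <- with(df, data.frame(spd = seq(min(spd), max(spd), length.out = 100)))
pred_ad <- predict(mod_ad, x_grid, se.fit = TRUE)
pred_tp <- predict(mod_tp, x_grid)

## data plus both fitted curves
plot(pwr ~ spd, data = df, col = "grey", pch = 16)
lines(x_grid$spd, pred_ad$fit, col = "red", lwd = 2)
lines(x_grid$spd, pred_tp, col = "blue", lwd = 2, lty = 2)

## approximate 95% pointwise interval: fit +/- 2 standard errors
lines(x_grid$spd, pred_ad$fit + 2 * pred_ad$se.fit, col = "red", lty = 3)
lines(x_grid$spd, pred_ad$fit - 2 * pred_ad$se.fit, col = "red", lty = 3)
```

<p>How much the two curves differ will depend on the data; the comparison is most informative where a flat stretch meets a sharp change. Calling <code>summary()</code> on each model lets you compare the effective degrees of freedom the two smoothers ended up using.</p>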