StackOverflow2013

<p>I have developed a system in R for graphing large datasets obtained from wind turbines. I am now porting the process to Java, and the results from the two systems are inconsistent.</p>
<p>As shown below:</p>
<ul>
<li>The dataset is plotted first using R and then using JFreeChart.</li>
<li>The red line in each graph corresponds to my calculation in that language (both are detailed below).</li>
<li>The brown dashed line in #1 corresponds to the blue line in #2. There are no discrepancies between these two; they are provided for reference.</li>
<li>The shaded area represents the data points, grey in #1 and red in #2. <img src="https://i.stack.imgur.com/7UQgY.png" alt="Dataset graphed using R"> <img src="https://i.stack.imgur.com/XR5Uc.png" alt="Dataset graphed using JFreeCharts"></li>
</ul>
<p>I can explain the discrepancies between the (red) calculated lines: they arise because I am using a different calculation method in each language.</p>
<p>In R the data is processed as follows. I wrote this code <a href="https://stackoverflow.com/questions/4843194/r-language-sorting-data-into-ranges-averaging-ignore-outliers/4844566#4844566">with a little help</a> and have no idea what is going on here (but hey, it works).</p>
<pre><code>df <- data.frame(pwr = pwr, spd = spd)
require(mgcv)
mod <- gam(pwr ~ s(spd, bs = "ad", k = 20), data = df, method = "REML")
summary(mod)
x_grid <- with(df, data.frame(spd = seq(min(spd) + 0.0001, maxi, length=100)))
pred <- predict(mod, x_grid, se.fit = TRUE)
x_grid <- within(x_grid, fit <- pred$fit)
lines(fit ~ spd, data = x_grid, col = "red", lwd = thickLineWidth)
</code></pre>
<p>In Java (in fact, in SQL) I am using the method of bins to calculate the average of the y values at every 0.5 on the x-axis. The resulting data is plotted using an <code>org.jfree.chart.renderer.xy.XYSplineRenderer</code>. I do not know much about how the line is rendered.</p>
<pre><code>SELECT
    # See https://stackoverflow.com/questions/5230647/mysql-rounding-functions
    ROUND( ROUND( x_data * 2 ) / 2, 1) AS x_axis,
    AVG( y_data ) AS y_axis
FROM table
GROUP BY x_axis
</code></pre>
<p>My take on the differences between the two graphs:</p>
<ul>
<li>The presence of a single outlier at 18 on the x-axis (most visible on the R graph) seems to have an enormous impact on the shape of the curve.</li>
<li>Even between 5 and 15 on the x-axis, the line in the R graph is smoother; it does not change trajectory as readily as the one produced by Java.</li>
<li>The 'crater' evident at 18 on the x-axis of the Java graph has two 'mounds', one on each side of it. I believe this is due to polynomial effects in the rendering system.</li>
</ul>
<p>These are things that I would like to eliminate.</p>
<p>So, in an effort to understand the difference between the two graphs, I have a few questions:</p>
<ul>
<li>Exactly what is going on in my R script?</li>
<li>Can I port the same process to my Java code, and do I even want to?</li>
<li>Can anyone explain the spline system used by JFreeChart? Is there an alternative?</li>
</ul>
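<p>For reference, the binning done by the SQL query can be sketched in plain Java as follows. This is only a minimal sketch under stated assumptions: the names <code>BinAverager</code>, <code>binAverages</code> and <code>binWidth</code> are hypothetical, the data is assumed to already be in two parallel arrays, and <code>Math.round</code> may break exact ties differently from MySQL's <code>ROUND</code>, so points lying exactly on a bin boundary could land in a neighbouring bin.</p>

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not part of JFreeChart or the original project):
// averages y values in bins centred on multiples of binWidth.
final class BinAverager {

    /** Running sum and count for a single bin. */
    private static final class Acc {
        double sum;
        long count;
    }

    /**
     * Returns the mean y per bin, keyed by bin centre, in ascending x order.
     * With binWidth = 0.5 this mirrors ROUND(ROUND(x * 2) / 2, 1) ... GROUP BY.
     */
    static TreeMap<Double, Double> binAverages(double[] x, double[] y, double binWidth) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("x and y must have the same length");
        }
        if (!(binWidth > 0)) {
            throw new IllegalArgumentException("binWidth must be positive");
        }

        // Key bins by an integer index to avoid grouping on inexact doubles.
        TreeMap<Long, Acc> bins = new TreeMap<>();
        for (int i = 0; i < x.length; i++) {
            long index = Math.round(x[i] / binWidth); // nearest bin centre
            Acc acc = bins.computeIfAbsent(index, k -> new Acc());
            acc.sum += y[i];
            acc.count++;
        }

        // Convert each bin index back to its centre and compute the mean.
        TreeMap<Double, Double> averages = new TreeMap<>();
        for (Map.Entry<Long, Acc> e : bins.entrySet()) {
            Acc acc = e.getValue();
            averages.put(e.getKey() * binWidth, acc.sum / acc.count);
        }
        return averages;
    }
}
```

<p>Called with <code>binWidth = 0.5</code>, each key is a bin centre (5.0, 5.5, and so on) and each value is the mean of the y values that fell into that bin. A bin that holds only one or two points, such as the one around the outlier at 18, gets the same weight in the plotted line as a densely populated bin, which may be part of what I am seeing.</p>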
