  1. How to plot a quantile band (in R)
    <p>I have a CSV file that contains one line for each (Java GC) event I am interested in. Each record consists of a sub-second timestamp (not equidistant) and some variables. I load it like this:</p>
<pre><code>gcdata &lt;- read.table("http://bernd.eckenfels.net/view/gc1001.ygc.csv", header=TRUE, sep=",", dec=".")
start = as.POSIXct(strptime("2012-01-01 00:00:00", format="%Y-%m-%d %H:%M:%S"))
gcdata.date = gcdata$Timestamp + start
gcdata = gcdata[,2:7] # remove old date col
gcdata = data.frame(date=gcdata.date, gcdata)
str(gcdata)
</code></pre>
<p>This results in:</p>
<pre><code>'data.frame': 2997 obs. of 7 variables:
 $ date           : POSIXct, format: "2012-01-01 00:00:06" "2012-01-01 00:00:06" "2012-01-01 00:00:18" ...
 $ Distance.s.    : num 0 0.165 11.289 9.029 11.161 ...
 $ YGUsedBefore.K.: int 1610619 20140726 20148325 20213304 20310849 20404772 20561918 21115577 21479211 21544930 ...
 $ YGUsedAfter.K. : int 7990 15589 80568 178113 272036 429182 982841 1346475 1412181 1355412 ...
 $ Promoted.K.    : int 0 0 0 0 8226 937 65429 71166 62548 143638 ...
 $ YGCapacity.K.  : int 22649280 22649280 22649280 22649280 22649280 22649280 22649280 22649280 22649280 22649280 ...
 $ Pause.s.       : num 0.0379 0.022 0.0287 0.0509 0.109 ...
</code></pre>
<p>In this case I care about the pause time (in seconds). I want to plot a diagram that shows, for each (wall clock) hour, the mean as a line, the range between the 2% and 98% quantiles as a grey corridor, and the maximum value (within each hour) as a red line.</p>
<p>I have done some work, but the q02/q98 helper functions are ugly, repeating a separate lines() statement for every series seems wasteful, and I don't know how to draw a grey area between q02 and q98:</p>
<pre><code>q02 &lt;- function(x, ...) { x &lt;- quantile(x,probs=c(0.02)) }
q98 &lt;- function(x, ...) { x &lt;- quantile(x,probs=c(0.98)) }
hours = droplevels(cut(gcdata$date, breaks="hours")) # can I have 2 hours?
plot(aggregate(gcdata$Pause.s. ~ hours, data=gcdata, FUN=max),ylim=c(0,2), col="red", ylab="Pause(s)", xlab="Days") # Is always black?
lines(aggregate(gcdata$Pause.s. ~ hours, data=gcdata, FUN=q98),ylim=c(0,2), col="green")
lines(aggregate(gcdata$Pause.s. ~ hours, data=gcdata, FUN=q02),ylim=c(0,2), col="green")
lines(aggregate(gcdata$Pause.s. ~ hours, data=gcdata, FUN=mean),ylim=c(0,2), col="blue")
</code></pre>
<p>This results in a chart with black dots for the maximum, a blue line for the hourly average, and green lines for the lower (q02) and upper (q98) quantiles. I think it would be more readable with a grey corridor, maybe a dashed red line for the maximum, and fixed axis labels. <img src="https://i.stack.imgur.com/hxixI.png" alt="Exported Chart (png)"> Any suggestions? (The file is available at the URL above.)</p>
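<p>One possible way to get the grey corridor in base R is sketched below. It is not from the original post: the data are synthetic stand-ins for <code>gcdata</code> (only the <code>date</code> and <code>Pause.s.</code> columns are imitated), the helper name <code>pause_stats</code> is hypothetical, and hourly binning is done with <code>trunc()</code> rather than <code>cut()</code> so that the x axis stays a real time axis. The idea is to compute all four statistics in one <code>aggregate()</code> call and draw the band with <code>polygon()</code>.</p>

```r
# Synthetic stand-in for gcdata (assumption: same 'date' and 'Pause.s.'
# columns as in the str() output of the question).
set.seed(1)
n <- 3000
start <- as.POSIXct("2012-01-01 00:00:00", tz = "UTC")
gcdata <- data.frame(
  date     = start + sort(runif(n, 0, 48 * 3600)),
  Pause.s. = rexp(n, rate = 20)
)

# Hypothetical helper: returns all four statistics in one pass, so a
# single aggregate() call replaces the four separate ones.
pause_stats <- function(x) {
  c(q02  = unname(quantile(x, 0.02)),
    mean = mean(x),
    q98  = unname(quantile(x, 0.98)),
    max  = max(x))
}

# Truncate each timestamp to its wall-clock hour (kept as POSIXct).
gcdata$hour <- as.POSIXct(trunc(gcdata$date, units = "hours"))
agg <- aggregate(Pause.s. ~ hour, data = gcdata, FUN = pause_stats)
s <- agg$Pause.s.   # matrix with columns q02, mean, q98, max

# Empty frame first, then band, then lines drawn on top of the band.
plot(agg$hour, s[, "max"], type = "n",
     ylim = c(0, max(s[, "max"])), xlab = "Time", ylab = "Pause (s)")
xs <- as.numeric(agg$hour)
polygon(c(xs, rev(xs)), c(s[, "q02"], rev(s[, "q98"])),
        col = "grey80", border = NA)              # grey q02..q98 corridor
lines(agg$hour, s[, "mean"], col = "blue")        # hourly mean
lines(agg$hour, s[, "max"], col = "red", lty = "dashed")  # hourly maximum
```

<p>The polygon walks along the lower quantile from left to right and back along the upper quantile, which is why the second half of each coordinate vector is reversed. If <code>cut()</code> is preferred for binning, its documentation allows interval strings such as <code>breaks = "2 hours"</code>, which would give two-hour bins.</p>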