Counting number of events underway at a timestamp
    <p>I have a series of timestamps marking the beginning and end of certain events.</p>
<pre><code>library(chron)

start &lt;- structure(c(14246.3805439815, 14246.3902662037, 14246.3909606481,
                     14246.3992939815, 14246.4013773148, 14246.4034606481,
                     14246.4062384259, 14246.4069328704, 14246.4069328704,
                     14246.4097106481, 14246.4097106481, 14246.4104050926,
                     14246.4117939815, 14246.4117939815, 14246.4117939815,
                     14246.4145717593, 14246.4152546296, 14246.4152662037,
                     14246.4152662037, 14246.4159606481),
                   format = structure(c("m/d/y", "h:m:s"), .Names = c("dates", "times")),
                   origin = structure(c(1, 1, 1970), .Names = c("month", "day", "year")),
                   class = c("chron", "dates", "times"))

finish &lt;- structure(c(14246.436099537, 14246.4666550926, 14246.4083217593,
                      14246.4374884259, 14246.4847106481, 14246.4867939815,
                      14246.4305439815, 14246.4659606481, 14246.4520717593,
                      14246.9097106481, 14246.4930439815, 14246.4763773148,
                      14246.4326273148, 14246.4291550926, 14246.4187384259,
                      14246.9145717593, 14246.4395601852, 14246.4395717593,
                      14246.4395717593, 14246.4367939815),
                    format = structure(c("m/d/y", "h:m:s"), .Names = c("dates", "times")),
                    origin = structure(c(1, 1, 1970), .Names = c("month", "day", "year")),
                    class = c("chron", "dates", "times"))

events &lt;- data.frame(start, finish)
head(events, 5)
                start              finish
1 (01/02/09 09:07:59) (01/02/09 10:27:59)
2 (01/02/09 09:21:59) (01/02/09 11:11:59)
3 (01/02/09 09:22:59) (01/02/09 09:47:59)
4 (01/02/09 09:34:59) (01/02/09 10:29:59)
5 (01/02/09 09:37:59) (01/02/09 11:37:59)
</code></pre>
    <p>I now wish to count how many events are underway at specific timestamps.</p>
<pre><code>intervals &lt;- structure(c(14246.3958333333, 14246.40625, 14246.4166666667,
                         14246.4270833333, 14246.4375),
                       format = structure(c("m/d/y", "h:m:s"), .Names = c("dates", "times")),
                       origin = structure(c(1, 1, 1970), .Names = c("month", "day", "year")),
                       class = c("chron", "dates", "times"))
intervals
[1] (01/02/09 09:30:00) (01/02/09 09:45:00) (01/02/09 10:00:00) (01/02/09 10:15:00) (01/02/09 10:30:00)
</code></pre>
    <p>So the output I desire is as follows:</p>
<pre><code>            intervals count
1 (01/02/09 09:30:00)     3
2 (01/02/09 09:45:00)     7
3 (01/02/09 10:00:00)    19
4 (01/02/09 10:15:00)    18
5 (01/02/09 10:30:00)    12
</code></pre>
    <p>While the problem is trivial to solve programmatically, I wish to accomplish this for 210,000 intervals and over 1.2 million events. My current approach uses the <code>data.table</code> package and the <code>&amp;</code> operator to check whether each interval timestamp lies between the start and end time of each event.</p>
<pre><code>library(data.table)
events &lt;- data.table(events)
data.frame(intervals,
           count = sapply(1:5, function(i)
             sum(events[, start &lt;= intervals[i] &amp; intervals[i] &lt;= finish])))
</code></pre>
    <p>But considering the size of my data, this approach takes a very long time to run. Any advice on better alternatives to accomplish this in R?</p>
    <p>Cheers.</p>
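One direction that might be worth exploring is to avoid comparing every interval against every event. An event is underway at time t when its start is at or before t and its finish is at or after t, so the count is the number of starts at or before t minus the number of finishes strictly before t. The sketch below is only an illustration under that assumption (and assuming every event has start <= finish). The function name `count_underway` is hypothetical, and the code has not been benchmarked on the full data. It relies on base R's `findInterval()`, whose `left.open` argument needs R 3.3.0 or later.

```r
# Hypothetical helper: count events underway at each timestamp in `at`.
# Assumes start <= finish for every event; works on the numeric day values.
count_underway <- function(start, finish, at) {
  start  <- sort(as.numeric(start))   # findInterval() needs sorted vectors
  finish <- sort(as.numeric(finish))
  at     <- as.numeric(at)
  started  <- findInterval(at, start)                     # number of starts <= at
  finished <- findInterval(at, finish, left.open = TRUE)  # number of finishes < at
  started - finished
}

# Small made-up example: three events [1,3], [2,5], [4,6]
res <- count_underway(start = c(1, 2, 4), finish = c(3, 5, 6),
                      at = c(0, 2, 3, 4.5, 7))
print(res)
```

With the data from the question, this would presumably be called as `data.frame(intervals, count = count_underway(events$start, events$finish, intervals))`. Sorting each vector once and then looking up every timestamp by binary search should scale far better than one full scan of the events per interval.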