Transform a data set to have one row per time interval
<p><em>Example</em></p>
<p>Here is some data for the individual with <code>id = 1</code>:</p>
<pre><code>id  time  status
--------------
1   t     status
</code></pre>
<p><code>t</code> is the time to some event, and <code>status</code> is either <code>1</code> if the event occurred or <code>0</code> if it did not occur (in which case <code>t</code> is the duration of the study).</p>
<p>Say that <code>t</code> lies between <code>a2</code> and <code>a3</code>.</p>
<p>My goal is to transform my data into the following:</p>
<pre><code>id  period  start  stop  status
---------------------------
1   1       0      a1    0
1   2       a1     a2    0
1   3       a2     t     status
</code></pre>
<p>The total time of individual 1 is divided into three intervals, with no event in <code>(0, a1)</code> and <code>(a1, a2)</code>.</p>
<p><em>Question</em></p>
<p>Can you think of an efficient way to write an R function that takes a data set and a vector <code>a=(a1, a2, ..., aK)</code> as input and returns the transformed data set?</p>
<hr>
<p><strong>EDIT</strong></p>
<p><strong>Part 1</strong> I have been asked for a concrete example. Here is one:</p>
<pre><code>id  time  status
--------------
1   5     1
</code></pre>
<p>and <code>a1=1</code>, <code>a2=3</code>, <code>a3=7</code>.</p>
<p><strong>Part 2</strong> I have also been asked to show my attempt.
Here it is:</p>
<pre><code>&gt; data &lt;- data.frame(id=1, time=5, status=1)
&gt; a &lt;- c(1, 3, 7)
&gt; N &lt;- nrow(data)
&gt; data$period &lt;- ifelse(data$time &lt; a[1], 1,
+                ifelse(data$time &lt; a[2], 2,
+                ifelse(data$time &lt; a[3], 3, 4)))
&gt; 
&gt; 
&gt; dataTemp1 &lt;- data.frame(matrix(nrow=N, ncol=ncol(data)))
&gt; names(dataTemp1) &lt;- names(data)
&gt; dataTemp2 &lt;- data.frame(matrix(nrow=N, ncol=ncol(data)))
&gt; names(dataTemp2) &lt;- names(data)
&gt; dataTemp3 &lt;- data.frame(matrix(nrow=N, ncol=ncol(data)))
&gt; names(dataTemp3) &lt;- names(data)
&gt; dataTemp4 &lt;- data.frame(matrix(nrow=N, ncol=ncol(data)))
&gt; names(dataTemp4) &lt;- names(data)
&gt; 
&gt; for(j in 1:N)
+ {
+   if(data[j, "period"] == 1){
+     data[j, "start"] &lt;- 0
+     data[j, "stop"] &lt;- data[j, "time"]
+   } else if(data[j, "period"] == 2){
+     dataTemp1[j, c("id", "time", "period")] &lt;-
+       data[j, c("id", "time", "period")]
+     dataTemp1[j, "start"] &lt;- 0
+     dataTemp1[j, "stop"] &lt;- a[1]
+     dataTemp1[j, "status"] &lt;- 0
+ 
+     data[j, "start"] &lt;- a[1]
+     data[j, "stop"] &lt;- data[j, "time"]
+   } else if(data[j, "period"] == 3){
+     dataTemp1[j, c("id", "time", "period")] &lt;-
+       data[j, c("id", "time", "period")]
+     dataTemp1[j, "start"] &lt;- 0
+     dataTemp1[j, "stop"] &lt;- a[1]
+     dataTemp1[j, "status"] &lt;- 0
+ 
+     dataTemp2[j, c("id", "time", "period")] &lt;-
+       data[j, c("id", "time", "period")]
+     dataTemp2[j, "start"] &lt;- a[1]
+     dataTemp2[j, "stop"] &lt;- a[2]
+     dataTemp2[j, "status"] &lt;- 0
+ 
+     data[j, "start"] &lt;- a[2]
+     data[j, "stop"] &lt;- data[j, "time"]
+   } else if(data[j, "period"] == 4){
+     dataTemp1[j, c("id", "time", "period")] &lt;-
+       data[j, c("id", "time", "period")]
+     dataTemp1[j, "start"] &lt;- 0
+     dataTemp1[j, "stop"] &lt;- a[1]
+     dataTemp1[j, "status"] &lt;- 0
+ 
+     dataTemp2[j, c("id", "time", "period")] &lt;-
+       data[j, c("id", "time", "period")]
+     dataTemp2[j, "start"] &lt;- a[1]
+     dataTemp2[j, "stop"] &lt;- a[2]
+     dataTemp2[j, "status"] &lt;- 0
+ 
+     dataTemp3[j, c("id", "time", "period")] &lt;-
+       data[j, c("id", "time", "period")]
+     dataTemp3[j, "start"] &lt;- a[2]
+     dataTemp3[j, "stop"] &lt;- a[3]
+     dataTemp3[j, "status"] &lt;- 0
+ 
+     data[j, "start"] &lt;- a[3]
+     data[j, "stop"] &lt;- data[j, "time"]
+   }
+ }
&gt; 
&gt; dataTemp1 &lt;- dataTemp1[complete.cases(dataTemp1), ]
&gt; dataTemp2 &lt;- dataTemp2[complete.cases(dataTemp2), ]
&gt; dataTemp3 &lt;- dataTemp3[complete.cases(dataTemp3), ]
&gt; dataTemp4 &lt;- dataTemp4[complete.cases(dataTemp4), ]
&gt; 
&gt; data &lt;- rbind(data, dataTemp1, dataTemp2, dataTemp3, dataTemp4)
&gt; data[, "period"] &lt;- ifelse(data[, "start"] == 0, 1,
+                     ifelse(data[, "start"] == a[1], 2,
+                     ifelse(data[, "start"] == a[2], 3,
+                     ifelse(data[, "start"] == a[3], 4,
+                     5))))
&gt; data &lt;- data[order(data$id, data$start),
+              c("id", "period", "start", "stop", "status")]
&gt; data
  id period start stop status
2  1      1     0    1      0
3  1      2     1    3      0
1  1      3     3    5      1
</code></pre>
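<p>For comparison, the same transformation can be sketched in vectorized base R, without hard-coding the number of cut points. This is only a minimal sketch under stated assumptions: the function name <code>split_by_interval</code> is hypothetical, the input is assumed to have the columns <code>id</code>, <code>time</code> and <code>status</code> with strictly positive times, and a time that falls exactly on a cut point is assumed to close the interval ending there (the attempt above instead moves it to the next period).</p>

```r
# Hypothetical helper: the name and argument layout are illustrative only.
# Assumes `data` has columns id, time, status and that all times are > 0.
split_by_interval <- function(data, a) {
  a <- sort(a)
  # Number of intervals per row: cut points strictly below `time`, plus one.
  # Assumption: a time equal to a cut point ends the interval closing there.
  n <- findInterval(data$time, a, left.open = TRUE) + 1L
  idx <- rep(seq_len(nrow(data)), times = n)  # source row for each output row
  period <- sequence(n)                       # 1..n[i] within each source row
  start <- c(0, a)[period]
  # Inf pads the open-ended last interval; pmin then caps it at the observed time.
  stop <- pmin(c(a, Inf)[period], data$time[idx])
  # Only the last interval of each individual carries the original status.
  status <- ifelse(period == n[idx], data$status[idx], 0)
  data.frame(id = data$id[idx], period = period,
             start = start, stop = stop, status = status)
}

example <- data.frame(id = 1, time = 5, status = 1)
result <- split_by_interval(example, a = c(1, 3, 7))
print(result)
```

<p>On the concrete example from Part 1 this yields the same three rows as the attempt above (intervals 0 to 1, 1 to 3 and 3 to 5, with the event in the last one). Because the work is done with <code>rep</code>, <code>sequence</code> and indexing rather than a row-by-row loop, it should scale to many individuals and to any number of cut points.</p>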