Remove highly correlated variables
<p>I have a huge data frame (5600 x 6592) and I want to remove any variables that are correlated with each other at more than 0.99. I know how to do this the long way, step by step: form a correlation matrix, round the values, remove the duplicated ones, and use indexing to get my "reduced" data back.</p>
<pre><code>cor(mydata)
mydata &lt;- round(mydata,2)
mydata &lt;- mydata[,!duplicated (mydata)]
## then do the indexing...
</code></pre>
<p>I would like to know whether this could be done in a short command, or with some more advanced function. I'm learning how to make use of the powerful tools in the R language, which avoid such long, unnecessary commands.</p>
<p>I was thinking of something like</p>
<pre><code>mydata &lt;- mydata[, which(apply(mydata, 2, function(x) !duplicated(round(cor(x),2))))]
</code></pre>
<p>Sorry, I know the above command doesn't work, but I hope I will be able to do something like this.</p>
<p>Some play data that applies to the question:</p>
<pre><code>mydata &lt;- structure(list(
  V1 = c(1L, 2L, 5L, 4L, 366L, 65L, 43L, 456L, 876L, 78L, 687L, 378L, 378L, 34L, 53L, 43L),
  V2 = c(2L, 2L, 5L, 4L, 366L, 65L, 43L, 456L, 876L, 78L, 687L, 378L, 378L, 34L, 53L, 41L),
  V3 = c(10L, 20L, 10L, 20L, 10L, 20L, 1L, 0L, 1L, 2010L, 20L, 10L, 10L, 10L, 10L, 10L),
  V4 = c(2L, 10L, 31L, 2L, 2L, 5L, 2L, 5L, 1L, 52L, 1L, 2L, 52L, 6L, 2L, 1L),
  V5 = c(4L, 10L, 31L, 2L, 2L, 5L, 2L, 5L, 1L, 52L, 1L, 2L, 52L, 6L, 2L, 3L)),
  .Names = c("V1", "V2", "V3", "V4", "V5"),
  class = "data.frame", row.names = c(NA, -16L))
</code></pre>
<p>Many thanks.</p>
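The step-by-step idea described in the question (correlation matrix, find near-duplicates, index the columns) can be folded into one small function. The following is a minimal base R sketch, not a definitive answer: `drop_correlated` is a hypothetical helper name, the 0.99 cutoff is taken from the question, and using the absolute correlation (so that strongly negative correlations also count) is an assumption. Of each highly correlated pair, the later column is the one dropped.

```r
# Play data from the question (values written as plain numerics).
mydata <- data.frame(
  V1 = c(1, 2, 5, 4, 366, 65, 43, 456, 876, 78, 687, 378, 378, 34, 53, 43),
  V2 = c(2, 2, 5, 4, 366, 65, 43, 456, 876, 78, 687, 378, 378, 34, 53, 41),
  V3 = c(10, 20, 10, 20, 10, 20, 1, 0, 1, 2010, 20, 10, 10, 10, 10, 10),
  V4 = c(2, 10, 31, 2, 2, 5, 2, 5, 1, 52, 1, 2, 52, 6, 2, 1),
  V5 = c(4, 10, 31, 2, 2, 5, 2, 5, 1, 52, 1, 2, 52, 6, 2, 3)
)

# Hypothetical helper: drop every column whose absolute correlation with
# an EARLIER column exceeds `cutoff`.
drop_correlated <- function(df, cutoff = 0.99) {
  cm <- abs(cor(df))
  # Zero the diagonal and lower triangle so that each pair is looked at
  # once and a column is only compared with the columns before it.
  cm[!upper.tri(cm)] <- 0
  # na.rm = TRUE keeps constant columns, whose correlations are NA.
  keep <- !apply(cm, 2, function(col) any(col > cutoff, na.rm = TRUE))
  df[, keep, drop = FALSE]
}

reduced <- drop_correlated(mydata, cutoff = 0.99)
names(reduced)
```

On the play data, V2 (almost identical to V1) and V5 (almost identical to V4) are the columns removed. If an add-on package is acceptable, `findCorrelation()` from the caret package is a packaged alternative: it takes a correlation matrix and a cutoff and returns the indices of the columns it suggests removing, though its choice of which column of a pair to drop may differ from this sketch.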