StackOverflow2013

Note that there are some explanatory texts on larger screens.

plurals

POMake nested loops more efficient?
text
Body
copied!<p>I'm analyzing large sets of data using the following script:</p> <pre><code>M <- c_alignment c_check <- function(x){ if (x == c_1) { 1 }else{ 0 } } both_c_check <- function(x){ if (x[res_1] == c_1 && x[res_2] == c_1) { 1 }else{ 0 } } variance_function <- function(x,y){ sqrt(x*(1-x))*sqrt(y*(1-y)) } frames_total <- nrow(M) cols <- ncol(M) c_vector <- apply(M, 2, max) freq_vector <- matrix(nrow = sum(c_vector)) co_freq_matrix <- matrix(nrow = sum(c_vector), ncol = sum(c_vector)) insertion <- 0 res_1_insertion <- 0 for (res_1 in 1:cols){ for (c_1 in 1:conf_vector[res_1]){ res_1_insertion <- res_1_insertion + 1 insertion <- insertion + 1 res_1_subset <- sapply(M[,res_1], c_check) freq_vector[insertion] <- sum(res_1_subset)/frames_total res_2_insertion <- 0 for (res_2 in 1:cols){ if (is.na(co_freq_matrix[res_1_insertion, res_2_insertion + 1])){ for (c_2 in 1:max(c_vector[res_2])){ res_2_insertion <- res_2_insertion + 1 both_res_subset <- apply(M, 1, both_c_check) co_freq_matrix[res_1_insertion, res_2_insertion] <- sum(both_res_subset)/frames_total co_freq_matrix[res_2_insertion, res_1_insertion] <- sum(both_res_subset)/frames_total } } } } } covariance_matrix <- (co_freq_matrix - crossprod(t(freq_vector))) variance_matrix <- matrix(outer(freq_vector, freq_vector, variance_function), ncol = length(freq_vector)) correlation_coefficient_matrix <- covariance_matrix/variance_matrix </code></pre> <p>A model input would be something like this:</p> <pre><code>1 2 1 4 3 1 3 4 2 1 2 3 3 3 1 1 1 2 1 2 2 3 4 4 2 </code></pre> <p>What I'm calculating is the binomial covariance for each state found in <code>M[,i]</code> with each state found in <code>M[,j]</code>. Each row is the state found for that trial, and I want to see how the state of the columns co-vary. </p> <p>Clarification: I'm finding the covariance of two multinomial distributions, but I'm doing it through binomial comparisons.</p> <p>The input is a 4200 x 510 matrix, and the c value for each column is about 15 on average. I know <code>for</code> loops are terribly slow in R, but I'm not sure how I can use the <code>apply</code> function here. If anyone has a suggestion as to properly using <code>apply</code> here, I'd really appreciate it. Right now the script takes several hours. Thanks!</p>

Querying!

Guidance

An individual column

Larger individual text columns get their own page to allow for proper reading.

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload