<p>For numerical data you already have the solution. But you said your data are categorical, and then life gets a bit more complicated.</p>

<p>First: the amount of association between two categorical variables is not measured with a Spearman rank correlation but with, for example, a chi-square test. That is logical when you think about it. Ranking means there is some order in your data. Now tell me which is larger, yellow or red? I know, R will sometimes compute a Spearman rank correlation on categorical data anyway. If I code yellow as 1 and red as 2, R will consider red larger than yellow.</p>

<p>So forget about Spearman for categorical data. I'll demonstrate the chi-square test (<code>chisq.test()</code>) and how to choose pairs of columns using <code>combn()</code>. But you would benefit from a bit more time with Agresti's book: <a href="http://rads.stackoverflow.com/amzn/click/0471360937" rel="noreferrer">http://www.amazon.com/Categorical-Analysis-Wiley-Probability-Statistics/dp/0471360937</a></p>

<pre><code>set.seed(1234)
X &lt;- rep(c("A","B"), 20)
Y &lt;- sample(c("C","D"), 40, replace=TRUE)

table(X, Y)
chisq.test(table(X, Y), correct=FALSE)  # I don't use Yates continuity correction

# Let's make a data frame with lots of columns
Data &lt;- as.data.frame(
          matrix(
            sample(letters[1:3], 2000, replace=TRUE),
            ncol=25
          )
        )

# You want to select which columns to use
columns &lt;- c(3, 7, 11, 24)
vars &lt;- names(Data)[columns]

# Say you need to know which ones are associated with each other:
# run a chi-square test on every pair of selected columns.
out &lt;- apply(combn(columns, 2), 2, function(x){
          chisq.test(table(Data[, x[1]], Data[, x[2]]), correct=FALSE)$p.value
        })

# Attach the variable names of each pair to its p-value
out &lt;- cbind(as.data.frame(t(combn(vars, 2))), out)
</code></pre>

<p>Then you should get something like this (the exact p-values depend on the random draw):</p>

<pre><code>&gt; out
   V1  V2       out
1  V3  V7 0.8116733
2  V3 V11 0.1096903
3  V3 V24 0.1653670
4  V7 V11 0.3629871
5  V7 V24 0.4947797
6 V11 V24 0.7259321
</code></pre>

<p>Here V1 and V2 indicate which pair of variables was tested, and "out" gives the p-value for the association. None of the p-values is small, so there is no evidence of association between any pair of variables. That is what you would expect, as I created the data at random.</p>
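<p>To make the point about coding yellow as 1 and red as 2 concrete, here is a minimal sketch. The variables <code>colour</code> and <code>shape</code> and both numeric codings are made up for illustration. It assumes nothing beyond base R: the Spearman correlation is computed on two equally arbitrary codings of the same categories and will generally give different values, while the chi-square test only looks at the table of counts and so does not care how the categories are labelled.</p>

<pre><code>set.seed(42)

# Hypothetical data: two unordered categorical variables
colour &lt;- sample(c("yellow", "red", "blue"), 60, replace=TRUE)
shape  &lt;- sample(c("circle", "square", "star"), 60, replace=TRUE)

# Two equally arbitrary numeric codings of the same colours
code_a &lt;- unname(c(yellow=1, red=2, blue=3)[colour])
code_b &lt;- unname(c(yellow=2, red=3, blue=1)[colour])
shape_code &lt;- unname(c(circle=1, square=2, star=3)[shape])

# Spearman depends on which arbitrary coding was chosen
rho_a &lt;- cor(code_a, shape_code, method="spearman")
rho_b &lt;- cor(code_b, shape_code, method="spearman")

# The chi-square test uses only the cell counts, so relabelling changes nothing
p_a &lt;- chisq.test(table(code_a, shape_code), correct=FALSE)$p.value
p_b &lt;- chisq.test(table(code_b, shape_code), correct=FALSE)$p.value

c(rho_a=rho_a, rho_b=rho_b)
c(p_a=p_a, p_b=p_b)
</code></pre>

<p>The two p-values agree because the second table is just the first with its rows reordered. The two correlations have no reason to agree, which is why a rank correlation is not meaningful for unordered categories.</p>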
 
