
<p>If all you want to do is replace the column names with anonymous labels, and likewise for the levels of factors, then yes. First, some dummy data to use as the example:</p>
<pre><code>## levels = 1:3 keeps all three labels valid even if a value is never drawn
dat &lt;- data.frame(top_secret1 = rnorm(10),
                  top_secret2 = runif(10),
                  top_secret3 = factor(sample(3, 10, replace = TRUE),
                                       levels = 1:3,
                                       labels = paste("Person", 1:3, sep = "")))
</code></pre>
<p>To replace the column names do:</p>
<pre><code>dat2 &lt;- dat
colnames(dat2) &lt;- paste("Variable", seq_len(ncol(dat2)), sep = "")
</code></pre>
<p>Which gives</p>
<pre><code>&gt; head(dat2)
   Variable1 Variable2 Variable3
1 -0.4858656 0.4846700   Person3
2  0.2660125 0.1727989   Person1
3  0.1595297 0.6413984   Person2
4  1.1952239 0.1892749   Person3
5  0.3914285 0.6235119   Person2
6  0.3752015 0.7057372   Person3
</code></pre>
<p>Next, change the levels:</p>
<pre><code>foo &lt;- function(x) {
    if(is.factor(x)) {
        ## swap the real levels for randomly chosen letters
        levels(x) &lt;- sample(LETTERS, length(levels(x)))
    }
    x
}
dat3 &lt;- data.frame(lapply(dat2, foo))
</code></pre>
<p>which gives</p>
<pre><code>&gt; head(dat3)
   Variable1 Variable2 Variable3
1 -0.4858656 0.4846700         K
2  0.2660125 0.1727989         G
3  0.1595297 0.6413984         O
4  1.1952239 0.1892749         K
5  0.3914285 0.6235119         O
6  0.3752015 0.7057372         K
</code></pre>
<p><code>foo()</code> is a simple helper: given a vector, it checks whether the vector is a factor and, if so, replaces the levels with random letters (one per level), then returns the vector. Non-factor vectors are returned unchanged.</p>
<p>We can wrap this into a function that does all the changes requested:</p>
<pre><code>anonymise &lt;- function(df, colString = "Variable", rowString = "Sample") {
    foo &lt;- function(x) {
        if(is.factor(x)) {
            levels(x) &lt;- sample(LETTERS, length(levels(x)))
        }
        x
    }
    ## replace the variable names
    colnames(df) &lt;- paste(colString, seq_len(ncol(df)), sep = "")
    ## fudge any factor levels
    df &lt;- data.frame(lapply(df, foo))
    ## replace rownames
    rownames(df) &lt;- paste(rowString, seq_len(nrow(df)), sep = "")
    ## return
    df
}
</code></pre>
<p>In use this gives</p>
<pre><code>&gt; anonymise(dat)
           Variable1 Variable2 Variable3
Sample1  -0.48586557 0.4846700         F
Sample2   0.26601253 0.1727989         L
Sample3   0.15952973 0.6413984         N
Sample4   1.19522395 0.1892749         F
Sample5   0.39142851 0.6235119         N
Sample6   0.37520154 0.7057372         F
Sample7   1.18440762 0.7355211         F
Sample8   0.03605239 0.3924925         L
Sample9  -0.64078219 0.4579347         N
Sample10 -1.39680109 0.9047227         L
</code></pre>
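<p>One limitation of the letter-based approach above is that <code>sample(LETTERS, n)</code> can only supply 26 distinct labels, so it fails for a factor with more levels than that. A minimal sketch of one way around this, assuming numbered labels are acceptable, is shown below. The name <code>relabel_levels()</code> and the <code>prefix</code> argument are hypothetical and are not part of the answer above.</p>
<pre><code>## Hypothetical helper: relabel factor levels without relying on LETTERS,
## so that factors with more than 26 levels can also be handled.
relabel_levels &lt;- function(x, prefix = "Level") {
    ## leave non-factors untouched
    if(!is.factor(x)) return(x)
    n &lt;- nlevels(x)
    ## nothing to relabel for a factor with no levels
    if(n == 0) return(x)
    ## shuffle the numeric suffixes so the new labels do not
    ## reveal the original ordering of the levels
    levels(x) &lt;- paste(prefix, sample.int(n), sep = "")
    x
}

## a factor with 30 levels, more than LETTERS can label
f &lt;- factor(sprintf("Person%02d", 1:30))
g &lt;- relabel_levels(f)
nlevels(g)                              # same number of levels as f
identical(as.integer(f), as.integer(g)) # underlying codes are unchanged
</code></pre>
<p>It could be dropped into <code>anonymise()</code> in place of <code>foo()</code>; as with the original, only factor columns are affected, so character columns would still need converting to factors first.</p>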
 
