Note that there are some explanatory texts on larger screens.

plurals
  1. POIs it possible to shuffle a 2D matrix while preserving row AND column frequencies?
    text
    copied!<p>Suppose I have a 2D array like the following:</p> <pre><code>GACTG AGATA TCCGA </code></pre> <p>Each array element is taken from a small finite set (in my case, DNA nucleotides -- <code>{A, C, G, T}</code>). I would like to randomly shuffle this array somehow while preserving both row <em>and</em> column nucleotide frequencies. Is this possible? Can it be done efficiently?</p> <p><strong>[EDIT]</strong>: By this I mean I want to produce a new matrix where each row has the same number of <code>A</code>s, <code>C</code>s, <code>G</code>s and <code>T</code>s as the corresponding row of the original matrix, and where each column has the same number of <code>A</code>s, <code>C</code>s, <code>G</code>s and <code>T</code>s as the corresponding column of the original matrix. <strong>Permuting the rows or columns of the original matrix will not achieve this in general.</strong> (E.g. for the example above, the top row has 2 <code>G</code>s, and 1 each of <code>A</code>, <code>C</code> and <code>T</code>; if this row was swapped with row 2, the top row in the resulting matrix would have 3 <code>A</code>s, 1 <code>G</code> and 1 <code>T</code>.)</p> <p>It's simple enough to preserve just column frequencies by shuffling a column at a time, and likewise for rows. But doing this will in general alter the frequencies of the other kind.</p> <p><strong>My thoughts so far:</strong> If it's possible to pick 2 rows and 2 columns so that the 4 elements at the corners of this rectangle have the pattern</p> <pre><code>XY YX </code></pre> <p>for some pair of distinct elements <code>X</code> and <code>Y</code>, then replacing these 4 elements with</p> <pre><code>YX XY </code></pre> <p>will maintain both row and column frequencies. In the example at the top, this can be done for (at least) rows 1 and 2 and columns 2 and 5 (whose corners give the 2x2 matrix <code>AG;GA</code>), and for rows 1 and 3 and columns 1 and 4 (whose corners give <code>GT;TG</code>). Clearly this could be repeated a number of times to produce some level of randomisation.</p> <p>Generalising a bit, any "subrectangle" induced by a subset of rows and a subset of columns, in which the frequencies of all rows are the same and the frequencies of all columns are the same, can have both its rows and columns permuted to produce a valid complete rectangle. (Of these, only those subrectangles in which at least 1 element is changed are actually interesting.) Big questions:</p> <ol> <li><strong>Are all valid complete matrices reachable by a series of such "subrectangle rearrangements"?</strong> I suspect the answer is yes.</li> <li><strong>Are all valid subrectangle rearrangements decomposable into a series of 2x2 swaps?</strong> <strong>[EDIT]</strong>: <a href="https://stackoverflow.com/questions/5086872/is-it-possible-to-shuffle-a-2d-matrix-while-preserving-row-and-column-frequencies/5087786#5087786">mhum's counterexample</a> shows that the answer is <em>no</em>. Unfortunate, because this would seem to make it harder to come up with an efficient algorithm, but important to know.</li> <li><strong>Can some or all of the valid rearrangements be computed efficiently?</strong></li> </ol> <p><a href="https://stackoverflow.com/questions/2133268/randomize-matrix-in-perl-keeping-row-and-column-totals-the-same">This question</a> addresses a special case in which the set of possible elements is <code>{0, 1}</code>. The solutions people have come up with there are similar to what I have come up with myself, and are probably usable, but not ideal as they require an arbitrary amount of backtracking to work correctly. Also I'm concerned that only 2x2 swaps are considered.</p> <p>Finally, I would ideally like a solution that can be proven to select a matrix uniformly at random from the set of all matrices having identical row frequencies and column frequencies to the original. I know, I'm asking for a lot :)</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload