Looking for elegant glob-like DNA string expansion
<p>I'm trying to make a glob-like expansion of a set of DNA strings that have multiple possible bases.</p>
<p>My DNA strings are built from the letters A, C, G, and T. However, they can also contain special characters like M, which could be an A or a C.</p>
<p>For example, say I have the string:</p>
<p><code>ATMM</code></p>
<p>I would like to take this string as input and output the four possible matching strings:</p>
<p><code>ATAA</code> <code>ATAC</code> <code>ATCA</code> <code>ATCC</code></p>
<p>Rather than brute force a solution, I feel like there must be some elegant Python/Perl/Regular Expression trick to do this.</p>
<p>Thank you for any advice.</p>
<p><strong>Edit: thanks, cortex, for the product operator. This is my solution:</strong></p>
<p>Still a Python newbie, so I bet there's a better way to handle each dictionary key than another for loop. Any suggestions would be great.</p>
<pre><code>import sys
from itertools import product

baseDict = dict(M=['A','C'],R=['A','G'],W=['A','T'],S=['C','G'],
                Y=['C','T'],K=['G','T'],V=['A','C','G'],
                H=['A','C','T'],D=['A','G','T'],B=['C','G','T'])

def glob(str):
    strings = [str]

    ## this loop visits every possible base in the dictionary
    ## probably a cleaner way to do it
    for base in baseDict:
        oldstrings = strings
        strings = []
        for string in oldstrings:
            strings += map("".join,product(*[baseDict[base] if x == base else [x] for x in string]))
    return strings

for line in sys.stdin.readlines():
    line = line.rstrip('\n')
    permutations = glob(line)
    for x in permutations:
        print(x)
</code></pre>
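<p>One possible way to avoid the extra loop over dictionary keys is sketched below: build one pool of choices per position and take a single <code>product</code> over all of them. This is only a sketch; the names <code>expand_dna</code> and <code>AMBIGUITY_CODES</code> are made up for illustration, and the table simply mirrors the codes from <code>baseDict</code> in the question.</p>

```python
from itertools import product

# Hypothetical name; same ambiguity codes as the question's baseDict,
# written as strings so each character is one possible base.
AMBIGUITY_CODES = {
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT",
}


def expand_dna(pattern, codes=AMBIGUITY_CODES):
    """Return every concrete string matched by an ambiguous DNA pattern."""
    # One pool of choices per position; characters without an entry
    # in the table (A, C, G, T) stand only for themselves.
    pools = [codes.get(ch, ch) for ch in pattern]
    # A single Cartesian product over all positions replaces the
    # per-key loop from the question.
    return ["".join(combo) for combo in product(*pools)]


if __name__ == "__main__":
    for expanded in expand_dna("ATMM"):
        print(expanded)
```

<p>With this approach each input string is scanned once, instead of once per dictionary key, and the results come out in the same left-to-right order as the example in the question.</p>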
 
