Search for string allowing for one mismatch in any location of the string
    <p>I am working with DNA sequences of length 25 (see examples below). I have a list of 230,000 of them and need to look for each sequence in the entire genome (the <em>Toxoplasma gondii</em> parasite). I am not sure how large the genome is, but it is much longer than 230,000 sequences.</p> <p>I need to look for each of my 25-character sequences, for example, (AGCCTCCCATGATTGAACAGATCAT).</p> <p>The genome is formatted as a continuous string, i.e. (CATGGGAGGCTTGCGGAGCCTGAGGGCGGAGCCTGAGGTGGGAGGCTTGCGGAGTGCGGAGCCTGAGCCTGAGGGCGGAGCCTGAGGTGGGAGGCTT....)</p> <p>I don't care where or how many times a sequence is found, only whether it is found or not.<br> I think this part is simple:</p> <pre><code>genome.find("AGCCTCCCATGATTGAACAGATCAT") </code></pre> <p>But I also want to find a close match, defined as a match that is wrong (mismatched) at any one location, but only one location, and record that location in the sequence. I am not sure how to do this. The only thing I can think of is using a wildcard and performing the search with the wildcard in each position, i.e., searching 25 times.</p> <p>For example,<br> AGCCTCCCATGATTGAACAGATCAT<br> AGCCTCCCATGATAGAACAGATCAT</p> <p>is a close match with a mismatch at position 13 (counting from 0).</p> <p>Speed is not a big issue because I am only doing this 3 times, though it would be nice if it were fast.</p> <p>There are programs that find matches and partial matches, but I am looking for a type of partial match that is not discoverable with these applications.</p> <p>Here is a similar post for Perl, although it only compares two sequences and does not search a continuous string:</p> <p><a href="https://stackoverflow.com/questions/1672782/fastest-way-to-find-mismatch-positions-between-two-strings-of-the-same-length">Related post</a></p>
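The wildcard idea described above (one search per position, 25 searches per sequence) could be sketched roughly as follows. This is only an illustration under assumptions, not a tuned solution: the function name `find_one_mismatch` and the variable `genome` are hypothetical, the genome is assumed to fit in memory as one Python string, and only the first position that yields a hit is reported.

```python
import re

def find_one_mismatch(genome, seq):
    """Classify seq against genome (hypothetical helper, not from the post).

    Returns ("exact", None) for an exact hit, ("mismatch", i) where i is the
    0-based position in seq of the single mismatch, or ("absent", None).
    """
    # Plain substring test covers the exact-match case.
    if seq in genome:
        return ("exact", None)
    for i, base in enumerate(seq):
        # Wildcard at position i: any character except the expected base,
        # with every other position required to match exactly.
        pattern = (
            re.escape(seq[:i])
            + "[^" + re.escape(base) + "]"
            + re.escape(seq[i + 1:])
        )
        if re.search(pattern, genome):
            return ("mismatch", i)
    return ("absent", None)

# Usage with the example pair from the question, embedded in a toy genome.
query = "AGCCTCCCATGATTGAACAGATCAT"
toy_genome = "CATGGGAGGCTTGCGG" + "AGCCTCCCATGATAGAACAGATCAT" + "GCCTGAGG"
print(find_one_mismatch(toy_genome, query))  # ('mismatch', 13)
```

Each call scans the genome up to 26 times, so for 230,000 sequences this is slow but simple, which fits the stated priority of correctness over speed.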