How to find an open reading frame in Python
<p>I am using Python and a regular expression to find an <code>ORF</code> (open reading frame).</p> <p>I need to find a substring of a string composed only of the letters <code>ATGC</code> (no spaces or newlines) that:</p> <p>Starts with <code>ATG</code> and ends with <code>TAG</code>, <code>TAA</code>, or <code>TGA</code>. The search should be run in three reading frames, starting from the first, second, and third character of the sequence:</p> <pre><code>Seq= "CCTCAGCGAGGACAGCAAGGGACTAGCCAGGAGGGAGAACAGAAACTCCAGAACATCTTGGAAATAGCTCCCAGAAAAGC AAGCAGCCAACCAGGCAGGTTCTGTCCCTTTCACTCACTGGCCCAAGGCGCCACATCTCCCTCCAGAAAAGACACCATGA GCACAGAAAGCATGATCCGCGACGTGGAACTGGCAGAAGAGGCACTCCCCCAAAAGATGGGGGGCTTCCAGAACTCCAGG CGGTGCCTATGTCTCAGCCTCTTCTCATTCCTGCTTGTGGCAGGGGCCACCACGCTCTTCTGTCTACTGAACTTCGGGGT GATCGGTCCCCAAAGGGATGAGAAGTTCCCAAATGGCCTCCCTCTCATCAGTTCTATGGCCCAGACCCTCACACTCAGAT CATCTTCTCAAAATTCGAGTGACAAGCCTGTAGCCCACGTCGTAGCAAACCACCAAGTGGAGGAGCAGCTGGAGTGGCTG AGCCAGCGCGCCAACGCCCTCCTGGCCAACGGCATGGATCTCAAAGACAACCAACTAGTGGTGCCAGCCGATGGGTTGTA CCTTGTCTACTCCCAGGTTCTCTTCAAGGGACAAGGCTGCCCCGACTACGTGCTCCTCACCCACACCGTCAGCCGATTTG CTATCTCATACCAGGAGAAAGTCAACCTCCTCTCTGCCGTCAAGAGCCCCTGCCCCAAGGACACCCCTGAGGGGGCTGAG CTCAAACCCTGGTATGAGCCCATATACCTGGGAGGAGTCTTCCAGCTGGAGAAGGGGGACCAACTCAGCGCTGAGGTCAA TCTGCCCAAGTACTTAGACTTTGCGGAGTCCGGGCAGGTCTACTTTGGAGTCATTGCTCTGTGAAGGGAATGGGTGTTCA TCCATTCTCTACCCAGCCCCCACTCTGACCCCTTTACTCTGACCCCTTTATTGTCTACTCCTCAGAGCCCCCAGTCTGTA TCCTTCTAACTTAGAAAGGGGATTATGGCTCAGGGTCCAACTCTGTGCTCAGAGCTTTCAACAACTACTCAGAAACACAA GATGCTGGGACAGTGACCTGGACTGTGGGCCTCTCATGCACCACCATCAAGGACTCAAATGGGCTTTCCGAATTCACTGG AGCCTCGAATGTCCATTCCTGAGTTCTGCAAAGGGAGAGTGGTCAGGTTGCCTCTGTCTCAGAATGAGGCTGGATAAGAT CTCAGGCCTTCCTACCTTCAGACCTTTCCAGATTCTTCCCTGAGGTGCAATGCACAGCCTTCCTCACAGAGCCAGCCCCC CTCTATTTATATTTGCACTTATTATTTATTATTTATTTATTATTTATTTATTTGCTTATGAATGTATTTATTTGGAAGGC CGGGGTGTCCTGGAGGACCCAGTGTGGGAAGCTGTCTTCAGACAGACATGTTTTCTGTGAAAACGGAGCTGAGCTGTCCC CACCTGGCCTCTCTACCTTGTTGCCTCCTCTTTTGCTTATGTTTAAAACAAAATATTTATCTAACCCAATTGTCTTAATA 
ACGCTGATTTGGTGACCAGGCTGTCGCTACATCACTGAACCTCTGCTCCCCACGGGAGCCGTGACTGTAATCGCCCTACG GGTCATTGAGAGAAATAA" </code></pre>
<p>What I have tried:</p>
<pre><code># finding the stop codon here
def stop_codon(seq_0):
    for i in range(0,len(seq_0),3):
        if (seq_0[i:i+3]== "TAA" and i%3==0) or (seq_0[i:i+3]== "TAG" and i%3==0) or (seq_0[i:i+3]== "TGA" and i%3==0) :
            a =i+3
            break
        else:
            a = None

# finding the start codon here
startcodon_find =[m.start() for m in re.finditer('ATG', seq_0)]
</code></pre>
<p>How can I find a start codon, then the first stop codon after it, and then repeat this for the next start codon and the next stop codon?</p>
<p>I wish to run this for three frames. As mentioned earlier, the three frames start at the first, second, and third characters of the sequence.</p>
<p>The sequence also needs to be divided into groups of 3 letters. Therefore, the result should look something like this:</p>
<pre><code>ATG TTT AAA ACA AAA TAT TTA TCT AAC CCA ATT GTC TTA ATA ACG CTG ATT TGA </code></pre>
<p>Any help will be appreciated.</p>
<p>My final answer:</p>
<pre><code>import re

def orf_find(st0):
    seq_0=""
    for i in range(0,len(st0),3):
        if len(st0[i:i+3])==3:
            seq_0 = seq_0 + st0[i:i+3]+ " "
    ms_1 =[m.start() for m in re.finditer('ATG', seq_0)]
    ms_2 =[m.start() for m in re.finditer('(TAA)|(TAG)|(TGA)', seq_0)]

    def get_next(arr,value):
        for a in arr:
            if a &gt; value:
                return a
        return -1

    codons = []
    start_codon=ms_1[0]
    while (True):
        stop_codon = get_next(ms_2,start_codon)
        if stop_codon == -1:
            break
        codons.append((start_codon,stop_codon))
        start_codon = get_next(ms_1,stop_codon)
        if start_codon==-1:
            break

    max_val = 0
    selected_tupple = ()
    for i in codons:
        k=i[1]-i[0]
        if k &gt; max_val:
            max_val = k
            selected_tupple = i

    print "selected tupple is ", selected_tupple
    final_seq=seq_0[selected_tupple[0]:selected_tupple[1]+3]
    print final_seq
    print "The longest orf length is " + str(max_val)

output_file = open('Longorf.txt','w')
output_file.write(str(orf_find(st0)))
output_file.close()
</code></pre> <p>The write call above does not write the result to the text file. All I get in the file is <code>None</code>. Why does this happen? Can anybody help?</p>
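<p>For reference, a minimal sketch of the three-frame search described above, written as Python 3 code. The names <code>find_orfs</code>, <code>longest_orf</code> and <code>write_longest_orf</code> are illustrative and not part of the original post. The sketch assumes an ORF runs from an <code>ATG</code> to the first in-frame stop codon, ignores any <code>ATG</code> without a later in-frame stop, and strips whitespace from the pasted sequence. It also returns its result instead of only printing it: a Python function without a <code>return</code> statement returns <code>None</code>, which is the likely reason <code>str(orf_find(st0))</code> writes <code>None</code> to the file.</p>

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, frame):
    """Return (begin, end, codons) for each ORF in one reading frame (0, 1 or 2)."""
    # Split the chosen frame into complete 3-letter codons.
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    orfs = []
    start = None  # codon index of the currently open ATG, if any
    for idx, codon in enumerate(codons):
        if start is None and codon == "ATG":
            start = idx
        elif start is not None and codon in STOP_CODONS:
            begin = frame + 3 * start
            end = frame + 3 * (idx + 1)  # exclusive end, stop codon included
            orfs.append((begin, end, codons[start:idx + 1]))
            start = None  # look for the next start codon after this stop
    return orfs


def longest_orf(raw):
    """Return the longest ORF over all three frames as space-separated codons."""
    seq = re.sub(r"\s+", "", raw).upper()  # drop spaces/newlines from the paste
    best = None
    for frame in range(3):
        for orf in find_orfs(seq, frame):
            if best is None or orf[1] - orf[0] > best[1] - best[0]:
                best = orf
    return " ".join(best[2]) if best else ""


def write_longest_orf(raw, path):
    """Write the longest ORF to a text file and also return it."""
    result = longest_orf(raw)
    with open(path, "w") as handle:
        handle.write(result + "\n")
    return result
```

<p>Under these assumptions, calling <code>write_longest_orf(Seq, 'Longorf.txt')</code> would write the codon-separated ORF to the file, or an empty line if no frame contains a complete ORF.</p>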