Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    text
    copied!<p>As you have tagged it Biopython I suppose you know of Biopython. Have you checked out the docu yet? <a href="http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc231" rel="noreferrer">http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc231</a> might help.</p> <p>I adjusted the code from the above link a bit to work on your sequence:</p> <pre><code>from Bio.Seq import Seq seq = Seq("CCTCAGCGAGGACAGCAAGGGACTAGCCAGGAGGGAGAACAGAAACTCCAGAACATCTTGGAAATAGCTCCCAGAAAAGCAAGCAGCCAACCAGGCAGGTTCTGTCCCTTTCACTCACTGGCCCAAGGCGCCACATCTCCCTCCAGAAAAGACACCATGAGCACAGAAAGCATGATCCGCGACGTGGAACTGGCAGAAGAGGCACTCCCCCAAAAGATGGGGGGCTTCCAGAACTCCAGGCGGTGCCTATGTCTCAGCCTCTTCTCATTCCTGCTTGTGGCAGGGGCCACCACGCTCTTCTGTCTACTGAACTTCGGGGTGATCGGTCCCCAAAGGGATGAGAAGTTCCCAAATGGCCTCCCTCTCATCAGTTCTATGGCCCAGACCCTCACACTCAGATCATCTTCTCAAAATTCGAGTGACAAGCCTGTAGCCCACGTCGTAGCAAACCACCAAGTGGAGGAGCAGCTGGAGTGGCTGAGCCAGCGCGCCAACGCCCTCCTGGCCAACGGCATGGATCTCAAAGACAACCAACTAGTGGTGCCAGCCGATGGGTTGTACCTTGTCTACTCCCAGGTTCTCTTCAAGGGACAAGGCTGCCCCGACTACGTGCTCCTCACCCACACCGTCAGCCGATTTGCTATCTCATACCAGGAGAAAGTCAACCTCCTCTCTGCCGTCAAGAGCCCCTGCCCCAAGGACACCCCTGAGGGGGCTGAGCTCAAACCCTGGTATGAGCCCATATACCTGGGAGGAGTCTTCCAGCTGGAGAAGGGGGACCAACTCAGCGCTGAGGTCAATCTGCCCAAGTACTTAGACTTTGCGGAGTCCGGGCAGGTCTACTTTGGAGTCATTGCTCTGTGAAGGGAATGGGTGTTCATCCATTCTCTACCCAGCCCCCACTCTGACCCCTTTACTCTGACCCCTTTATTGTCTACTCCTCAGAGCCCCCAGTCTGTATCCTTCTAACTTAGAAAGGGGATTATGGCTCAGGGTCCAACTCTGTGCTCAGAGCTTTCAACAACTACTCAGAAACACAAGATGCTGGGACAGTGACCTGGACTGTGGGCCTCTCATGCACCACCATCAAGGACTCAAATGGGCTTTCCGAATTCACTGGAGCCTCGAATGTCCATTCCTGAGTTCTGCAAAGGGAGAGTGGTCAGGTTGCCTCTGTCTCAGAATGAGGCTGGATAAGATCTCAGGCCTTCCTACCTTCAGACCTTTCCAGATTCTTCCCTGAGGTGCAATGCACAGCCTTCCTCACAGAGCCAGCCCCCCTCTATTTATATTTGCACTTATTATTTATTATTTATTTATTATTTATTTATTTGCTTATGAATGTATTTATTTGGAAGGCCGGGGTGTCCTGGAGGACCCAGTGTGGGAAGCTGTCTTCAGACAGACATGTTTTCTGTGAAAACGGAGCTGAGCTGTCCCCACCTGGCCTCTCTACCTTGTTGCCTCCTCTTTTGCTTATGTTTAAAACAAAATATTTATCTAACCCAATTGTCTTAATAACGCTGATTTGGTGACCAGGCTGTCGCTACATCACTGAACCTCTGCTCCCCACGGGAGCCGTGACTGTAATCGCCCTACGGGTCATTGAGAGAAATAA") table = 1 min_pro_len = 100 for strand, nuc in [(+1, seq), (-1, seq.reverse_complement())]: for frame in range(3): for pro in nuc[frame:].translate(table).split("*"): if len(pro) &gt;= min_pro_len: print "%s...%s - length %i, strand %i, frame %i" % (pro[:30], pro[-3:], len(pro), strand, frame) </code></pre> <p>The ORF is also translated. You can choose a different translation table. Check out <a href="http://biopython.org/DIST/docs/tutorial/Tutorial.html#sec:translation" rel="noreferrer">http://biopython.org/DIST/docs/tutorial/Tutorial.html#sec:translation</a></p> <p><strong>EDIT:</strong> Explanation of the code:</p> <p>Right at the top I create a sequence object out of your string. Notice the <code>seq = Seq("ACGT")</code>. The two for-loops create the 6 different frames. The inner for-loop translates each frame according to the chosen translation table and returns an amino acid chain where each stop codon is coded as <code>*</code>. The <code>split</code> function splits this string removing these placeholders resulting in a list of possible protein sequences. By setting min_pro_len you can define the minimum amino acid chain length for a protein to be detected. 1 is the standard table. Check out <a href="http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi#SG1" rel="noreferrer">http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi#SG1</a> Here you see that the initiation codon is <code>AUG</code> (equals <code>ATG</code>) and the end codons (* in the nucleotide sequence) are <code>TAA</code>, <code>TAG</code>, and <code>TGA</code>, just like you wanted. You could also use a different translation table.</p> <p>When you add</p> <pre><code>print nuc[frame:].translate(table) </code></pre> <p>right inside the second for-loop you get something like:</p> <pre><code>PQRGQQGTSQEGEQKLQNILEIAPRKASSQPGRFCPFHSLAQGATSPSRKDTMSTESMIRDVELAEEALPQKMGGFQNSRRCLCLSLFSFLLVAGATTLFCLLNFGVIGPQRDEKFPNGLPLISSMAQTLTLRSSSQNSSDKPVAHVVANHQVEEQLEWLSQRANALLANGMDLKDNQLVVPADGLYLVYSQVLFKGQGCPDYVLLTHTVSRFAISYQEKVNLLSAVKSPCPKDTPEGAELKPWYEPIYLGGVFQLEKGDQLSAEVNLPKYLDFAESGQVYFGVIAL*REWVFIHSLPSPHSDPFTLTPLLSTPQSPQSVSF*LRKGIMAQGPTLCSELSTTTQKHKMLGQ*PGLWASHAPPSRTQMGFPNSLEPRMSIPEFCKGRVVRLPLSQNEAG*DLRPSYLQTFPDSSLRCNAQPSSQSQPPSIYICTYYLLFIYYLFICL*MYLFGRPGCPGGPSVGSCLQTDMFSVKTELSCPHLASLPCCLLFCLCLKQNIYLTQLS**R*FGDQAVATSLNLCSPREP*L*SPYGSLREI </code></pre> <p>(notice the asterisks are at the stop codon positions)</p> <p><strong>EDIT: Answer to your second question:</strong></p> <p>You must return a string that you want write into a file. Create an Output string and return it at the end of the function:</p> <pre><code>output = "selected tupple is " + str(selected_tupple) + "\n" output += final_seq + "\n" output += "The longest orf length is " + str(max_val) + "\n" return output </code></pre>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload