<p>I have barely started learning AWK, so I can't offer any advice on that front. However, here is some Python code that does what you need:</p>
<pre><code>class ProteinIterator():
    def __init__(self, file):
        self.file = open(file, 'r')
        # Read ahead one line so each record starts with its 'buildProtein' line.
        self.first_line = self.file.readline()

    def __iter__(self):
        return self

    def __next__(self):
        "returns the next protein build"
        if not self.first_line:  # reached end of file
            self.file.close()
            raise StopIteration
        file = self.file
        protein_data = [self.first_line]
        while True:
            line = file.readline()
            # The next 'buildProtein' line (or end of file) closes the current record.
            if line.startswith('buildProtein ') or not line:
                self.first_line = line
                break
            protein_data.append(line)
        return Protein(protein_data)


class Protein():
    def __init__(self, data):
        self._data = data
        # Defaults, so the attributes exist even if a section is missing.
        self.initial_compounds = ()
        self.final_compounds = ()
        self.other_compounds = ()
        for line in data:
            if line.startswith('buildProtein '):
                self.initial_compounds = tuple(line[13:].split())
            elif line.startswith('Final result - '):
                pieces = line[15:].split()[::2]  # every other piece is a name
                self.final_compounds = tuple([p[:-1] for p in pieces])  # drop trailing ':'
            elif line.startswith('Other Compounds '):
                pieces = line[16:].split()[::2]  # every other piece is a name
                self.other_compounds = tuple([p[:-1] for p in pieces])  # drop trailing ':'

    def __repr__(self):
        return ("Protein(%s)" % self._data[0].strip())

    @property
    def data(self):
        return ''.join(self._data)
</code></pre>
<p>What we have here is an iterator for the buildProtein text file which returns one protein at a time as a <code>Protein</code> object. This <code>Protein</code> object knows its inputs, final results, and other results. You may have to modify some of the code if the actual text in the file is not exactly as represented in the question.
Following is a short test of the code with example usage:</p>
<pre><code>if __name__ == '__main__':
    test_data = """\
buildProtein compoundA compoundB
begin fusion
Calculate : (lots of text here on multiple lines)
(more lines)
Final result - H20: value CO2: value Compound: value
Other Compounds X: Value Y: value Z: value"""
    # Write the sample file and make sure it is closed before reading it back.
    with open('testPI.txt', 'w') as f:
        f.write(test_data)
    for protein in ProteinIterator('testPI.txt'):
        print(protein.initial_compounds)
        print(protein.final_compounds)
        print(protein.other_compounds)
        print()
        if 'CO2' in protein.final_compounds:
            print(protein.data)
</code></pre>
<p>I didn't bother saving values, but you can add that in if you like. Hopefully this will get you going.</p>
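<p>If you do want to keep the values, one possible approach is sketched below. The helper name <code>parse_pairs</code> is hypothetical (it is not part of the code above), and it assumes every value is a single token with no spaces in it, as in the test data. It strips the trailing colon with <code>rstrip(':')</code> rather than <code>p[:-1]</code>, which is a small deviation from the original slicing.</p>

```python
def parse_pairs(line, prefix):
    """Return a {name: value} dict from a line such as
    'Final result - H20: 1.5 CO2: 0.2'.

    Assumption: names and values alternate, and each value is a single
    whitespace-free token.
    """
    if not line.startswith(prefix):
        raise ValueError("line does not start with %r" % prefix)
    tokens = line[len(prefix):].split()
    names = tokens[::2]    # 'H20:', 'CO2:', ...
    values = tokens[1::2]  # the token following each name
    # Values are kept as strings; convert them yourself if they are numeric.
    return {name.rstrip(':'): value for name, value in zip(names, values)}
```

<p>Inside <code>Protein.__init__</code> you could then store, for example, <code>parse_pairs(line, 'Final result - ')</code> on a new attribute next to <code>final_compounds</code>.</p>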