<p>Assuming the fields are separated by tabs, the following strategy works. It buffers the last line. For each new line, it either adds the count to the buffer if the other fields are equal, or prints the buffered data and replaces the buffer with the current line.</p>
<p>After the whole input has been processed, we must not forget to print the contents that are still in the buffer.</p>
<pre><code># read the first record to initialise the buffer
my $first_line = do { my $l = &lt;&gt;; chomp $l; $l };
my ($last_gene, $last_tow, $last_intron) = split /\t/, $first_line;

while (&lt;&gt;) {
    chomp;
    my ($gene, $tow, $intron) = split /\t/;
    if ($gene eq $last_gene and $intron eq $last_intron) {
        # same gene and intron as the buffer: add up
        $last_tow += $tow;
    } else {
        # different record: print the buffer, then replace it
        print join("\t", $last_gene, $last_tow, $last_intron), "\n";
        ($last_gene, $last_tow, $last_intron) = ($gene, $tow, $intron);
    }
}

# print whatever is still in the buffer
print join("\t", $last_gene, $last_tow, $last_intron), "\n";
</code></pre>
<p>This works fine as long as records that may be folded together are always consecutive. If the joinable records are spread all over the file, we have to keep a data structure of all records. After the whole file is parsed, we can emit nicely sorted sums.</p>
<p>We will use a multilevel hash that uses the gene as the first-level key and the intron as the second-level key. The value is the accumulated count (<code>$tow</code>):</p>
<pre><code>my %records;

# parse the file
while (&lt;&gt;) {
    chomp;
    my ($gene, $tow, $intron) = split /\t/;
    $records{$gene}{$intron} += $tow;
}

# emit the data:
for my $gene (sort keys %records) {
    for my $intron (sort keys %{ $records{$gene} }) {
        print join("\t", $gene, $records{$gene}{$intron}, $intron), "\n";
    }
}
</code></pre>
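<p>To see the difference between the two strategies, here is a minimal, self-contained sketch of the hash-based approach run on in-memory data instead of a file. The helper name <code>sum_records</code> and the sample records are made up for illustration. The first and third records share the same gene and intron but are not consecutive, so the buffering approach would print them separately, while the hash folds them together.</p>

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original answer): takes tab-separated
# lines of the form "gene<TAB>count<TAB>intron" and returns one summed line
# per (gene, intron) pair, sorted by gene and then by intron.
sub sum_records {
    my @lines = @_;
    my %records;

    for my $line (@lines) {
        chomp $line;
        next if $line =~ /^\s*$/;    # skip blank lines
        my ($gene, $tow, $intron) = split /\t/, $line;
        $records{$gene}{$intron} += $tow;
    }

    my @out;
    for my $gene (sort keys %records) {
        for my $intron (sort keys %{ $records{$gene} }) {
            push @out, join("\t", $gene, $records{$gene}{$intron}, $intron);
        }
    }
    return @out;
}

# Made-up sample input: geneA/intron1 appears twice, but not consecutively.
my @sample = (
    "geneA\t3\tintron1",
    "geneB\t5\tintron1",
    "geneA\t4\tintron1",
    "geneA\t1\tintron2",
);

print "$_\n" for sum_records(@sample);
# prints (tab-separated):
#   geneA  7  intron1
#   geneA  1  intron2
#   geneB  5  intron1
```

<p>The trade-off is memory: the hash holds one entry per distinct gene/intron pair, whereas the buffering approach only ever holds a single record.</p>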