<p>First, we read in the second file and put the values into an <em>array</em>. I further assume that this <code>chr1</code> is constant and can be discarded safely:</p>

<pre><code>#!/usr/bin/perl
use strict;
use warnings;

my @file2;

open my $fh2, "&lt;", "file2" or die $!;
while (&lt;$fh2&gt;) {
  my (undef, $num) = split;
  die "the number contains illegal characters" if $num =~ /\D/;
  push @file2, $num;
}

@file2 = sort {$a &lt;=&gt; $b} @file2; # sort ascending
# remove previous line if sorting is already guaranteed.
</code></pre>

<p>Then, we define a sub to find the two values in our array. It is just a variation of a basic algorithm to find a certain value in a sorted list (in <em>O(log n)</em>), and should perform better than iterating over each value, at least on large sets. Also, it doesn't require reversing the whole list for each value.</p>

<pre><code>sub find {
  my ($num, $arrayref) = @_;

  # exit if array is too small
  return unless @$arrayref &gt;= 2;

  # exit if $num is outside the values of this array (-1 is last element)
  return if $num &lt;= $arrayref-&gt;[0] or $arrayref-&gt;[-1] &lt; $num;

  my ($lo, $hi) = (1, $#$arrayref);
  my $i = int(($lo+$hi)/2); # start in the middle

  # iterate until
  #   a) the previous index contains a number that is smaller than $num and
  #   b) the current index contains a number that is greater or equal to $num.
  until($arrayref-&gt;[$i-1] &lt; $num and $num &lt;= $arrayref-&gt;[$i]) {
    # make $i the next lower or upper bound.
    # instead of going into an infinite loop (which would happen if we
    # assign $i to a variable that already holds the same value), we discard
    # the value and move on towards the middle.

    # $i is too small
    if    ($num &gt;  $arrayref-&gt;[$i]  ) { $lo = ($lo == $i ? $i+1 : $i) }
    # $i is too large
    elsif ($num &lt;= $arrayref-&gt;[$i-1]) { $hi = ($hi == $i ? $i-1 : $i) }
    # in case I made an error:
    else                              { die "illegal state" }

    # calculate the next index
    $i = int(($lo+$hi)/2);
  }
  return @{$arrayref}[$i-1, $i];
}
</code></pre>

<p>The rest is trivial:</p>

<pre><code>open my $fh1, "&lt;", "file1" or die $!;
while (&lt;$fh1&gt;) {
  my ($chr, $num) = split;
  die "the number contains illegal characters" if $num =~ /\D/;
  if (my ($lo, $hi) = find($num, \@file2)) {
    if ($hi == $num) {
      print join("\t", $chr, $num, "Yes"), "\n";
    } else {
      print join("\t", $chr, $num, "No", $hi-$num, $lo-$num), "\n";
    }
  } else {
    # no matching numbers were found in file 2
    print join("\t", $chr, $num, "No-match"), "\n";
  }
}
</code></pre>

<p>Output:</p>

<pre><code>chr1 10227 No 790 -977
chr1 447989 No 6100 -8406
chr1 535362 No 9349 -75345
chr1 856788 Yes
</code></pre>
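<p>For comparison, the same lookup can also be phrased as a classic lower-bound binary search. This is only a sketch of an alternative formulation, not the code used above: the names <code>lower_bound</code> and <code>neighbours</code> and the sample array are made up for illustration. It is meant to return the same pair as <code>find</code> (the value just below <code>$num</code> and the first value greater than or equal to it), or an empty list when no such pair exists.</p>

<pre><code>#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: classic lower-bound search on a sorted array.
# Returns the index of the first element &gt;= $num,
# or the array length if every element is smaller.
sub lower_bound {
  my ($num, $arrayref) = @_;
  my ($lo, $hi) = (0, scalar @$arrayref); # search window is [$lo, $hi)
  while ($lo &lt; $hi) {
    my $mid = int(($lo + $hi) / 2);
    if ($arrayref-&gt;[$mid] &lt; $num) { $lo = $mid + 1 }
    else                          { $hi = $mid }
  }
  return $lo;
}

# Hypothetical wrapper with the same contract as find():
# the pair (value below $num, first value &gt;= $num), or an empty list.
sub neighbours {
  my ($num, $arrayref) = @_;
  my $i = lower_bound($num, $arrayref);
  # no lower neighbour, or $num is larger than everything
  return if $i == 0 or $i == @$arrayref;
  return @{$arrayref}[$i - 1, $i];
}

# made-up sample data, already sorted ascending
my @sorted = (10, 20, 30, 40);
for my $num (5, 10, 20, 25, 40, 41) {
  my @pair = neighbours($num, \@sorted);
  print "$num: ", (@pair ? "@pair" : "no match"), "\n";
}
</code></pre>

<p>With this sample array, 20, 25 and 40 yield a pair, while 5, 10 and 41 yield no match. As with <code>find</code>, a value equal to the smallest element has no lower neighbour and is therefore reported as unmatched.</p>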