<p>[<em>Original answer as of 2008-11-27 down to "Since the question"; the analysis from there on is new as of 2008-11-29.</em>]</p>

<p>Quickest - not sure. This works, though it is not pretty:</p>

<pre><code>#!/bin/perl -w
use strict;

my @mylist1;
push(@mylist1,"A");
push(@mylist1,"B");
push(@mylist1,"C");

my @mylist2;
push(@mylist2,"A");
push(@mylist2,"D");
push(@mylist2,"E");

# Linear search: true if $value is already an element of @array.
sub value_in
{
    my($value, @array) = @_;
    foreach my $element (@array)
    {
        return 1 if $value eq $element;
    }
    return 0;
}

# Append to @mylist2 only those elements of @mylist1 it does not already hold.
@mylist2 = (@mylist2, grep { ! value_in($_, @mylist2) } @mylist1);

# Parentheses keep "\n" out of the list being sorted.
print sort(@mylist2), "\n";
</code></pre>

<p>This avoids converting the arrays into hashes - but for large arrays, the <code>value_in</code> sub may be slow, since it scans the whole of <code>@mylist2</code> for each element of <code>@mylist1</code>.</p>

<p>Since the question was "what is the quickest method", I did some benchmarking. To my none-too-vast surprise, my method was slowest. Somewhat to my surprise, the fastest method was not from List::MoreUtils. Here's the test code and the results - using a modified version of my original proposal.</p>

<pre><code>#!/bin/perl -w
use strict;
use List::MoreUtils qw(uniq);
use Benchmark::Timer;

my @mylist1;
push(@mylist1,"A");
push(@mylist1,"B");
push(@mylist1,"C");

my @mylist2;
push(@mylist2,"A");
push(@mylist2,"D");
push(@mylist2,"E");

sub value_in
{
    my($value) = shift @_;
    return grep { $value eq $_ } @_;
}

my @mylist3;
my @mylist4;
my @mylist5;
my @mylist6;

my $t = Benchmark::Timer->new(skip=>1);
my $iterations = 10000;

# JLv2: linear search with grep
for my $i (1..$iterations)
{
    $t->start('JLv2');
    @mylist3 = (@mylist2, grep { ! value_in($_, @mylist2) } @mylist1);
    $t->stop('JLv2');
}
print $t->report('JLv2');

# LMU: List::MoreUtils uniq
for my $i (1..$iterations)
{
    $t->start('LMU');
    @mylist4 = uniq( @mylist1, @mylist2 );
    $t->stop('LMU');
}
print $t->report('LMU');

# HV1: hash of the second list, then push the missing elements of the first
for my $i (1..$iterations)
{
    @mylist5 = @mylist2;
    $t->start('HV1');
    my %k;
    map { $k{$_} = 1 } @mylist5;
    push(@mylist5, grep { !exists $k{$_} } @mylist1);
    $t->stop('HV1');
}
print $t->report('HV1');

# HV2: hash of both lists, then take the keys
for my $i (1..$iterations)
{
    $t->start('HV2');
    my %k;
    map { $k{$_} = 1 } @mylist1;
    map { $k{$_} = 1 } @mylist2;
    @mylist6 = keys %k;
    $t->stop('HV2');
}
print $t->report('HV2');

print sort(@mylist3), "\n";
print sort(@mylist4), "\n";
print sort(@mylist5), "\n";
print sort(@mylist6), "\n";

Black JL: perl xxx.pl
9999 trials of JLv2 (1.298s total), 129us/trial
9999 trials of LMU (968.176ms total), 96us/trial
9999 trials of HV1 (516.799ms total), 51us/trial
9999 trials of HV2 (768.073ms total), 76us/trial
ABCDE
ABCDE
ABCDE
ABCDE
Black JL:
</code></pre>

<p>This is Perl 5.10.0 compiled for 32-bit SPARC with multiplicity on an antique Sun E450 running Solaris 10.</p>

<p>I believe that the test setups are fair; they all generate their answer into a new array, separate from mylist1 and mylist2 (so mylist1 and mylist2 can be reused for the next test). The answer designated HV1 (hash values 1) has the timing start after the assignment to @mylist5, which I think is correct. However, when I did the timing with the start before the assignment (reported as HV1A), it was still quickest:</p>

<pre><code>Black JL: perl xxx.pl
9999 trials of JLv2 (1.293s total), 129us/trial
9999 trials of LMU (938.504ms total), 93us/trial
9999 trials of HV1 (505.998ms total), 50us/trial
9999 trials of HV2 (756.722ms total), 75us/trial
ABCDE
ABCDE
ABCDE
ABCDE
9999 trials of HV1A (655.582ms total), 65us/trial
Black JL:
</code></pre>
