<p>Why are you using a regex? You're looking for the position of the literal text {{ or }}. Perl has a built-in that does exactly that: <a href="http://perldoc.perl.org/functions/index.html" rel="nofollow noreferrer">index</a>.</p>

<p>Since you are trying to parse Wikipedia entries, you need to handle nested template directives. This means that, for instance, the second set of closing curlies you found doesn't necessarily go with the second set of open curlies. In this bit from the <a href="http://en.wikipedia.org/w/index.php?title=Perl&amp;action=edit" rel="nofollow noreferrer">Perl</a> entry, the first closing curly goes with the second opening one:</p>

<pre>
{{Infobox programming language
| latest_release_version = 5.10.0
| latest_release_date = {{release date|mf=yes|2007|12|18}}
| turing-complete = Yes
}}
</pre>

<p>Perl 5.10 regexes can handle this for you since they can match balanced text recursively, and there are Perl modules to do it as well. That's going to be a bit of work, though. It's difficult to give you any advice until you say what you are trying to accomplish. Surely there is a MediaWiki parser out there that can do what you are trying to do.</p>

<hr>

<p>I was going to code up my <code>index()</code> solution, but I didn't. I can't get your code to be slow enough that it matters. Both the <code>pos()</code> and the <code>@-</code> solutions complete virtually instantaneously for me, even when I do all of the stack management and print the contents of each template. I had to try really hard to make it run slowly enough to be measurable, and I'm on some old hardware. You might need to tune your application in some other way.</p>

<p>Are you sure that the code you are measuring is slowing down at the point you think it is? Have you profiled it with <a href="http://search.cpan.org/dist/Devel-NYTProf" rel="nofollow noreferrer">Devel::NYTProf</a> to see what your real program is doing?</p>

<pre><code>#!/usr/bin/perl
use strict;
use warnings;

use Benchmark;

my $text = do { local $/; &lt;DATA&gt; }; # put the contents after __END__

my %subs = (
    # Track template starts with pos(), which reports the offset
    # just past the most recent match
    using_pos =&gt; sub {
        my $page = shift;
        my @stack;
        my $found = 0;

        while( $$page =~ m/ ( \{\{ | }} ) /xg ) {
            if( $1 eq '{{' ) {
                push @stack, pos($$page) - 2;
            }
            else {
                my $start = pop @stack;
                print STDERR "\tFound at $start: ",
                    substr( $$page, $start, pos($$page) - $start ), "\n";
                $found++;
            }
        }

        print " Processed $found templates =&gt; ";
    },

    # Track template starts with $-[0], the offset where the
    # most recent match began
    using_special =&gt; sub {
        my $page = shift;
        my @stack;
        my $found = 0;

        while( $$page =~ m/ ( \{\{ | }} ) /xg ) {
            if( $1 eq '{{' ) {
                push @stack, $-[0];
            }
            else {
                my $start = pop @stack;
                # $-[0] is where the closing }} starts, so this
                # span stops just before the closing curlies
                print STDERR "\tFound at $start: ",
                    substr( $$page, $start, $-[0] - $start ), "\n";
                $found++;
            }
        }

        print " Processed $found templates =&gt; ";
    },
);

foreach my $key ( keys %subs ) {
    printf "%15s =&gt; ", $key;

    my $t = timeit( 1, sub{ $subs{$key}-&gt;( \$text ) } );
    print timestr($t), "\n";
}
</code></pre>

<p>My perl on my 17" MacBook Pro:</p>

<pre>
macbookpro_brian[349]$ perl -V
Summary of my perl5 (revision 5 version 8 subversion 8) configuration:
  Platform:
    osname=darwin, osvers=8.8.2, archname=darwin-2level
    uname='darwin macbookpro.local 8.8.2 darwin kernel version 8.8.2: thu sep 28 20:43:26 pdt 2006; root:xnu-792.14.14.obj~1release_i386 i386 i386 '
    config_args='-des'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-common -DPERL_DARWIN -no-cpp-precomp -fno-strict-aliasing -pipe -Wdeclaration-after-statement -I/usr/local/include -I/opt/local/include',
    optimize='-O3',
    cppflags='-no-cpp-precomp -fno-common -DPERL_DARWIN -no-cpp-precomp -fno-strict-aliasing -pipe -Wdeclaration-after-statement -I/usr/local/include -I/opt/local/include'
    ccversion='', gccversion='4.0.1 (Apple Computer, Inc. build 5363)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='env MACOSX_DEPLOYMENT_TARGET=10.3 cc', ldflags =' -L/usr/local/lib -L/opt/local/lib'
    libpth=/usr/local/lib /opt/local/lib /usr/lib
    libs=-ldbm -ldl -lm -lc
    perllibs=-ldl -lm -lc
    libc=/usr/lib/libc.dylib, so=dylib, useshrplib=false, libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=bundle, d_dlsymun=undef, ccdlflags=' '
    cccdlflags=' ', lddlflags=' -bundle -undefined dynamic_lookup -L/usr/local/lib -L/opt/local/lib'

Characteristics of this binary (from libperl):
  Compile-time options: PERL_MALLOC_WRAP USE_LARGE_FILES USE_PERLIO
  Built under darwin
  Compiled at Apr 9 2007 10:36:26
  @INC:
    /usr/local/lib/perl5/5.8.8/darwin-2level
    /usr/local/lib/perl5/5.8.8
    /usr/local/lib/perl5/site_perl/5.8.8/darwin-2level
    /usr/local/lib/perl5/site_perl/5.8.8
    /usr/local/lib/perl5/site_perl
</pre>
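<p>For reference, the <code>index()</code> approach mentioned above could look roughly like the sketch below. It is not benchmarked and not part of the timings shown; <code>find_templates</code> is a hypothetical helper name, and the sketch assumes that only literal <code>{{</code> and <code>}}</code> pairs matter, ignoring other wiki markup such as triple-brace parameters. It keeps the same stack idea as the two regex versions, so the first closing pair is matched with the most recently opened one.</p>

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (an illustration, not from the original answer):
# find balanced {{ ... }} spans with index() and an explicit stack.
# Takes a reference to the page text and returns a list of
# [ start_offset, template_text ] pairs, innermost templates first.
sub find_templates {
    my ($page) = @_;
    my @stack;      # offsets of '{{' that have not been closed yet
    my @found;
    my $pos = 0;    # where the next search starts

    while (1) {
        my $open  = index( $$page, '{{', $pos );
        my $close = index( $$page, '}}', $pos );
        last if $close == -1;    # without a closer nothing more can match

        if ( $open != -1 && $open < $close ) {
            # The next delimiter is an opener: remember where it starts
            push @stack, $open;
            $pos = $open + 2;
        }
        else {
            # The next delimiter is a closer: pair it with the most
            # recent opener, skipping a stray '}}' that has none
            if ( defined( my $start = pop @stack ) ) {
                push @found,
                    [ $start, substr( $$page, $start, $close + 2 - $start ) ];
            }
            $pos = $close + 2;
        }
    }

    return @found;
}

my $text = <<'END';
{{Infobox programming language
| latest_release_date = {{release date|mf=yes|2007|12|18}}
| turing-complete = Yes
}}
END

for my $t ( find_templates( \$text ) ) {
    printf "Found at %d: %s\n", $t->[0], $t->[1];
}
```

<p>Calling <code>index()</code> for both delimiters on every pass re-scans some text, so this is written for clarity rather than speed. Profile it against the regex versions before assuming it is faster.</p>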
 
