Perl non-greedy problem
<p>I am having a problem with a non-greedy regular expression. I've seen that there are questions regarding non-greedy regexes, but they don't answer my problem.</p>
<p><strong>Problem:</strong> I am trying to match the href of the "lol" anchor.</p>
<p><strong>Note:</strong> I know this can be done with Perl HTML parsing modules, and my question is <strong>not</strong> about parsing HTML in Perl. My question is about the regular expression itself, and the HTML is just an example.</p>
<p><strong>Test case:</strong> I have 4 tests for <code>.*?</code> and 4 for <code>[^"]</code>. In each set, the first 2 produce the expected result. However, the 3rd doesn't, and the 4th does, but I don't understand why.</p>
<p><strong>Questions:</strong></p>
<ol>
<li><strong>Why</strong> does the 3rd test fail in both sets, for <code>.*?</code> and for <code>[^"]</code>? Shouldn't the non-greedy operator work?</li>
<li><strong>Why</strong> does the 4th test work in both sets, for <code>.*?</code> and for <code>[^"]</code>? I don't understand why putting a <code>.*</code> in front changes the result (the 3rd and 4th tests are identical except for the <code>.*</code> in front).</li>
</ol>
<p>I probably don't understand exactly how these regexes work. A <a href="http://docstore.mik.ua/orelly/perl/cookbook/ch06_16.htm" rel="nofollow">Perl Cookbook recipe</a> mentions something related, but I don't think it answers my question.</p>
<pre><code>use strict;

my $content=&lt;&lt;EOF;
&lt;a href="/hoh/hoh/hoh/hoh/hoh" class="hoh"&gt;hoh&lt;/a&gt;
&lt;a href="/foo/foo/foo/foo/foo" class="foo"&gt;foo &lt;/a&gt;
&lt;a href="/bar/bar/bar/bar/bar" class="bar"&gt;bar&lt;/a&gt;
&lt;a href="/lol/lol/lol/lol/lol" class="lol"&gt;lol&lt;/a&gt;
&lt;a href="/koo/koo/koo/koo/koo" class="koo"&gt;koo&lt;/a&gt;
EOF

print "| $1 | \n\nThat's ok\n" if $content =~ m~href="(.*?)"~s ;
print "\n---------------------------------------------------\n";
print "| $1 | \n\nThat's ok\n" if $content =~ m~href="(.*?)".*&gt;lol~s ;
print "\n---------------------------------------------------\n";
print "| $1 | \n\nWhy does not the 2nd non-greedy '?' work?\n" if $content =~ m~href="(.*?)".*?&gt;lol~s ;
print "\n---------------------------------------------------\n";
print "| $1 | \n\nIt now works if I put the '.*' in the front?\n" if $content =~ m~.*href="(.*?)".*?&gt;lol~s ;

print "\n###################################################\n";
print "Let's try now with [^]";
print "\n###################################################\n\n";

print "| $1 | \n\nThat's ok\n" if $content =~ m~href="([^"]+?)"~s ;
print "\n---------------------------------------------------\n";
print "| $1 | \n\nThat's ok.\n" if $content =~ m~href="([^"]+?)".*&gt;lol~s ;
print "\n---------------------------------------------------\n";
print "| $1 | \n\nThe 2nd greedy still doesn't work?\n" if $content =~ m~href="([^"]+?)".*?&gt;lol~s ;
print "\n---------------------------------------------------\n";
print "| $1 | \n\nNow with the '.*' in front it does.\n" if $content =~ m~.*href="([^"]+?)".*?&gt;lol~s ;
</code></pre>
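<p>For readers who want to reproduce the behaviour in isolation, here is a minimal sketch of the 3rd and 4th patterns on a shortened sample. It relies on one property of Perl's regex engine: the leftmost starting position that can produce a match wins, and a lazy quantifier only minimizes what is consumed from that starting position. The three-anchor sample, the variable names, and the final "same tag" pattern are assumptions added for illustration; they are not part of the original test case.</p>

```perl
use strict;
use warnings;

# Shortened stand-in for the question's $content (hypothetical sample data).
my $content = <<'EOF';
<a href="/foo/foo" class="foo">foo</a>
<a href="/lol/lol" class="lol">lol</a>
<a href="/koo/koo" class="koo">koo</a>
EOF

# Shape of the 3rd test: the engine starts at the first 'href="' in the string.
# The capture stays short, but the lazy '.*?' after it is still allowed to grow
# one character at a time until '>lol' is reached, so this start position succeeds.
my ($third) = $content =~ m~href="(.*?)".*?>lol~s;
my $third_start = $-[0];    # offset at which the overall match began

# Shape of the 4th test: the leading greedy '.*' first consumes everything and
# then backtracks, so it settles on the last 'href="' that still allows '>lol'.
my ($fourth) = $content =~ m~.*href="(.*?)".*?>lol~s;

# One possible alternative (an assumption, not from the question): stop the gap
# from crossing a '>' so the href has to sit in the same tag as '>lol'.
my ($same_tag) = $content =~ m~href="([^"]*)"[^>]*>lol~;

print "3rd shape captured: $third (match starts at offset $third_start)\n";
print "4th shape captured: $fourth\n";
print "same-tag variant:   $same_tag\n";
```

<p>On this sample the 3rd shape captures <code>/foo/foo</code>, because the match begins at the first anchor, while the 4th shape and the same-tag variant both capture <code>/lol/lol</code>.</p>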