Greedy vs. Reluctant vs. Possessive Quantifiers
<p>I found this <a href="http://download.oracle.com/javase/tutorial/essential/regex/quant.html" rel="noreferrer">excellent tutorial</a> on regular expressions, and while I intuitively understand what "greedy", "reluctant" and "possessive" quantifiers do, there seems to be a serious hole in my understanding.</p>
<p>Specifically, in the following example:</p>
<pre><code>Enter your regex: .*foo // greedy quantifier
Enter input string to search: xfooxxxxxxfoo
I found the text "xfooxxxxxxfoo" starting at index 0 and ending at index 13.

Enter your regex: .*?foo // reluctant quantifier
Enter input string to search: xfooxxxxxxfoo
I found the text "xfoo" starting at index 0 and ending at index 4.
I found the text "xxxxxxfoo" starting at index 4 and ending at index 13.

Enter your regex: .*+foo // possessive quantifier
Enter input string to search: xfooxxxxxxfoo
No match found.
</code></pre>
<p>The explanation mentions <strong>eating</strong> the entire input string, letters being <strong>consumed</strong>, the matcher <strong>backing off</strong>, the rightmost occurrence of "foo" being <strong>regurgitated</strong>, etc.</p>
<p>Unfortunately, despite the nice metaphors, I still don't understand what is eaten by whom... Do you know of another tutorial that explains (concisely) <em>how</em> regular expression engines work?</p>
<p>Alternatively, if someone can explain the following paragraph in somewhat different phrasing, that would be much appreciated:</p>
<blockquote> <p>The first example uses the greedy quantifier .* to find "anything", zero or more times, followed by the letters "f" "o" "o". Because the quantifier is greedy, the .* portion of the expression first eats the entire input string. At this point, the overall expression cannot succeed, because the last three letters ("f" "o" "o") have already been consumed (<strong>by whom?</strong>).
So the matcher slowly backs off (<strong>from right-to-left?</strong>) one letter at a time until the rightmost occurrence of "foo" has been regurgitated (<strong>what does this mean?</strong>), at which point the match succeeds and the search ends.</p> <p>The second example, however, is reluctant, so it starts by first consuming (<strong>by whom?</strong>) "nothing". Because "foo" doesn't appear at the beginning of the string, it's forced to swallow (<strong>who</strong> swallows?) the first letter (an "x"), which triggers the first match at 0 and 4. Our test harness continues the process until the input string is exhausted. It finds another match at 4 and 13.</p> <p>The third example fails to find a match because the quantifier is possessive. In this case, the entire input string is consumed by .*+, (<strong>how?</strong>) leaving nothing left over to satisfy the "foo" at the end of the expression. Use a possessive quantifier for situations where you want to seize all of something without ever backing off (<strong>what does back off mean?</strong>); it will outperform the equivalent greedy quantifier in cases where the match is not immediately found.</p> </blockquote>
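<p>For reference, the three results above can be reproduced outside the tutorial's test harness with a short sketch like the one below. It uses only <code>java.util.regex.Pattern</code> and <code>Matcher</code>; the class name <code>QuantifierDemo</code> and the helper <code>findAll</code> are names made up for this illustration and are not part of the tutorial.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not taken from the tutorial.
public class QuantifierDemo {

    // Collects every match the matcher reports, formatted as "text[start,end)".
    static List<String> findAll(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(m.group() + "[" + m.start() + "," + m.end() + ")");
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";

        // Greedy: .* takes the whole input, then gives characters back until "foo" fits.
        System.out.println(findAll(".*foo", input));   // [xfooxxxxxxfoo[0,13)]

        // Reluctant: .*? takes nothing at first and grows only when "foo" does not fit yet.
        System.out.println(findAll(".*?foo", input));  // [xfoo[0,4), xxxxxxfoo[4,13)]

        // Possessive: .*+ takes the whole input and never gives anything back.
        System.out.println(findAll(".*+foo", input));  // []
    }
}
```

<p>The start and end indices printed here are the same ones the tutorial's harness reports ("starting at index 0 and ending at index 13", and so on).</p>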