<p>If you can post the email that's causing the problem (perhaps anonymized in some way), that will give us more information, but I'm <em>thinking</em> the problem is this little guy right here:</p>
<pre><code>([-.]\\w+)*\\.\\w+([-.]\\w+)*
</code></pre>
<p>To understand the problem, let's break that into groups:</p>
<pre><code>([-.]\\w+)*
\\.\\w+
([-.]\\w+)*
</code></pre>
<p>The strings that will match <code>\\.\\w+</code> are a <strong>subset</strong> of those that will match <code>[-.]\\w+</code>. So if part of your input looks like <code>foo.bar.baz.blah.yadda.com</code>, your regex engine has no way of knowing which group is supposed to match each dot-plus-word piece. Does that make sense? So the first <code>([-.]\\w+)*</code> could match <code>.bar.baz.blah</code>, then the <code>\\.\\w+</code> could match <code>.yadda</code>, then the last <code>([-.]\\w+)*</code> could match <code>.com</code>...</p>
<p>...<strong>OR</strong> the first clause could match <code>.bar.baz</code>, the second could match <code>.blah</code>, and the last could match <code>.yadda.com</code>. Since it doesn't know which one is right, whenever the rest of the pattern fails to match it will back up and try a different combination. It will stop eventually, but that can still take a long time. This is called "catastrophic backtracking".</p>
<p>This issue is compounded by the fact that you're using <a href="http://www.regular-expressions.info/brackets.html" rel="nofollow">capturing groups</a> rather than non-capturing groups; i.e. <code>([-+.]\\w+)</code> instead of <code>(?:[-+.]\\w+)</code>. That causes the engine to separate and save whatever matches inside the parentheses for later reference. But as I explained above, it's ambiguous which group each substring belongs in.</p>
<p>You might consider replacing everything after the @ with something like this:</p>
<pre><code>\\w[-\\w]*\\.[-.\\w]+
</code></pre>
<p>That could use some refinement to make it more specific, but you get the general idea.
Hope I explained all this well enough; grouping and backreferences are kind of tough to describe.</p>
<h3>EDIT:</h3>
<p>Looking back at your pattern, there's a deeper trap nearby, still related to the backtracking/ambiguity problem I mentioned. The clause <code>\\w+([-.]\\w+)*</code> is safe as written, but only because every repetition of the group has to start with a <code>-</code> or <code>.</code>, which <code>\\w</code> can never match. Make that separator optional, as in <code>\\w+([-.]?\\w+)*</code>, and the clause becomes ambiguous all by itself. Splitting it into parts, we have:</p>
<pre><code>\\w+
([-.]?\\w+)*
</code></pre>
<p>Suppose you have a string like <code>foobar</code>. Where does the <code>\\w+</code> end and the <code>([-.]?\\w+)*</code> begin? How many repetitions of the group are there? Any of the following could work as matches:</p>
<pre><code>f(oobar)
foo(bar)
f(o)(oba)(r)
f(o)(o)(b)(a)(r)
foobar
etc...
</code></pre>
<p>The regex engine doesn't know which is important, so when the overall match fails it will try them all. This is the same kind of problem I pointed out above, only much worse, and it is easy to introduce in several places at once when a pattern like yours gets loosened.</p>
<p>The repeated group <code>([-.]?\\w+)*</code> is ambiguous on its own too, because of the <code>+</code> after the <code>\\w</code> inside a group that is itself repeated. How many groups are there in <code>blah</code>? I count 8 possible combinations: <code>(blah)</code>, <code>(b)(lah)</code>, <code>(bl)(ah)</code>...</p>
<p>The number of possible combinations doubles with every extra character, so it gets huge even for a relatively small input, and your engine is going to be in overdrive. Your pattern as posted only has the overlap between <code>\\.\\w+</code> and the <code>([-.]\\w+)*</code> groups around it, but that is already enough to cause a lot of backtracking. I would definitely simplify it if I were you.</p>
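<p>To make the ambiguity in the three-clause breakdown concrete, here is a minimal Python sketch that lists every way the clauses can carve up the sample input. The helper name <code>splits</code> is hypothetical and for illustration only. The patterns are written as Python raw strings, so each backslash appears once, and the groups are non-capturing since nothing is captured here. It only illustrates the overlap; it does not reproduce how any particular regex engine actually backtracks.</p>

```python
import re

# The three clauses from the breakdown, as Python raw strings.
GROUP = r"(?:[-.]\w+)*"  # repeated "separator + word" clause (non-capturing here)
DOT = r"\.\w+"           # the single "literal dot + word" clause


def splits(text):
    """Return every (first, middle, last) split of text that the
    clause sequence GROUP, DOT, GROUP accepts as a whole-string match."""
    found = []
    for i in range(len(text) + 1):
        for j in range(i, len(text) + 1):
            first, middle, last = text[:i], text[i:j], text[j:]
            if (re.fullmatch(GROUP, first)
                    and re.fullmatch(DOT, middle)
                    and re.fullmatch(GROUP, last)):
                found.append((first, middle, last))
    return found


if __name__ == "__main__":
    # The part of foo.bar.baz.blah.yadda.com that follows "foo".
    for first, middle, last in splits(".bar.baz.blah.yadda.com"):
        print(repr(first), repr(middle), repr(last))
```

<p>Each printed row is a different assignment of the same text to the three clauses, including the two described in the answer. An engine that fails later in the pattern may have to work through all of them before giving up.</p>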