Stuck with Regular Expression code to apply HTML tag to text but exclude if inside <?> tag
<blockquote> <p><strong>Possible Duplicate:</strong><br> <a href="https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags">RegEx match open tags except XHTML self-contained tags</a> </p> </blockquote>
<p>I'm trying to write a bit of regex that goes through some text written by our editors and applies an <code>&lt;acronym&gt;</code> tag to the first instance it finds of any abbreviation we hold in our "Glossary of Terms".</p>
<p>For this example I've used the abbreviation <code>ITS</code>.</p>
<p>The first thing I thought I'd do is set up an example with a mix of scenarios I could test against, i.e. <code>ITS</code> sitting next to punctuation, inside HTML tags, and ones that we've applied this to already (in other words, the script has run over this text before, so there is no need to tag them again).</p>
<p>I'm almost there but got stuck at the last point :-(.</p>
<p>Here's the regex I've got so far: <code>&lt;[^&lt;|]+?&gt;?&gt;ITS&lt;[^&lt;]+?&gt;|ITS</code></p>
<p>The example - FROM (EVERY <em>ITS</em> IN <strong>BOLD</strong> TO BE WRAPPED WITH ACRONYM):</p>
<blockquote> <p><code>I want you to tag this </code><strong>ITS</strong><code>, but not this wrapped one - &lt;acronym title="ITS" id="thisIsATest"&gt;ITS&lt;/acronym&gt;</code></p> <p>This is another test as I still want to update <code>&lt;p&gt;</code><strong>ITS</strong><code>&lt;/p&gt;</code> that have other HTML tags wrapped around them.</p> <p><strong>ITS</strong> want ones that start sentences and ones that finish <strong>ITS</strong>. <strong>ITS</strong>, and ones which are wrapped in punctuation.</p> <p><code>Test link:</code> <code>&lt;a href="index.cfm&gt;ITS&lt;/a&gt;</code></p> </blockquote>
<hr>
<p>AND I WANT THIS CHANGED TO:</p>
<blockquote> <p><code>I want you to tag this &lt;acronym title="ITS"&gt;ITS&lt;/acronym&gt;</code>, but not this wrapped one - <code>&lt;acronym title="ITS"&gt;ITS&lt;/acronym&gt;</code></p> <p><code>This is another test as I still want to update &lt;acronym title="ITS"&gt;ITS&lt;/acronym&gt;</code> that have other HTML tags wrapped around them.</p> <p><code>&lt;acronym title="ITS"&gt;ITS&lt;/acronym&gt; want ones that start sentences and ones that finish &lt;acronym title="ITS"&gt;ITS&lt;/acronym&gt;. &lt;acronym title="ITS"&gt;ITS&lt;/acronym&gt;, and ones which are wrapped in punctuation.</code></p> <p><code>Test link:</code> <code>&lt;acronym title="ITS"&gt;&lt;a href="index.cfm&gt;ITS&lt;/a&gt;&lt;/acronym&gt;</code></p> </blockquote>
<hr>
<p>Are there any regex experts out there who could help me finish this off? Any other hints or tips would also be appreciated.</p>
<p>** UPDATE ** I don't know if this helps, but this finds only the already-wrapped one in that paragraph:</p>
<p><code>&lt;acronym[^&lt;]*ITS&lt;/acronym&gt;</code></p>
<p>and this finds every ITS:</p>
<p><code>&lt;[^&lt;]*&gt;ITS&lt;[^&lt;]*&gt;|ITS</code></p>
<p>What I really need is a way of combining these to say: find every ITS, but exclude those already wrapped in <code>&lt;acronym&gt;</code> tags.</p>
<p>Thanks a lot, James</p>
<p>P.S. This is going to be placed in a ColdFusion application, in case that matters for the specific syntax.</p>
<hr>
<p>Here's the HTML I'm trying to parse:</p>
<p><a href="http://pastebin.com/5k32aG8i" rel="nofollow noreferrer">http://pastebin.com/5k32aG8i</a></p>
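<p>One possible way to express "find every ITS but skip the ones already wrapped" is sketched below. It is only an illustration under stated assumptions, not a ColdFusion solution: it is written in Python, the function name <code>wrap_abbreviation</code> is hypothetical, and it swaps the single combined regex for a split-on-tags pass that tracks whether the current text sits inside an <code>&lt;acronym&gt;</code> element. It also wraps the <code>ITS</code> inside a link rather than wrapping the whole link, which differs from the "Test link" line in the desired output.</p>

```python
import re

# Crude tag tokenizer: anything from "<" to the next ">".
# Assumption: attribute values never contain ">"; this is not a full HTML parser.
TAG_RE = re.compile(r"(<[^>]+>)")


def wrap_abbreviation(html, abbr):
    """Wrap standalone occurrences of abbr in <acronym>, leaving tag markup
    and text already inside an <acronym> element untouched."""
    word_re = re.compile(r"\b" + re.escape(abbr) + r"\b")
    replacement = '<acronym title="{0}">{0}</acronym>'.format(abbr)
    depth = 0  # how many <acronym> elements are currently open
    out = []
    # re.split with one capture group alternates: text, tag, text, tag, ...
    for index, token in enumerate(TAG_RE.split(html)):
        if index % 2 == 1:  # odd positions are tags
            lowered = token.lower()
            if re.match(r"<acronym[\s>]", lowered):
                depth += 1
            elif lowered.startswith("</acronym"):
                depth = max(0, depth - 1)
            out.append(token)
        elif depth == 0:
            # Plain text outside any <acronym>: safe to wrap.
            out.append(word_re.sub(lambda match: replacement, token))
        else:
            # Text inside an existing <acronym>: leave as-is.
            out.append(token)
    return "".join(out)


sample = ('Tag this ITS, but not '
          '<acronym title="ITS" id="thisIsATest">ITS</acronym>. <p>ITS</p>')
result = wrap_abbreviation(sample, "ITS")
print(result)
```

<p>Because text inside an existing <code>&lt;acronym&gt;</code> is skipped, running the function a second time over its own output should leave it unchanged, which covers the "script has run through this before" case. Limiting the change to the first instance only would need an extra counter that this sketch does not include.</p>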
 
