Extracting email addresses in an HTML block in Ruby/Rails
<p>I am creating a parser that guards against spamming and email harvesting in a block of text that comes from tinyMCE (so it may or may not contain HTML tags).</p>
<p>I've tried regexes, and so far this one has been successful:</p>
<pre><code>/\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/i
</code></pre>
<p>The problem is that I need to ignore all email addresses inside mailto hrefs. For example:</p>
<pre><code>&lt;a href="mailto:test@mail.com"&gt;test@mail.com&lt;/a&gt;
</code></pre>
<p>should only return the second email address.</p>
<p>For background on what I'm doing: I'm reversing the email addresses in a block, so the above example would look like this:</p>
<pre><code>&lt;a href="mailto:test@mail.com"&gt;moc.liam@tset&lt;/a&gt;
</code></pre>
<p>The problem with my current regex is that it also replaces the address in the href. Is there a way to do this with a single regex, or do I have to check for one and then the other? Can I do this using gsub alone, or do I have to use Nokogiri/Hpricot to parse the mailtos? Thanks in advance!</p>
<p>Here are my references:</p>
<p>so.com/questions/504860/extract-email-addresses-from-a-block-of-text</p>
<p>so.com/questions/1376149/regexp-for-extracting-a-mailto-address</p>
<p>I'm also testing with this:</p>
<p><a href="http://rubular.com/" rel="nofollow noreferrer">http://rubular.com/</a></p>
<p><em>edit</em></p>
<p>Here's my current helper code:</p>
<pre><code>def email_obfuscator(text)
  text.gsub(/\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/i) { |m| "&lt;span class='anti-spam'&gt;#{m.reverse}&lt;/span&gt;" }
end
</code></pre>
<p>which results in this:</p>
<pre><code>&lt;a target="_self" href="mailto:&lt;span class='anti-spam'&gt;moc.liamg@tset&lt;/span&gt;"&gt;&lt;span class="anti-spam"&gt;moc.liamg@tset&lt;/span&gt;&lt;/a&gt;
</code></pre>
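One possible way to do this with a single gsub, sketched here as an assumption rather than a confirmed answer: let the regex also match the whole "mailto:address" form as its first alternative, and have the block return that match untouched. Because the mailto alternative consumes the address, the bare-address alternative never sees it. The constant names EMAIL and EMAIL_OR_MAILTO are made up for this illustration; the address pattern and the span markup are taken from the helper in the question.

```ruby
# Address pattern from the question, without the surrounding \b anchors.
EMAIL = /[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}/i

# Alternative 1 (captured): a whole "mailto:address", which is left alone.
# Alternative 2: a bare address, which gets reversed and wrapped.
EMAIL_OR_MAILTO = /(mailto:#{EMAIL})|\b#{EMAIL}\b/i

def email_obfuscator(text)
  text.gsub(EMAIL_OR_MAILTO) do |match|
    if Regexp.last_match(1)
      # Matched the mailto form: return it unchanged.
      match
    else
      "<span class='anti-spam'>#{match.reverse}</span>"
    end
  end
end

puts email_obfuscator('<a href="mailto:test@mail.com">test@mail.com</a>')
# prints: <a href="mailto:test@mail.com"><span class='anti-spam'>moc.liam@tset</span></a>
```

This only protects addresses that directly follow "mailto:". An address inside some other attribute (a title, for instance) would still be rewritten, so for arbitrary tinyMCE output an HTML parser such as Nokogiri that visits only text nodes is likely the more robust choice.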
 
