<p>This is quite a difficult problem to solve. You need to determine whether regular expressions will work for you and how you will handle embedding (when a dictionary word is added to a profanity, like frackface except with the real F-word).</p>
<p>Regular expressions generally have a limit on how long they can be, and this usually prevents you from using a single regex for all your words. Executing many regular expressions against each string can be very slow, depending on the performance you need and how big your blacklist gets. We initially implemented <a href="http://www.inversoft.com/features/profanity-filter/" rel="nofollow">CleanSpeak</a> as a regular expression system, but it didn't scale, so we rewrote it using a different mechanism.</p>
<p>You also need to consider phrases, punctuation, spaces, leet-speak and other languages. All of these make regular expressions less appealing as a solution. Here are some examples using the word hello (assume it is profanity for this exercise):</p>
<ul>
<li>h e l l o</li>
<li>h.e.l.l.o</li>
<li>h_e_l_l_o</li>
<li>|-|ello</li>
<li>h3llo</li>
<li>"hello there" (a phrase might not contain any profane words, yet the words combined are profane)</li>
</ul>
<p>You also need to handle edge cases where two or more dictionary (whitelist) words contain a profanity when they appear next to each other. Some examples that contain the s-word:</p>
<ul>
<li>bash it</li>
<li>ssh it's quiet time</li>
</ul>
<p>These are obviously not profanity, but most homegrown and many commercial solutions have problems with these cases.</p>
<p>We have spent the last 3 years perfecting the filter used by <a href="http://www.inversoft.com/features/profanity-filter/" rel="nofollow">CleanSpeak</a> to ensure it handles all of these cases, and we continue to tweak it and make it better. We also spent 8 months tuning our system for performance, and it can handle about 5,000 messages per second. That is not to say you can't build something usable, but be prepared to handle the many issues that come up, and to build a system that doesn't rely on regular expressions.</p>
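<p>To make the cases above concrete, here is a minimal Python sketch of one possible approach: normalize each whitespace-separated token (undo a few leet-speak substitutions, drop separator characters), glue runs of single letters back together, and then compare whole tokens against the blacklist. It is not how CleanSpeak works. The names <code>BLACKLIST</code>, <code>LEET</code>, <code>normalize_token</code>, <code>candidate_words</code> and <code>contains_profanity</code> are made up for illustration, the substitution table is a tiny assumed sample, and "hello" again stands in for a real profanity.</p>

```python
import re

# Hypothetical blacklist; "hello" stands in for a real profanity, as in the text.
BLACKLIST = {"hello"}

# Assumed sample of leet-speak substitutions; a real table would be far larger.
LEET = [("|-|", "h"), ("3", "e"), ("1", "l"), ("0", "o"), ("@", "a"), ("$", "s")]


def normalize_token(token):
    """Lower-case a token, undo leet substitutions, drop non-letter characters."""
    token = token.lower()
    for leet, plain in LEET:
        token = token.replace(leet, plain)
    # Removes separators used to break a word up, e.g. h.e.l.l.o or h_e_l_l_o.
    return re.sub(r"[^a-z]", "", token)


def candidate_words(text):
    """Return normalized words, joining runs of single letters ("h e l l o")."""
    words = []
    letters = []  # pending run of one-letter tokens
    for raw in text.split():
        token = normalize_token(raw)
        if len(token) == 1:
            letters.append(token)
            continue
        if letters:
            words.append("".join(letters))
            letters = []
        if token:
            words.append(token)
    if letters:
        words.append("".join(letters))
    return words


def contains_profanity(text):
    """True if any whole normalized word is on the blacklist."""
    return any(word in BLACKLIST for word in candidate_words(text))


if __name__ == "__main__":
    for sample in ["h e l l o", "h.e.l.l.o", "h_e_l_l_o", "|-|ello", "h3llo",
                   "shell only"]:
        print(repr(sample), contains_profanity(sample))
```

<p>Because only whole tokens are compared, "shell only" is not flagged even though its letters run together to contain "hello", which is the same situation as the "bash it" example. The trade-off is that this sketch misses embedded words (the frackface case) and profane phrases entirely, and the single-letter joining can produce false positives of its own. Those gaps are exactly the kind of issue described above.</p>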