How safe is it to accept a pre-defined set of non-harmful HTML tags from a request?
<p>One of the first things I learned as a web developer was to never ever accept any HTML from the client. (Except perhaps if I HTML-encode it.)<br> I use a WYSIWYG editor (TinyMCE) that outputs HTML. So far I have only used it on an admin page, but now I'd like to also use it on a forum. It has a BBCode module, but that seems to be incomplete. (It is possible that BBCode itself doesn't support everything I want it to.)</p>
<p>So, here's my idea:</p>
<p>I allow the client to directly POST some HTML code. Then, I check the code for sanity (<strong>well-formedness</strong>) and remove all tags, attributes, and CSS rules that are not allowed based on a pre-defined set of allowed tags and styles.<br> Obviously I would allow whatever can be output by the subset of TinyMCE functionality I use.</p>
<p>I would allow the following tags:<br> <code>span</code>, <code>sub</code>, <code>sup</code>, <code>a</code>, <code>p</code>, <code>ul</code>, <code>ol</code>, <code>li</code>, <code>img</code>, <code>strong</code>, <code>em</code>, <code>br</code></p>
<p>With the following attributes:<br> <code>style</code> (for everything), <code>href</code> and <code>title</code> (for <code>a</code>), <code>alt</code> and <code>src</code> (for <code>img</code>)</p>
<p>And the following CSS rules:<br> <code>color</code>, <code>font</code>, <code>font-size</code>, <code>font-weight</code>, <code>font-style</code>, <code>text-decoration</code></p>
<p>These cover everything that I need for formatting, and (as far as I know) don't present any security risk. Basically, enforcing well-formedness and leaving out any layout-related styles prevents anyone from breaking the layout of the site. Disallowing the script tag and the like prevents XSS.<br> (One exception: maybe I should allow <code>width</code>/<code>height</code> in a predefined range for images.)</p>
<p><strong>Other advantage:</strong> this would save me from having to write or find a BBCode-to-HTML converter.</p>
<p>What do you think?<br> Is this a secure thing to do?</p>
<p>(As far as I can see, StackOverflow also allows some basic HTML in the "About Me" field, so I think I'm not the first one to implement this.)</p>
<p><strong>EDIT:</strong></p>
<p>I found <a href="https://stackoverflow.com/questions/3452322/method-to-strip-html-tags-not-in-a-safe-list">this answer</a> which explains how to do this fairly easily.<br> And of course, <a href="https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454">no one should think about using regex for this</a>.</p>
<p>The question itself is not related to any language or technology, but if you are wondering, I am writing this application in ASP.NET.</p>
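<p>The allow-list idea described in the question can be sketched roughly as follows. This is only an illustration in Python using the standard-library <code>html.parser</code>, not the ASP.NET code the application would actually need. All names here (<code>Sanitizer</code>, <code>sanitize</code>, <code>clean_style</code>, <code>is_safe_url</code>) are hypothetical. Two checks are assumptions that go beyond the lists in the question: <code>href</code>/<code>src</code> values are limited to http, https, or relative URLs, and CSS values containing parentheses or backslashes are dropped. A maintained, well-tested sanitizer library is likely a safer choice in production than a hand-written one like this.</p>

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Allow-lists taken from the question.
ALLOWED_TAGS = {"span", "sub", "sup", "a", "p", "ul", "ol", "li",
                "img", "strong", "em", "br"}
VOID_TAGS = {"img", "br"}                       # elements without a closing tag
ALLOWED_ATTRS = {"a": {"href", "title"}, "img": {"alt", "src"}}
ALLOWED_CSS = {"color", "font", "font-size", "font-weight",
               "font-style", "text-decoration"}
# Assumptions added for this sketch (not part of the question's lists).
URL_ATTRS = {"href", "src"}
SAFE_SCHEMES = {"http", "https", ""}            # "" means a relative URL
DROP_CONTENT = {"script", "style"}              # discard the text inside these


def clean_style(value):
    """Keep only allow-listed CSS declarations with simple values."""
    kept = []
    for decl in value.split(";"):
        prop, sep, val = decl.partition(":")
        prop, val = prop.strip().lower(), val.strip()
        # Parentheses/backslashes would permit url(), expression(), escapes.
        if sep and prop in ALLOWED_CSS and val and not any(c in val for c in "()\\"):
            kept.append(f"{prop}: {val}")
    return "; ".join(kept)


def is_safe_url(value):
    """Accept only http(s) or relative URLs; rejects e.g. javascript: URLs."""
    cleaned = "".join(ch for ch in value if ch > " ")  # drop whitespace/control chars
    try:
        return urlsplit(cleaned).scheme.lower() in SAFE_SCHEMES
    except ValueError:
        return False


class Sanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []          # output fragments
        self.open_tags = []    # stack of allowed tags still open
        self.skip = 0          # > 0 while inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip += 1
            return
        if tag not in ALLOWED_TAGS:
            return             # drop the tag; its text is still kept (escaped)
        parts = [tag]
        for name, value in attrs:
            value = value or ""
            if name == "style":
                value = clean_style(value)
                if not value:
                    continue
            elif name not in ALLOWED_ATTRS.get(tag, ()):
                continue
            elif name in URL_ATTRS and not is_safe_url(value):
                continue
            parts.append(f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join(parts) + ">")
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip = max(0, self.skip - 1)
        elif tag in self.open_tags:
            # Close anything nested inside first, so output stays well-formed.
            while self.open_tags:
                top = self.open_tags.pop()
                self.out.append(f"</{top}>")
                if top == tag:
                    break

    def handle_data(self, data):
        if not self.skip:
            self.out.append(escape(data, quote=False))

    def result(self):
        self.close()
        while self.open_tags:  # close whatever the client left open
            self.out.append(f"</{self.open_tags.pop()}>")
        return "".join(self.out)


def sanitize(html_text):
    parser = Sanitizer()
    parser.feed(html_text)
    return parser.result()


dirty = ('<p style="color: red; position: absolute">Hi <script>alert(1)</script>'
         '<a href="javascript:alert(1)" onclick="x()" title="t">link</a><b>bold</b></p>')
print(sanitize(dirty))
# → <p style="color: red">Hi <a title="t">link</a>bold</p>
```

<p>The sketch rebuilds the output from parsed events instead of editing the input string, so anything that is not explicitly allowed never reaches the page, and unclosed tags are closed at the end to keep the result well-formed.</p>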
 
