  1. Is a DOM Text Node guaranteed to not be interpreted as HTML?
<p>Does anyone know whether a DOM <code>Node</code> of type <code>Text</code> is guaranteed not to be interpreted as HTML by the browser?</p>
<p>More details follow.</p>
<p><strong>Background</strong></p>
<p>I'm building a simple web comment system for a friend, and I've been thinking about XSS attacks. I don't think filtering or escaping HTML tags is a very elegant solution--it's too easy to come up with a convoluted input that will slip past the filter. The fundamental issue is that I want to guarantee that, for certain pieces of content (i.e. the content that random unauthenticated web users POST), the browser <em>never</em> tries to interpret or run the content.</p>
<p><strong>A plain(text) start</strong></p>
<p>The first thought that came to mind is just to use <code>Content-Type: text/plain</code>, but this has to apply to a whole page. You can put a plaintext <code>IFRAME</code> in the middle of a page, but it's ugly, and it creates focus problems if the user clicks into the frame.</p>
<p><strong>innerText/textContent/jQuery</strong></p>
<p>It turns out that there are some browser-specific properties (<code>innerText</code> in IE, <code>textContent</code> in Firefox, Safari, etc.) that, when set, are required to create a single <code>Text</code> node.</p>
<p>jQuery sidesteps the browser differences by implementing a single function <code>text(val)</code> that skips the browser-specific properties and goes directly to <code>document.createTextNode(text)</code>, which, as you can guess, creates a <code>Text</code> node.</p>
<p><strong>W3C DOM <code>Text</code> <code>Node</code>s</strong></p>
<p>So I think this is close to what I want. It looks good: <code>Text</code> nodes can't have children, and it appears that they can't be interpreted as HTML. But I am not 100% sure from the official docs. 
</p>
<ul> <li>Interface <code>Node</code>: <a href="http://www.w3.org/TR/DOM-Level-3-Core/core.html#ID-1950641247" rel="noreferrer">http://www.w3.org/TR/DOM-Level-3-Core/core.html#ID-1950641247</a></li> <li>Interface <code>Text</code>: <a href="http://www.w3.org/TR/DOM-Level-3-Core/core.html#ID-1312295772" rel="noreferrer">http://www.w3.org/TR/DOM-Level-3-Core/core.html#ID-1312295772</a></li> <li><code>textContent</code>: <a href="http://www.w3.org/TR/DOM-Level-3-Core/core.html#Node3-textContent" rel="noreferrer">http://www.w3.org/TR/DOM-Level-3-Core/core.html#Node3-textContent</a></li> </ul>
<p>The part from <code>textContent</code> is particularly encouraging, because it says "on setting, no parsing is performed either, the input string is taken as pure textual content." But is this fundamental to all <code>Text</code> nodes, or only to nodes on which you set <code>textContent</code>? This probably seems like a dumb quibble, but it might be important because <em>IE doesn't support <code>textContent</code></em> (see above).</p>
<p><strong>Back around to the initial question</strong></p>
<p>Can anyone confirm or reject that this will work? That is, that a W3C DOM-compliant browser will <em>never</em> interpret a <code>Text</code> node as HTML, no matter what the content? I'd be extremely grateful to have this tormenting little uncertainty resolved.</p>
<p>Thank you for your time!</p>
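<p>For concreteness, here is a minimal sketch of the approach described above. The helper name <code>appendAsText</code> and the sample payload are hypothetical, made up for illustration. It assumes the container is an ordinary element such as a <code>div</code>, not a <code>script</code> or <code>style</code> element, where text content has its own meaning.</p>

```javascript
// Hypothetical helper (not from any library): appends untrusted text to a
// container as a single Text node, so the string is stored as character data
// rather than being handed to the HTML parser.
// `doc` is passed in explicitly so the function can be exercised outside a browser.
function appendAsText(doc, container, untrustedText) {
  // createTextNode takes the string as-is; no markup parsing happens here.
  const node = doc.createTextNode(String(untrustedText));
  container.appendChild(node);
  return node;
}

// Browser-only usage, guarded so the snippet also loads where no DOM exists.
if (typeof document !== "undefined") {
  const box = document.createElement("div");
  appendAsText(document, box, "<img src=x onerror=alert(1)>");
  // Inspect how many child nodes and how many element children the div has.
  console.log(box.childNodes.length, box.children.length);
  // Inspect how the div serializes the text when read back as HTML.
  console.log(box.innerHTML);
}
```

<p>If the assumption holds, the <code>div</code> should end up with one <code>Text</code> child and no element children, which is the behavior the question asks to have confirmed.</p>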
 
