StackOverflow2013

plurals

PO
primarykey
Id
7539198
data
AcceptedAnswerId
0
AnswerCount
0
ClosedDate
CommentCount
18
CommunityOwnedDate
CreationDate
2011-09-24T13:04:42.537
FavoriteCount
0
LastActivityDate
2011-10-23T15:47:59.093
LastEditDate
2017-05-23T12:32:23.620
LastEditorUserId
-1
OwnerUserId
938089
ParentId
7474710
PostTypeId
2
Score
78
ViewCount
0
LastEditorDisplayName
text
Body
<p><strong>Fiddle</strong>: <a href="http://jsfiddle.net/JFSKe/6/" rel="nofollow noreferrer">http://jsfiddle.net/JFSKe/6/</a></p>
<p><strong><a href="https://developer.mozilla.org/En/DOM/DocumentFragment" rel="nofollow noreferrer"><code>DocumentFragment</code></a></strong> doesn't implement DOM methods. Using <code>document.createElement</code> in conjunction with <code>innerHTML</code> removes the <code><head></code> and <code><body></code> tags (even when the created element is a root element, <code><html></code>). Therefore, the solution should be sought elsewhere. I have created a <strong>cross-browser</strong> string-to-DOM function, which makes use of an invisible inline-frame.</p>
<p>All external resources and scripts will be disabled. See <em>Explanation of the code</em> for more information.</p>
<h3>Code</h3>
<pre><code>/*
 @param String html    The string with HTML which has to be converted to a DOM object
 @param func callback  (optional) Callback(HTMLDocument doc, function destroy)
 @returns              undefined if callback exists, else: Object
                        HTMLDocument doc  DOM fetched from Parameter:html
                        function destroy  Removes HTMLDocument doc.         */
function string2dom(html, callback){
    /* Sanitise the string */
    html = sanitiseHTML(html); /*Defined at the bottom of the answer*/

    /* Create an IFrame */
    var iframe = document.createElement("iframe");
    iframe.style.display = "none";
    document.body.appendChild(iframe);

    var doc = iframe.contentDocument || iframe.contentWindow.document;
    doc.open();
    doc.write(html);
    doc.close();

    function destroy(){
        iframe.parentNode.removeChild(iframe);
    }
    if(callback) callback(doc, destroy);
    else return {"doc": doc, "destroy": destroy};
}

/* @name sanitiseHTML
   @param String html  A string representing HTML code
   @return String      A new string, fully stripped of external resources.
                       All "external" attributes (href, src) are prefixed by data- */
function sanitiseHTML(html){
    /* Adds a <!-\"'--> before every matched tag, so that unterminated quotes
        aren't preventing the browser from splitting a tag. Test case:
       '<input style="foo;b:url(0);><input onclick="<input type=button onclick="too() href=;>">' */
    var prefix = "<!--\"'-->";
    /*Attributes should not be prefixed by these characters. This list is not
     complete, but will be sufficient for this function.
      (see http://www.w3.org/TR/REC-xml/#NT-NameChar) */
    var att = "[^-a-z0-9:._]";
    var tag = "<[a-z]";
    var any = "(?:[^<>\"']*(?:\"[^\"]*\"|'[^']*'))*?[^<>]*";
    var etag = "(?:>|(?=<))";

    /*
      @name ae
      @description          Converts a given string in a sequence of the
                             original input and the HTML entity
      @param String string  String to convert
      */
    var entityEnd = "(?:;|(?!\\d))";
    var ents = {" ":"(?:\\s|&nbsp;?|&#0*32"+entityEnd+"|&#x0*20"+entityEnd+")",
                "(":"(?:\\(|&#0*40"+entityEnd+"|&#x0*28"+entityEnd+")",
                ")":"(?:\\)|&#0*41"+entityEnd+"|&#x0*29"+entityEnd+")",
                ".":"(?:\\.|&#0*46"+entityEnd+"|&#x0*2e"+entityEnd+")"};
                /*Placeholder to avoid tricky filter-circumventing methods*/
    var charMap = {};
    var s = ents[" "]+"*"; /* Short-hand space */
    /* Important: Must be pre- and postfixed by < and >. RE matches a whole tag! */
    function ae(string){
        var all_chars_lowercase = string.toLowerCase();
        if(ents[string]) return ents[string];
        var all_chars_uppercase = string.toUpperCase();
        var RE_res = "";
        for(var i=0; i<string.length; i++){
            var char_lowercase = all_chars_lowercase.charAt(i);
            if(charMap[char_lowercase]){
                RE_res += charMap[char_lowercase];
                continue;
            }
            var char_uppercase = all_chars_uppercase.charAt(i);
            var RE_sub = [char_lowercase];
            RE_sub.push("&#0*" + char_lowercase.charCodeAt(0) + entityEnd);
            RE_sub.push("&#x0*" + char_lowercase.charCodeAt(0).toString(16) + entityEnd);
            if(char_lowercase != char_uppercase){
                RE_sub.push("&#0*" + char_uppercase.charCodeAt(0) + entityEnd);
                RE_sub.push("&#x0*" + char_uppercase.charCodeAt(0).toString(16) + entityEnd);
            }
            RE_sub = "(?:" + RE_sub.join("|") + ")";
            RE_res += (charMap[char_lowercase] = RE_sub);
        }
        return(ents[string] = RE_res);
    }
    /*
      @name by
      @description  second argument for the replace function.
      */
    function by(match, group1, group2){
        /* Adds a data-prefix before every external pointer */
        return group1 + "data-" + group2;
    }
    /*
      @name cr
      @description            Selects a HTML element and performs a
                               search-and-replace on attributes
      @param String selector  HTML substring to match
      @param String attribute RegExp-escaped; HTML element attribute to match
      @param String marker    Optional RegExp-escaped; marks the prefix
      @param String delimiter Optional RegExp escaped; non-quote delimiters
      @param String end       Optional RegExp-escaped; forces the match to end
                               before an occurrence of <end> when quotes are missing
     */
    function cr(selector, attribute, marker, delimiter, end){
        if(typeof selector == "string") selector = new RegExp(selector, "gi");
        marker = typeof marker == "string" ? marker : "\\s*=";
        delimiter = typeof delimiter == "string" ? delimiter : "";
        end = typeof end == "string" ? end : "";
        var is_end = end && "?";
        var re1 = new RegExp("("+att+")("+attribute+marker+"(?:\\s*\"[^\""+delimiter+"]*\"|\\s*'[^'"+delimiter+"]*'|[^\\s"+delimiter+"]+"+is_end+")"+end+")", "gi");
        html = html.replace(selector, function(match){
            return prefix + match.replace(re1, by);
        });
    }
    /*
      @name cri
      @description            Selects an attribute of a HTML element, and
                               performs a search-and-replace on certain values
      @param String selector  HTML element to match
      @param String attribute RegExp-escaped; HTML element attribute to match
      @param String front     RegExp-escaped; attribute value, prefix to match
      @param String flags     Optional RegExp flags, default "gi"
      @param String delimiter Optional RegExp-escaped; non-quote delimiters
      @param String end       Optional RegExp-escaped; forces the match to end
                               before an occurrence of <end> when quotes are missing
     */
    function cri(selector, attribute, front, flags, delimiter, end){
        if(typeof selector == "string") selector = new RegExp(selector, "gi");
        flags = typeof flags == "string" ? flags : "gi";
        var re1 = new RegExp("("+att+attribute+"\\s*=)((?:\\s*\"[^\"]*\"|\\s*'[^']*'|[^\\s>]+))", "gi");

        end = typeof end == "string" ? end + ")" : ")";
        var at1 = new RegExp('(")('+front+'[^"]+")', flags);
        var at2 = new RegExp("(')("+front+"[^']+')", flags);
        var at3 = new RegExp("()("+front+'(?:"[^"]+"|\'[^\']+\'|(?:(?!'+delimiter+').)+)'+end, flags);

        var handleAttr = function(match, g1, g2){
            if(g2.charAt(0) == '"') return g1+g2.replace(at1, by);
            if(g2.charAt(0) == "'") return g1+g2.replace(at2, by);
            return g1+g2.replace(at3, by);
        };
        html = html.replace(selector, function(match){
            return prefix + match.replace(re1, handleAttr);
        });
    }

    /* <meta http-equiv=refresh content="  ; url= " > */
    html = html.replace(new RegExp("<meta"+any+att+"http-equiv\\s*=\\s*(?:\""+ae("refresh")+"\""+any+etag+"|'"+ae("refresh")+"'"+any+etag+"|"+ae("refresh")+"(?:"+ae(" ")+any+etag+"|"+etag+"))", "gi"), "");

    /* Stripping all scripts */
    html = html.replace(new RegExp("<script"+any+">\\s*//\\s*<\\[CDATA\\[[\\S\\s]*?]]>\\s*</script[^>]*>", "gi"), "");
    html = html.replace(/<script[\S\s]+?<\/script\s*>/gi, "");
    cr(tag+any+att+"on[-a-z0-9:_.]+="+any+etag, "on[-a-z0-9:_.]+"); /* Event listeners */

    cr(tag+any+att+"href\\s*="+any+etag, "href"); /* Linked elements */
    cr(tag+any+att+"src\\s*="+any+etag, "src"); /* Embedded elements */

    cr("<object"+any+att+"data\\s*="+any+etag, "data"); /* <object data= > */
    cr("<applet"+any+att+"codebase\\s*="+any+etag, "codebase"); /* <applet codebase= > */

    /* <param name=movie value= >*/
    cr("<param"+any+att+"name\\s*=\\s*(?:\""+ae("movie")+"\""+any+etag+"|'"+ae("movie")+"'"+any+etag+"|"+ae("movie")+"(?:"+ae(" ")+any+etag+"|"+etag+"))", "value");

    /* <style> and < style=  > url()*/
    cr(/<style[^>]*>(?:[^"']*(?:"[^"]*"|'[^']*'))*?[^'"]*(?:<\/style|$)/gi, "url", "\\s*\\(\\s*", "", "\\s*\\)");
    cri(tag+any+att+"style\\s*="+any+etag, "style", ae("url")+s+ae("(")+s, 0, s+ae(")"), ae(")"));

    /* IE7- CSS expression() */
    cr(/<style[^>]*>(?:[^"']*(?:"[^"]*"|'[^']*'))*?[^'"]*(?:<\/style|$)/gi, "expression", "\\s*\\(\\s*", "", "\\s*\\)");
    cri(tag+any+att+"style\\s*="+any+etag, "style", ae("expression")+s+ae("(")+s, 0, s+ae(")"), ae(")"));
    return html.replace(new RegExp("(?:"+prefix+")+", "g"), prefix);
}
</code></pre>
<h3>Explanation of the code</h3>
<p>The <code>sanitiseHTML</code> function is based on my <code>replace_all_rel_by_abs</code> function (see <a href="http://msdn.microsoft.com/en-us/scriptjunkie/gg278167" rel="nofollow noreferrer">this answer</a>). The <code>sanitiseHTML</code> function is completely rewritten though, in order to achieve maximum efficiency and reliability.</p>
<p>Additionally, a new set of RegExps is added to remove all scripts and event handlers (including CSS <code>expression()</code>, IE7-). To make sure that all tags are parsed as expected, the adjusted tags are prefixed by <code><!--"'--></code>. This prefix is necessary to correctly parse nested "event handlers" in conjunction with unterminated quotes: <code><a id="><input onclick="<div onmousemove=evil()>"></code>.</p>
<p>These RegExps are dynamically created using the internal functions <code>cr</code>/<code>cri</code> (<strong>C</strong>reate <strong>R</strong>eplace [<strong>I</strong>nline]). These functions accept a list of arguments, and create and execute an advanced RE replacement. To make sure that HTML entities aren't breaking a RegExp (<code>refresh</code> in <code><meta http-equiv=refresh></code> could be written in various ways), the dynamically created RegExps are partially constructed by function <code>ae</code> (<strong>A</strong>ny <strong>E</strong>ntity).<br /> The actual replacements are done by function <code>by</code> (replace <strong>by</strong>). In this implementation, <code>by</code> adds <code>data-</code> before all matched attributes.</p>
<ol>
<li>All <code><script>//<[CDATA[ .. //]]></script></code> occurrences are stripped. This step is necessary, because <code>CDATA</code> sections allow <code></script></code> strings inside the code. 
After this replacement has been executed, it's safe to go to the next replacement:</li>
<li>The remaining <code><script>...</script></code> tags are removed.</li>
<li>The <code><meta http-equiv=refresh .. ></code> tag is removed.</li>
<li><p><strong>All</strong> event listeners and external pointers/attributes (<code>href</code>, <code>src</code>, <code>url()</code>) are prefixed by <code>data-</code>, as described previously.</p></li>
<li><p>An <code>IFrame</code> object is created. IFrames are less likely to leak memory (contrary to the htmlfile ActiveXObject). The IFrame is made invisible and appended to the document, so that the DOM can be accessed. <code>document.write()</code> is used to write the HTML to the IFrame. <code>document.open()</code> and <code>document.close()</code> are used to empty the previous contents of the document, so that the generated document is an exact copy of the given <code>html</code> string.</p></li>
<li>If a callback function has been specified, the function will be called with two arguments. The <em>first</em> argument is a reference to the generated <code>document</code> object. The <em>second</em> argument is a function, which destroys the generated DOM tree when called. This function should be called when you don't need the tree any more.<br />If the callback function isn't specified, the function returns an object consisting of two properties (<code>doc</code> and <code>destroy</code>), which behave the same as the previously mentioned arguments.</li>
</ol>
<h3>Additional notes</h3>
<ul>
<li>Setting the <code>designMode</code> property to "On" will stop a frame from executing scripts (not supported in Chrome). If you have to preserve the <code><script></code> tags for a specific reason, you can use <code>iframe.designMode = "On"</code> instead of the script stripping feature.</li>
<li>I wasn't able to find a reliable source for the <code>htmlfile activeXObject</code>. 
According to <a href="http://msdn.microsoft.com/en-us/scriptjunkie/gg278167" rel="nofollow noreferrer">this source</a>, <code>htmlfile</code> is slower than IFrames, and more susceptible to memory leaks.<br /><br /></li>
<li>All affected attributes (<code>href</code>, <code>src</code>, ...) are prefixed by <code>data-</code>. An example of getting/changing these attributes is shown for <code>data-href</code>:<br /><code>elem.getAttribute("data-href")</code> and <code>elem.setAttribute("data-href", "...")</code><br /><code>elem.dataset.href</code> and <code>elem.dataset.href = "..."</code>.</li>
<li>External resources have been disabled. As a result, the page may look completely different:<br/><strike><code><link rel="stylesheet" href="main.css" /></code></strike> <em>No external styles</em><br /><strike><code><script>document.body.bgColor="red";</script></code></strike> <em>No scripted styles</em><br /><code><img src="128x128.png" /></code> <em>No images: the size of the element may be completely different.</em></li>
</ul>
<h3>Examples</h3>
<p><strong><code>sanitiseHTML(html)</code></strong><br /> Paste this bookmarklet into the location bar. It will offer an option to inject a textarea, showing the sanitised HTML string.</p>
<pre><code>javascript:void(function(){var s=document.createElement("script");s.src="http://rob.lekensteyn.nl/html-sanitizer.js";document.body.appendChild(s)})();
</code></pre>
<p><strong>Code examples - <code>string2dom(html)</code></strong>:</p>
<pre><code>string2dom("<html><head><title>Test</title></head></html>", function(doc, destroy){
    alert(doc.title); /* Alert: "Test" */
    destroy();
});

var test = string2dom("<div id='secret'></div>");
alert(test.doc.getElementById("secret").tagName); /* Alert: "DIV" */
test.destroy();
</code></pre>
<h3>Notable references</h3>
<ul>
<li><a href="https://stackoverflow.com/questions/7544550/javascript-regex-to-change-all-relative-urls-to-absolute/7544757#7544757">SO: JS RE to change all relative to absolute URLs</a> - Function <code>sanitiseHTML(html)</code> is based on my previously created <code>replace_all_rel_by_abs(html)</code> function.</li>
<li><a href="http://www.w3.org/wiki/HTML/Elements#Embedded_content" rel="nofollow noreferrer">Elements - Embedded content</a> - A full list of standard embedded elements</li>
<li><a href="http://www.w3.org/wiki/HTML/Elements#Previous_HTML_html.2FElement" rel="nofollow noreferrer">Elements - Previous HTML elements</a> - An additional list of (deprecated) elements (such as <code><applet></code>)</li>
<li><a href="http://msdn.microsoft.com/en-us/scriptjunkie/gg278167" rel="nofollow noreferrer">The htmlfile ActiveX object</a> - <em>"Slower than iframe sandboxes. Leaks memory if not managed"</em></li>
</ul>
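<p>The "Any Entity" idea behind <code>ae</code> can be illustrated in isolation. The following is only a minimal sketch under stated assumptions, not the implementation used in the answer: <code>anyEntityPattern</code> is a hypothetical helper name, and it assumes that only numeric (decimal and hexadecimal) character references need to be recognised. It builds a RegExp source in which every character of a word may appear either literally or as an entity, which is roughly how a value such as <code>refresh</code> could still be detected when it is partly entity-encoded.</p>

```javascript
// Hypothetical helper, loosely modelled on the `ae` function described above.
// It returns a RegExp source string (not a RegExp object).
function anyEntityPattern(word) {
    // An entity ends with ";" or, if the ";" is omitted, is not followed by a digit.
    var entityEnd = "(?:;|(?!\\d))";
    var out = "";
    for (var i = 0; i < word.length; i++) {
        var lower = word.charAt(i).toLowerCase();
        var upper = word.charAt(i).toUpperCase();
        // First alternative: the literal character, with RegExp metacharacters escaped.
        var alternatives = [lower.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")];
        [lower, upper].forEach(function (ch, idx) {
            // Skip the upper-case variant for characters that have no case.
            if (idx === 1 && upper === lower) return;
            var code = ch.charCodeAt(0);
            alternatives.push("&#0*" + code + entityEnd);               // decimal, e.g. &#101;
            alternatives.push("&#x0*" + code.toString(16) + entityEnd); // hexadecimal, e.g. &#x65;
        });
        out += "(?:" + alternatives.join("|") + ")";
    }
    return out;
}

// Usage: the "i" flag makes the literal alternatives and the hex digits case-insensitive.
var refresh = new RegExp("^" + anyEntityPattern("refresh") + "$", "i");
console.log(refresh.test("refresh"));       // true
console.log(refresh.test("r&#101;fresh"));  // true
console.log(refresh.test("&#x52;efresh"));  // true
console.log(refresh.test("refresh2"));      // false
```

<p>Unlike the answer's <code>ae</code>, this sketch does not cache per-character patterns and does not special-case spaces or parentheses; those refinements are left out to keep the example short.</p>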
Tags
Title
singulars
PostAcceptedAnswerId
1. This table or related slice is empty.
PostParentId
1. POCan I load an entire HTML document into a document fragment in Internet Explorer?
  singulars
  PostTypePostTypeId
  PTQuestion
PostTypePostTypeId
1. PTAnswer
UserLastEditorUserId
1. USCommunity
UserOwnerUserId
1. USRob W
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. POCan I load an entire HTML document into a document fragment in Internet Explorer?
  singulars
  PostTypePostTypeId
  PTQuestion
PostsParentIdCreationDate
1. This table or related slice is empty.
VotesPostIdCreationDate
1. VO
  singulars
  PostPostId
  PO
  UserUserId
  This table or related slice is empty.
  VoteTypeVoteTypeId
  VTUpMod
2. VO
  singulars
  PostPostId
  PO
  UserUserId
  This table or related slice is empty.
  VoteTypeVoteTypeId
  VTUpMod
3. VO
  singulars
  PostPostId
  PO
  UserUserId
  This table or related slice is empty.
  VoteTypeVoteTypeId
  VTUpMod
CommentsPostId
