<p>Here's sample data:</p>
<pre><code>$in = '
&lt;html&gt;
    &lt;head&gt;
        &lt;script type="text/javascript"&gt;window.location="somewhere";&lt;/script&gt;
        &lt;style&gt;
            .someCSS {border:1px solid black;}
        &lt;/style&gt;
    &lt;/head&gt;
    &lt;body&gt;
        &lt;p&gt;....&lt;/p&gt;
        &lt;div&gt;
            &lt;script type="text/javascript"&gt;document.write("bad stuff");&lt;/script&gt;
        &lt;/div&gt;
        &lt;ul&gt;
            &lt;li&gt;&lt;style type="text/css"&gt;#moreCSS {font-weight:900;}&lt;/style&gt;&lt;/li&gt;
        &lt;/ul&gt;
    &lt;/body&gt;
&lt;/html&gt;';
</code></pre>
<p>And now the spelled-out version:</p>
<pre><code>$dom = new DOMDocument('1.0','UTF-8');
$dom-&gt;loadHTML($in);

removeByTag($dom,'style');
removeByTag($dom,'script');

var_dump($dom-&gt;saveHTML());

// Remove every element with the given tag name from the document.
function removeByTag($dom,$tag) {
    $nodeList = $dom-&gt;getElementsByTagName($tag);
    removeAll($nodeList);
}

// Walk the list from the last item to the first, removing each node.
function removeAll($nodeList) {
    for ( $i = $nodeList-&gt;length; --$i &gt;=0; ) {
        removeSelf($nodeList-&gt;item($i));
    }
}

// Detach a single node from its parent.
function removeSelf($node) {
    $node-&gt;parentNode-&gt;removeChild($node);
}
</code></pre>
<p>And an alternate (does the same thing, just no function declarations):</p>
<pre><code>$dom = new DOMDocument('1.0','UTF-8');
$dom-&gt;loadHTML($in);

for ( $list = $dom-&gt;getElementsByTagName('script'), $i = $list-&gt;length; --$i &gt;=0; ) {
    $node = $list-&gt;item($i);
    $node-&gt;parentNode-&gt;removeChild($node);
}

for ( $list = $dom-&gt;getElementsByTagName('style'), $i = $list-&gt;length; --$i &gt;=0; ) {
    $node = $list-&gt;item($i);
    $node-&gt;parentNode-&gt;removeChild($node);
}

var_dump($dom-&gt;saveHTML());
</code></pre>
<p>The trick is to <a href="http://us3.php.net/manual/en/class.domnodelist.php#83390" rel="nofollow">iterate <em>backwards</em> when deleting nodes</a>.
And getElementsByTagName will traverse the entire DOM for you, so you don't have to (none of that hasChildNodes, nextSibling, firstChild stuff).</p>
<p>Perhaps the best solution is somewhere in between those two extreme examples.</p>
<hr>
<p>Couldn't help myself, this is probably the best version of my suggestions. It doesn't need a counter (<code>$i</code>) to muck things up; it just keeps removing the first node in the list:</p>
<pre><code>$dom = new DOMDocument('1.0','UTF-8');
$dom-&gt;loadHTML($in);

removeElementsByTagName($dom,'script');
removeElementsByTagName($dom,'style');

function removeElementsByTagName($dom,$tagName) {
    $list = $dom-&gt;getElementsByTagName($tagName);
    // The list is live, so item(0) is always the next remaining match.
    while ( $node = $list-&gt;item(0) ) {
        $node-&gt;parentNode-&gt;removeChild($node);
    }
}

var_dump($dom-&gt;saveHTML());
</code></pre>
<p>The node list is live: as you remove nodes, the remaining matches move up in the list, so 1 becomes 0 and 2 becomes 1, etc. Keep doing this (<code>while</code>) until there aren't any more (<a href="http://us1.php.net/manual/en/domnodelist.item.php" rel="nofollow"><code>-&gt;item</code> returns null</a>). I also wrapped this in a reusable function.</p>
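<p>One more variant, offered as a sketch rather than as part of the suggestions above: copy the matches into a plain array with <code>iterator_to_array</code> before removing anything, after which an ordinary <code>foreach</code> is safe. The helper name <code>removeElementsBySnapshot</code> and the shortened sample markup are made up for illustration.</p>

```php
<?php
// Hypothetical helper (not from the answer above): snapshot the node list
// first, then remove. Returns how many nodes were removed.
function removeElementsBySnapshot(DOMDocument $dom, string $tagName): int
{
    // iterator_to_array() copies the current matches into a plain PHP array,
    // so removing nodes no longer shifts the items being looped over.
    $nodes = iterator_to_array($dom->getElementsByTagName($tagName));

    foreach ($nodes as $node) {
        $node->parentNode->removeChild($node);
    }

    return count($nodes);
}

// Shortened, made-up sample markup for the sketch.
$html = '<html><head><script>var a = 1;</script><style>p {color:red;}</style></head>'
      . '<body><p>kept</p><div><script>var b = 2;</script></div></body></html>';

$dom = new DOMDocument('1.0', 'UTF-8');
$dom->loadHTML($html);

$removedScripts = removeElementsBySnapshot($dom, 'script');
$removedStyles  = removeElementsBySnapshot($dom, 'style');

$out = $dom->saveHTML();
```

<p>The trade-off is one extra array per call in exchange for a loop that reads like any other <code>foreach</code>.</p>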
 
