Parse HTML and preserve original content
<p>I have lots of HTML files. I want to replace some elements, keeping all the other content unchanged. For example, I would like to execute this jQuery expression (or some equivalent of it):</p>
<pre><code>$('.header .title').text('my new content')
</code></pre>
<p>on the following HTML document:</p>
<pre><code>&lt;div class=header&gt;&lt;span class=title&gt;Foo&lt;/span&gt;&lt;/div&gt;
&lt;p&gt;1&lt;p&gt;2
&lt;table&gt;&lt;tr&gt;&lt;td&gt;1&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
</code></pre>
<p>and have the following result:</p>
<pre><code>&lt;div class=header&gt;&lt;span class=title&gt;my new content&lt;/span&gt;&lt;/div&gt;
&lt;p&gt;1&lt;p&gt;2
&lt;table&gt;&lt;tr&gt;&lt;td&gt;1&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
</code></pre>
<p>The problem is that all the parsers I’ve tried (<a href="http://nokogiri.org/" rel="noreferrer">Nokogiri</a>, <a href="http://www.crummy.com/software/BeautifulSoup/" rel="noreferrer">BeautifulSoup</a>, <a href="https://code.google.com/p/html5lib" rel="noreferrer">html5lib</a>) serialize it to something like this:</p>
<pre><code>&lt;html&gt;
&lt;head&gt;&lt;/head&gt;
&lt;body&gt;
&lt;div class=header&gt;&lt;span class=title&gt;my new content&lt;/span&gt;&lt;/div&gt;
&lt;p&gt;1&lt;/p&gt;&lt;p&gt;2&lt;/p&gt;
&lt;table&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;1&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;
&lt;/body&gt;
&lt;/html&gt;
</code></pre>
<p>That is, they add:</p>
<ol>
<li>html, head and body elements</li>
<li>closing p tags</li>
<li>tbody</li>
</ol>
<p>Is there a parser that satisfies my needs? It should work in Node.js, Ruby, or Python.</p>