<p>Actually, it is not difficult to patch libxml2 (this walkthrough was done on Ubuntu 11.04 with Python 2.7.3).</p>

<p>First define a test program <code>wbr_test.py</code>:</p>

<pre><code>from lxml import etree
from cStringIO import StringIO

wbr_html = """\
&lt;html&gt;
&lt;head&gt;
&lt;title&gt;wbr test&lt;/title&gt;
&lt;/head&gt;
&lt;body&gt;
Test for a breakable&lt;wbr&gt;word implementation change
&lt;/body&gt;
&lt;/html&gt;
"""

parser = etree.HTMLParser()
tree = etree.parse(StringIO(wbr_html), parser)
result = etree.tostring(tree.getroot(), pretty_print=True, method="html")
if result.split() != wbr_html.split():  # split, as we are not interested in whitespace differences
    print(result)
    print("not ok")
else:
    print("OK")
</code></pre>

<p>Make sure that it fails by running <code>python wbr_test.py</code>. The output should contain a <code>&lt;/wbr&gt;</code> inserted before <code>&lt;/body&gt;</code>, and end with <code>not ok</code>.</p>

<p>Download, extract and compile <code>libxml2</code>:</p>

<pre><code>wget ftp://xmlsoft.org/libxml2/libxml2-2.8.0.tar.gz
tar xvf libxml2-2.8.0.tar.gz
cd libxml2-2.8.0/
./configure --prefix=/usr
make -j8 # adjust number to match your number of cores
</code></pre>

<p>Install it, then install the Python libxml2 bindings:</p>

<pre><code>sudo make install
cd to_python_bindings
sudo python setup.py install
</code></pre>

<p>Run <code>wbr_test.py</code> once more, to make sure it still fails with the latest libxml2 version.</p>

<p>Before changing anything, make a copy of <code>HTMLparser.c</code>, e.g. in <code>/var/tmp</code>.</p>

<p>Now edit the file <code>HTMLparser.c</code> at the top level of the libxml2 source. Search for the word <code>forced</code> (there is only one occurrence). You will be at the <code>&lt;br&gt;</code> tag definition. Copy the three lines starting with the line you just found. The most appropriate insert point is just before the end of the table (after the definition of <code>&lt;var&gt;</code>). To get the final comma right in the table, insert the three lines before the line containing just <code>'}'</code>, not the one containing <code>'};'</code>.</p>

<p>In the newly inserted code, replace <code>br</code> with <code>wbr</code> and change <code>DECL clear_attrs</code> to <code>NULL</code> (assuming that a new tag does not have deprecated attributes).</p>

<p>A diff of the copy in <code>/var/tmp</code> against the edited file (<code>diff -u /var/tmp/HTMLparser.c HTMLparser.c</code>) should look as follows:</p>

<pre><code>@@ -1039,6 +1039,9 @@
 },
 { "var", 0, 0, 0, 0, 0, 0, 1, "instance of a variable or program argument",
   DECL html_inline, NULL, DECL html_attrs, NULL, NULL
+},
+{ "wbr", 0, 2, 2, 1, 0, 0, 1, "possible line break ",
+  EMPTY , NULL , DECL core_attrs, NULL , NULL
 }
 };
</code></pre>

<p>Make and install:</p>

<pre><code>make &amp;&amp; sudo make install
</code></pre>

<p>Run <code>wbr_test.py</code> once more. It should now print <code>OK</code>.</p>
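<p>The manual edit of <code>HTMLparser.c</code> can also be scripted. The following is only a minimal sketch, not part of the original walkthrough: the function name <code>add_wbr_entry</code> is hypothetical, and it assumes that the <code>"var"</code> entry is the last one in the element table and that its closing brace sits alone on a line, as in the diff above.</p>

```python
# Hypothetical helper: insert a "wbr" element definition after the "var"
# entry in the text of libxml2's HTMLparser.c. Assumes the layout shown in
# the diff above ("var" is the last table entry, closed by a lone "}").

WBR_ENTRY = (
    '},\n'
    '{ "wbr", 0, 2, 2, 1, 0, 0, 1, "possible line break ",\n'
    '  EMPTY , NULL , DECL core_attrs, NULL , NULL\n'
)


def add_wbr_entry(source):
    """Return `source` with the wbr entry added; leave patched text unchanged."""
    if '{ "wbr"' in source:
        return source  # already patched, nothing to do

    lines = source.splitlines(True)  # keep line endings

    # Locate the first line of the "var" entry.
    for start, line in enumerate(lines):
        if line.lstrip().startswith('{ "var"'):
            break
    else:
        raise ValueError('no "var" entry found')

    # The first line after it that contains only "}" closes the entry.
    # Inserting just before that brace turns it into the close of "wbr".
    for pos in range(start + 1, len(lines)):
        if lines[pos].strip() == "}":
            lines[pos:pos] = WBR_ENTRY.splitlines(True)
            return "".join(lines)

    raise ValueError('closing brace of the "var" entry not found')
```

<p>To use it, read <code>HTMLparser.c</code> into a string, pass it to <code>add_wbr_entry</code>, and write the result back. Keeping the copy in <code>/var/tmp</code> remains a good idea, so the result can still be checked with <code>diff -u</code>.</p>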
 
