XOM canonicalization takes too long
<p>I have an XML file that can be as big as 1 GB. I am using XOM to avoid out-of-memory errors.</p>
<p>I need to canonicalize the entire document, but the canonicalization takes a long time, even for a 1.5 MB file.</p>
<p>Here is what I have done:</p>
<p>I have this sample XML file, and I increase the size of the document by replicating the <code>Item</code> node.</p>
<pre><code>&lt;?xml version="1.0" encoding="UTF-8" standalone="no"?&gt;
&lt;Packet id="some" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"&gt;
  &lt;Head&gt;
    &lt;PacketId&gt;a34567890&lt;/PacketId&gt;
    &lt;PacketHeadItem1&gt;12345&lt;/PacketHeadItem1&gt;
    &lt;PacketHeadItem2&gt;1&lt;/PacketHeadItem2&gt;
    &lt;PacketHeadItem3&gt;18&lt;/PacketHeadItem3&gt;
    &lt;PacketHeadItem4/&gt;
    &lt;PacketHeadItem5&gt;12082011111408&lt;/PacketHeadItem5&gt;
    &lt;PacketHeadItem6&gt;1&lt;/PacketHeadItem6&gt;
  &lt;/Head&gt;
  &lt;List id="list"&gt;
    &lt;Item&gt;
      &lt;Item1&gt;item1&lt;/Item1&gt;
      &lt;Item2&gt;item2&lt;/Item2&gt;
      &lt;Item3&gt;item3&lt;/Item3&gt;
      &lt;Item4&gt;item4&lt;/Item4&gt;
      &lt;Item5&gt;item5&lt;/Item5&gt;
      &lt;Item6&gt;item6&lt;/Item6&gt;
      &lt;Item7&gt;item7&lt;/Item7&gt;
    &lt;/Item&gt;
  &lt;/List&gt;
&lt;/Packet&gt;
</code></pre>
<p>The code I am using for canonicalization is as follows:</p>
<pre><code>import java.io.FileInputStream;
import java.io.FileOutputStream;

import nu.xom.Builder;
import nu.xom.Document;
import nu.xom.Nodes;
import nu.xom.canonical.Canonicalizer;

private static void canonXOM() throws Exception {
    String file = "D:\\PACKET.xml";
    FileInputStream xmlFile = new FileInputStream(file);

    // Parse without validation.
    Builder builder = new Builder(false);
    Document doc = builder.build(xmlFile);

    FileOutputStream fos = new FileOutputStream("D:\\canon.xml");
    Canonicalizer outputter = new Canonicalizer(fos);

    // Select every node under the root, plus the root's attributes.
    System.out.println("Query");
    Nodes nodes = doc.getRootElement().query("./descendant-or-self::node()|./@*");

    // This is the step that takes minutes.
    System.out.println("Canon");
    outputter.write(nodes);
    fos.close();
}
</code></pre>
<p>This code works well for small files, but the canonicalization step takes about 7 minutes for a 1.5 MB file in my development environment (4 GB RAM, 64-bit, Eclipse, Windows).</p>
<p>Any pointers to the cause of this delay would be highly appreciated.</p>
<p>PS. I need to canonicalize segments of the XML document as well as the whole document itself, so passing the document itself as the argument does not work for me.</p>
<p>Best</p>
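<p>For anyone who wants to reproduce the timing, here is a minimal sketch of how the test file can be produced by replicating the <code>Item</code> node. The class name <code>PacketGenerator</code>, its methods, and the default output path are hypothetical and not part of my original setup; the output is also written without indentation, so byte sizes will differ slightly from a pretty-printed file.</p>

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not from the original post): builds the sample
// Packet document with the <Item> element repeated itemCount times.
final class PacketGenerator {

    // One <Item> block with children Item1..Item7, as in the sample XML.
    static String item() {
        StringBuilder sb = new StringBuilder("<Item>");
        for (int i = 1; i <= 7; i++) {
            sb.append("<Item").append(i).append(">item").append(i)
              .append("</Item").append(i).append('>');
        }
        return sb.append("</Item>").toString();
    }

    // The whole document: fixed header, then itemCount copies of <Item>.
    static String buildPacket(int itemCount) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n");
        sb.append("<Packet id=\"some\" ")
          .append("xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">");
        sb.append("<Head>")
          .append("<PacketId>a34567890</PacketId>")
          .append("<PacketHeadItem1>12345</PacketHeadItem1>")
          .append("<PacketHeadItem2>1</PacketHeadItem2>")
          .append("<PacketHeadItem3>18</PacketHeadItem3>")
          .append("<PacketHeadItem4/>")
          .append("<PacketHeadItem5>12082011111408</PacketHeadItem5>")
          .append("<PacketHeadItem6>1</PacketHeadItem6>")
          .append("</Head>");
        sb.append("<List id=\"list\">");
        String item = item();
        for (int i = 0; i < itemCount; i++) {
            sb.append(item);
        }
        sb.append("</List></Packet>");
        return sb.toString();
    }

    // Usage: java PacketGenerator.java [itemCount] [outputPath]
    public static void main(String[] args) throws IOException {
        int count = args.length > 0 ? Integer.parseInt(args[0]) : 10_000;
        Path out = Paths.get(args.length > 1 ? args[1] : "PACKET.xml");
        Files.write(out, buildPacket(count).getBytes(StandardCharsets.UTF_8));
    }
}
```

<p>With this compact layout each <code>Item</code> block is 153 bytes, so roughly 10,000 items gives a file of about 1.5 MB, which is the size at which I see the slowdown.</p>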
 
