<p>Parsing large XML documents with DOM doesn't scale, because DOM loads the whole document into memory.</p>
<p>This <a href="http://groovy.codehaus.org/" rel="nofollow noreferrer">Groovy</a> script uses StAX (Streaming API for XML) to split an XML document at its top-level elements (those that share the QName of the root element's first child). It's pretty fast, handles arbitrarily large documents, and is very useful when you want to split a large batch file into smaller pieces.</p>
<p>Requires Groovy on Java 6, or a StAX API and implementation such as <a href="http://woodstox.codehaus.org/" rel="nofollow noreferrer">Woodstox</a> on the CLASSPATH.</p>
<pre><code>import javax.xml.stream.*

pieces = 5                    // target number of output files (approximate)
input = "input.xml"
output = "output_%04d.xml"    // output file name pattern, numbered from 0001
eventFactory = XMLEventFactory.newInstance()
fileNumber = elementCount = 0

// Opens the input and consumes the document start, the root element and the
// first child element, keeping them in script variables for later reuse.
def createEventReader() {
    reader = XMLInputFactory.newInstance().createXMLEventReader(new FileInputStream(input))
    start = reader.next()
    root = reader.nextTag()
    firstChild = reader.nextTag()
    return reader
}

// Opens the next numbered output file and writes the document start
// and the root start tag to it.
def createNextEventWriter () {
    println "Writing to '${filename = String.format(output, ++fileNumber)}'"
    writer = XMLOutputFactory.newInstance().createXMLEventWriter(new FileOutputStream(filename), start.characterEncodingScheme)
    writer.add(start)
    writer.add(root)
    return writer
}

// First pass: count the remaining elements that share the first child's QName
// (the first child itself has already been consumed by createEventReader).
elements = createEventReader().findAll { it.startElement &amp;&amp; it.name == firstChild.name }.size()
println "Splitting ${elements} &lt;${firstChild.name.localPart}&gt; elements into ${pieces} pieces"
chunkSize = elements / pieces

// Second pass: copy events, starting a new file whenever the chunk is full.
writer = createNextEventWriter()
writer.add(firstChild)
createEventReader().each {
    if (it.startElement &amp;&amp; it.name == firstChild.name) {
        if (++elementCount &gt; chunkSize) {
            // The end-of-document event closes the open root element.
            writer.add(eventFactory.createEndDocument())
            writer.flush()
            writer = createNextEventWriter()
            elementCount = 0
        }
    }
    writer.add(it)
}
writer.flush()
</code></pre>
 
