Java - SAX parser on an XHTML document
<p>I'm trying to write a SAX parser for an XHTML document that I download from the web. At first I was having a problem with the doctype declaration (I found out from <a href="https://stackoverflow.com/questions/998280/dtd-download-error-while-parsing-xhtml-document-in-xom">here</a> that it was because W3C have intentionally blocked access to the DTD), but I fixed that with:</p>

<pre><code>XMLReader reader = parser.getXMLReader();
reader.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
</code></pre>

<p>However, now I'm experiencing a second problem. The SAX parser throws an exception when it reaches some Javascript embedded in the XHTML document:</p>

<pre><code>&lt;script type="text/javascript" language="JavaScript"&gt;
function checkForm() {
  answer = true;
  if (siw &amp;&amp; siw.selectingSomething)
    answer = false;
  return answer;
}//
&lt;/script&gt;
</code></pre>

<p>Specifically the parser throws an error once it reaches the &amp;&amp;'s, as it's expecting an entity reference. The exact exception is:</p>

<pre><code>org.xml.sax.SAXParseException: The entity name must immediately follow the '&amp;' in the entity reference.
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:198)
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:177)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:391)
    at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1390)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEntityReference(XMLDocumentFragmentScannerImpl.java:1814)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:3000)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:624)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:486)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:810)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:740)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:110)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1208)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:525)
    at MLIAParser.readPage(MLIAParser.java:55)
    at MLIAParser.main(MLIAParser.java:75)
</code></pre>

<p>I suspect (but I don't know) that if I hadn't disabled the DTD then I wouldn't get this error. So, how can I avoid the DTD error and avoid the entity reference error?</p>

<p>Cheers,</p>

<p>Pete</p>
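<p>For anyone reproducing this, here is a minimal sketch of the two problems in isolation. It rests on one fact and several assumptions. The fact: in XML a literal &amp; must be escaped as &amp;amp; or sit inside a CDATA section, whether or not a DTD is loaded, so the entity reference error is a well-formedness problem in the page itself. The assumptions: the class name <code>XhtmlSaxSketch</code>, the helpers <code>page</code> and <code>parses</code>, and the tiny test page are all hypothetical and not taken from the code above; and instead of the <code>disallow-doctype-decl</code> feature, the sketch swaps in an <code>EntityResolver</code> that resolves the DTD to an empty stream so nothing is downloaded.</p>

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class XhtmlSaxSketch {

    // Hypothetical helper: wraps a script body in a tiny XHTML page with a DOCTYPE.
    static String page(String scriptBody) {
        return "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" "
             + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">\n"
             + "<html><head><title>t</title><script type=\"text/javascript\">"
             + scriptBody
             + "</script></head><body></body></html>";
    }

    // Hypothetical helper: true if the markup parses, false on a fatal parse error.
    static boolean parses(String xhtml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        // Resolve every external entity (including the W3C DTD) to an empty
        // stream, so the parser never tries to fetch it from the network.
        reader.setEntityResolver(
            (publicId, systemId) -> new InputSource(new StringReader("")));
        DefaultHandler handler = new DefaultHandler();
        reader.setContentHandler(handler);
        reader.setErrorHandler(handler); // DefaultHandler.fatalError rethrows
        try {
            reader.parse(new InputSource(new StringReader(xhtml)));
            return true;
        } catch (SAXParseException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String raw = "if (siw && siw.selectingSomething) answer = false;";
        // CDATA markers hidden behind JavaScript comments.
        String cdata = "//<![CDATA[\n" + raw + "\n//]]>";
        // Same script with each ampersand escaped.
        String escaped = raw.replace("&", "&amp;");

        System.out.println("raw && parses:     " + parses(page(raw)));
        System.out.println("CDATA parses:      " + parses(page(cdata)));
        System.out.println("escaped && parses: " + parses(page(escaped)));
    }
}
```

<p>Only the raw variant should be rejected, with the same kind of exception as in the trace above. Two caveats on this sketch: with an empty DTD, named entities such as &amp;nbsp; are no longer declared, so pages that use them would need a locally cached copy of the DTD instead; and if the downloaded page cannot be changed to use CDATA or escaped ampersands, a strict XML parser will keep rejecting it, and an HTML-tolerant parser is probably the more realistic route.</p>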
 
