How to locate error after MalformedByteSequenceException thrown by XML parser
    <p>I'm getting a MalformedByteSequenceException when parsing an XML file.</p>
    <p>My app allows external customers to submit XML files. They can use any supported encoding, but most specify <code>...encoding="UTF-8"...</code> at the top of the file, as per the examples that were provided to them. Some of them then encode their data as windows-1252, which causes a MalformedByteSequenceException on non-ASCII characters.</p>
    <p>I want the XML parser to identify the file encoding and decode the file, so I don't want a preliminary step that tests the encoding or converts the InputStream to a Reader. I feel that the XML parser should handle that step.</p>
    <p>Even though I have declared a ValidationEventHandler, it is not called when a MalformedByteSequenceException is thrown.</p>
    <p>Is there any way of getting the Unmarshaller to report the location in the file where the error occurs?</p>
    <p>Here is my Java code:</p>
<pre><code>InputStream input = ...
JAXBContext jc = JAXBContext.newInstance(MyClass.class.getPackage().getName());
Unmarshaller unmarshaller = jc.createUnmarshaller();

// Validate against the schema while unmarshalling
SchemaFactory sf = SchemaFactory.newInstance(javax.xml.XMLConstants.W3C_XML_SCHEMA_NS_URI);
Source source = new StreamSource(getClass().getResource("my.xsd").toExternalForm());
Schema schema = sf.newSchema(source);
unmarshaller.setSchema(schema);

// This handler is never invoked for the MalformedByteSequenceException
ValidationEventHandler handler = new MyValidationEventHandler();
unmarshaller.setEventHandler(handler);

MyClass myClass = (MyClass) unmarshaller.unmarshal(input);
</code></pre>
    <p>and the resulting stack trace:</p>
<pre><code>javax.xml.bind.UnmarshalException
 - with linked exception:
[com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 2 of 4-byte UTF-8 sequence.]
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:202)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal(UnmarshallerImpl.java:173)
    at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(AbstractUnmarshallerImpl.java:137)
    at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(AbstractUnmarshallerImpl.java:184)
    at (my code)
Caused by: com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 2 of 4-byte UTF-8 sequence.
    at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.invalidByte(UTF8Reader.java:684)
    at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:470)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1742)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.scanContent(XMLEntityScanner.java:916)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2788)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:200)
    ... 51 more
</code></pre>
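The encoding mismatch described above can be reproduced without JAXB or a schema. The following is a minimal sketch, not the original application code: the class name `EncodingMismatchRepro`, the helper `tryParse`, and the sample document are all hypothetical, and it assumes the JDK's default `DocumentBuilderFactory` implementation. It parses the same document twice, once encoded as UTF-8 and once as windows-1252, while the XML declaration claims UTF-8 in both cases.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

public class EncodingMismatchRepro {

    // Hypothetical sample: declares UTF-8 and contains one non-ASCII character (e with acute accent).
    static final String XML =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?><name>caf\u00e9</name>";

    /** Returns null if the bytes parse cleanly, otherwise the exception that was thrown. */
    static Exception tryParse(byte[] bytes) {
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
            return null;
        } catch (IOException | SAXException | ParserConfigurationException e) {
            return e;
        }
    }

    public static void main(String[] args) {
        // Bytes that match the declared encoding.
        byte[] utf8 = XML.getBytes(StandardCharsets.UTF_8);
        // Bytes in windows-1252: the accented character becomes the single byte 0xE9,
        // which a UTF-8 decoder reads as the start of a multi-byte sequence.
        byte[] cp1252 = XML.getBytes(Charset.forName("windows-1252"));

        System.out.println("UTF-8 bytes:        " + tryParse(utf8));
        System.out.println("windows-1252 bytes: " + tryParse(cp1252));
    }
}
```

Running `main` prints `null` for the first input and the parser's exception for the second, which makes it easier to experiment with error handlers on a small input before going back to the full `Unmarshaller` setup.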