<p><a href="http://code.google.com/apis/gdata/" rel="noreferrer">Google Data API</a> has a <a href="http://code.google.com/apis/gdata/javadoc/com/google/gdata/util/io/base/UnicodeReader.html" rel="noreferrer"><code>UnicodeReader</code></a> which automagically detects the encoding from a leading byte order mark (BOM).</p>
<p>You can use it instead of <code>InputStreamReader</code>. Here's a slightly condensed extract of its source, which is pretty straightforward:</p>
<pre><code>import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;

public class UnicodeReader extends Reader {
    private static final int BOM_SIZE = 4;
    private final InputStreamReader reader;

    /**
     * Construct UnicodeReader
     * @param in Input stream.
     * @param defaultEncoding Default encoding to be used if BOM is not found,
     * or &lt;code&gt;null&lt;/code&gt; to use system default encoding.
     * @throws IOException If an I/O error occurs.
     */
    public UnicodeReader(InputStream in, String defaultEncoding) throws IOException {
        byte bom[] = new byte[BOM_SIZE];
        String encoding;
        int unread;
        PushbackInputStream pushbackStream = new PushbackInputStream(in, BOM_SIZE);
        int n = pushbackStream.read(bom, 0, bom.length);

        // Read ahead four bytes and check for BOM marks.
        if ((bom[0] == (byte) 0xEF) &amp;&amp; (bom[1] == (byte) 0xBB) &amp;&amp; (bom[2] == (byte) 0xBF)) {
            encoding = "UTF-8";
            unread = n - 3;
        } else if ((bom[0] == (byte) 0xFE) &amp;&amp; (bom[1] == (byte) 0xFF)) {
            encoding = "UTF-16BE";
            unread = n - 2;
        } else if ((bom[0] == (byte) 0xFF) &amp;&amp; (bom[1] == (byte) 0xFE)) {
            encoding = "UTF-16LE";
            unread = n - 2;
        } else if ((bom[0] == (byte) 0x00) &amp;&amp; (bom[1] == (byte) 0x00) &amp;&amp; (bom[2] == (byte) 0xFE) &amp;&amp; (bom[3] == (byte) 0xFF)) {
            encoding = "UTF-32BE";
            unread = n - 4;
        } else if ((bom[0] == (byte) 0xFF) &amp;&amp; (bom[1] == (byte) 0xFE) &amp;&amp; (bom[2] == (byte) 0x00) &amp;&amp; (bom[3] == (byte) 0x00)) {
            // Note: with the checks in this order, the UTF-16LE branch above
            // already matches FF FE, so this branch is never reached.
            encoding = "UTF-32LE";
            unread = n - 4;
        } else {
            encoding = defaultEncoding;
            unread = n;
        }

        // Unread bytes if necessary and skip BOM marks.
        if (unread &gt; 0) {
            pushbackStream.unread(bom, (n - unread), unread);
        } else if (unread &lt; -1) {
            pushbackStream.unread(bom, 0, 0);
        }

        // Use given encoding.
        if (encoding == null) {
            reader = new InputStreamReader(pushbackStream);
        } else {
            reader = new InputStreamReader(pushbackStream, encoding);
        }
    }

    public String getEncoding() {
        return reader.getEncoding();
    }

    public int read(char[] cbuf, int off, int len) throws IOException {
        return reader.read(cbuf, off, len);
    }

    public void close() throws IOException {
        reader.close();
    }
}
</code></pre>
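<p>If you would rather not pull in the library, the same read-ahead-and-push-back idea can be sketched with the JDK alone. The class below is a hypothetical helper written for illustration, not part of the Google Data API: the names <code>BomSniffer</code>, <code>detect</code>, <code>open</code> and <code>readAll</code> are my own. It assumes that a stream starting with FF FE 00 00 is UTF-32LE, so it tests the 4-byte UTF-32 marks before the 2-byte UTF-16 marks, and it loops on <code>read</code> because a single call may return fewer than four bytes.</p>

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.nio.charset.Charset;

// Hypothetical helper for illustration; not part of the Google Data API.
final class BomSniffer {
    private static final int BOM_SIZE = 4;

    /** Returns the charset name implied by the first n bytes of b, or null if no BOM matches. */
    static String detect(byte[] b, int n) {
        // 4-byte marks first: FF FE is also the start of the UTF-32LE mark.
        if (n >= 4 && b[0] == 0 && b[1] == 0 && b[2] == (byte) 0xFE && b[3] == (byte) 0xFF) {
            return "UTF-32BE";
        }
        if (n >= 4 && b[0] == (byte) 0xFF && b[1] == (byte) 0xFE && b[2] == 0 && b[3] == 0) {
            return "UTF-32LE";
        }
        if (n >= 3 && b[0] == (byte) 0xEF && b[1] == (byte) 0xBB && b[2] == (byte) 0xBF) {
            return "UTF-8";
        }
        if (n >= 2 && b[0] == (byte) 0xFE && b[1] == (byte) 0xFF) {
            return "UTF-16BE";
        }
        if (n >= 2 && b[0] == (byte) 0xFF && b[1] == (byte) 0xFE) {
            return "UTF-16LE";
        }
        return null;
    }

    /** Wraps in with a Reader that skips a leading BOM and decodes accordingly. */
    static Reader open(InputStream in, Charset fallback) throws IOException {
        PushbackInputStream pb = new PushbackInputStream(in, BOM_SIZE);
        byte[] head = new byte[BOM_SIZE];

        // Read up to four bytes; stop early at end of stream.
        int n = 0;
        while (n < BOM_SIZE) {
            int r = pb.read(head, n, BOM_SIZE - n);
            if (r < 0) {
                break;
            }
            n += r;
        }

        String name = detect(head, n);
        int bomLen = 0;
        if (name != null) {
            bomLen = name.startsWith("UTF-32") ? 4 : name.equals("UTF-8") ? 3 : 2;
        }

        // Push back everything that was read past the BOM.
        if (n > bomLen) {
            pb.unread(head, bomLen, n - bomLen);
        }

        Charset cs = (name != null) ? Charset.forName(name) : fallback;
        return new InputStreamReader(pb, cs);
    }

    /** Drains a Reader into a String (small convenience for trying the sketch out). */
    static String readAll(Reader r) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[256];
        int k;
        while ((k = r.read(buf)) != -1) {
            sb.append(buf, 0, k);
        }
        return sb.toString();
    }
}
```

<p>Calling <code>BomSniffer.open(inputStream, fallbackCharset)</code> gives you a plain <code>Reader</code>, so it drops in wherever you would have constructed an <code>InputStreamReader</code>. Treat the UTF-32LE choice as a judgment call: the same four bytes are also a valid UTF-16LE BOM followed by a NUL character.</p>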
 
