File.list() retrieves file names with NON-ASCII characters incorrectly on Mac OS X when using Java 7 from Oracle
<p>On Mac OS X with Java 7 from Oracle, File.list() returns file names that contain non-ASCII characters incorrectly.</p>
<p>I am using the following example:</p>
<pre><code>import java.io.*;
import java.util.*;

public class ListFiles {
    public static void main(String[] args) {
        try {
            // List the entries of the current working directory.
            File folder = new File(".");
            String[] listOfFiles = folder.list();
            for (int i = 0; i &lt; listOfFiles.length; i++) {
                System.out.println(listOfFiles[i]);
            }

            // Dump the environment to show whether LANG is set.
            Map&lt;String, String&gt; env = System.getenv();
            for (String envName : env.keySet()) {
                System.out.format("%s=%s%n", envName, env.get(envName));
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
</code></pre>
<p>Running this example with Java 6 from Apple, everything is fine:</p>
<pre><code>....
Folder-ÄÖÜäöüß
吃饭.txt
....
</code></pre>
<p>Running this example with Java 7 from Oracle, the result is as follows:</p>
<pre><code>....
Folder-A��O��U��a��o��u����
������.txt
....
</code></pre>
<p>But if I set the following environment variable (it was not set in the two cases above):</p>
<pre><code>LANG=en_US.UTF-8
</code></pre>
<p>the result with Java 7 from Oracle is as expected:</p>
<pre><code>....
Folder-ÄÖÜäöüß
吃饭.txt
....
</code></pre>
<p>My problem is that I don't want to set the LANG environment variable. This is a GUI application that I want to deploy as a Mac OS X application bundle, and there the LSEnvironment setting</p>
<pre><code>&lt;key&gt;LSEnvironment&lt;/key&gt;
&lt;dict&gt;
    &lt;key&gt;LANG&lt;/key&gt;
    &lt;string&gt;en_US.UTF-8&lt;/string&gt;
&lt;/dict&gt;
</code></pre>
<p>in Info.plist has no effect (see also <a href="https://stackoverflow.com/questions/10535085/lsenvironment-section-of-info-plist-take-no-effects">here</a>).</p>
<p>What can I do to retrieve the file names correctly in Java 7 from Oracle on Mac OS X without having to set the LANG environment variable? On Windows and Linux, this problem does not exist.</p>
<p><strong>EDIT:</strong></p>
<p>If I print the individual bytes with:</p>
<pre><code>// Note: getBytes() without an argument encodes with the platform default charset.
byte[] x = listOfFiles[i].getBytes();
for (int j = 0; j &lt; x.length; j++) {
    System.out.format("%02X", x[j]);
    System.out.print(" ");
}
System.out.println();
</code></pre>
<p>the correct results are:</p>
<pre><code>Folder-ÄÖÜäöüß
46 6F 6C 64 65 72 2D 41 CC 88 4F CC 88 55 CC 88 61 CC 88 6F CC 88 75 CC 88 C3 9F
吃饭.txt
E5 90 83 E9 A5 AD 2E 74 78 74
</code></pre>
<p>and the wrong results are:</p>
<pre><code>Folder-A��O��U��a��o��u����
46 6F 6C 64 65 72 2D 41 EF BF BD EF BF BD 4F EF BF BD EF BF BD 55 EF BF BD EF BF BD 61 EF BF BD EF BF BD 6F EF BF BD EF BF BD 75 EF BF BD EF BF BD EF BF BD EF BF BD
������.txt
EF BF BD EF BF BD EF BF BD EF BF BD EF BF BD EF BF BD 2E 74 78 74
</code></pre>
<p>So when LANG is not set (and only with Java 7 from Oracle), File.list() replaces every byte of a non-ASCII character with the UTF-8 sequence "EF BF BD", which is Unicode U+FFFD, the replacement character. ASCII bytes are unaffected.</p>
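<p>To make the byte dumps above easier to reproduce, here is a minimal sketch that encodes strings with an explicit UTF-8 charset, so the dump itself does not depend on LANG or on the default charset. The class name <code>FileNameBytes</code> and the helper <code>toUtf8Hex</code> are made up for this illustration. It shows that the "41 CC 88" sequence in the correct output is the decomposed (NFD) form of "Ä", and that "EF BF BD" is the UTF-8 encoding of U+FFFD. The <code>sun.jnu.encoding</code> property is implementation-specific (assumed to be present on Oracle and OpenJDK JVMs) and is printed only as a diagnostic.</p>

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

public class FileNameBytes {

    /** Formats the UTF-8 bytes of a string as space-separated uppercase hex pairs. */
    static String toUtf8Hex(String s) {
        // Explicit charset: the result does not depend on LANG or file.encoding.
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", bytes[i]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Precomposed "Ä" (U+00C4) and its decomposed form "A" + U+0308 (NFD).
        String composed = "\u00C4";
        String decomposed = Normalizer.normalize(composed, Normalizer.Form.NFD);

        System.out.println(toUtf8Hex(composed));    // C3 84
        System.out.println(toUtf8Hex(decomposed));  // 41 CC 88
        System.out.println(toUtf8Hex("\uFFFD"));    // EF BF BD

        // Diagnostics: charset-related system properties of the running JVM.
        // sun.jnu.encoding is implementation-specific and may be null.
        System.out.println("file.encoding    = " + System.getProperty("file.encoding"));
        System.out.println("sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding"));
    }
}
```

<p>Running this once with and once without LANG set should show whether the two properties differ between the working and the failing case; the three hex lines stay the same either way.</p>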