  1. Java UTF-8 filenames with IBM JVM (AIX)
<p>I'm having trouble understanding how the IBM JVM's implementation of <code>java.io.File</code> deals with UTF-8 on AIX on the JFS2 filesystem. I suspect there's a system property that I'm overlooking, but I have not yet been able to find it.</p>
<p>Let's assume I have a file named <code>othér</code> (where <code>é</code> is U+00E9, or UTF-8 bytes <code>0xc3 0xa9</code>). The filename is encoded in UTF-8, and was created by a C program:</p>
<pre><code>char filename[] = { 'o', 't', 'h', 0xc3, 0xa9, 'r', 0 };
open(filename, O_RDWR|O_CREAT, 0666);
</code></pre>
<p>If I create a Unicode string in Java that represents the filename, Java fails to open the file. Further, if I use <code>File.listFiles()</code> in Java, it insists on treating the name as a Latin1 string. For example:</p>
<pre><code>String expectedName = new String(new char[] { 'o', 't', 'h', 0xe9, 'r' });
File expected = new File(expectedName);

if (expected.exists())
    System.out.println(expectedName + " exists");
else
    System.out.println(expectedName + " DOES NOT exist");

for (File child : new File(".").listFiles()) {
    System.out.println(child.getName());
    System.out.print("Chars:");
    for (char c : child.getName().toCharArray())
        System.out.print(" 0x" + Integer.toHexString((int)c));
    System.out.println();
}
</code></pre>
<p>The results of this program are:</p>
<pre><code>% java -Dfile.encoding=UTF8 FileTest
othér DOES NOT exist
othér
Chars: 0x6f 0x74 0x68 0xc3 0xa9 0x72
</code></pre>
<p>So it appears that my filenames are being treated as Latin1: each UTF-8 byte comes back as a separate <code>char</code>. I've tried setting the <a href="http://publib.boulder.ibm.com/infocenter/iseries/v5r4/index.jsp?topic=/rzaha/sysprop2.htm" rel="noreferrer"><code>file.encoding</code></a> system property to <code>UTF8</code> and the <a href="http://publib.boulder.ibm.com/infocenter/iseries/v5r4/index.jsp?topic=/rzatz/51/admin/help/trun_svr_utf.html" rel="noreferrer"><code>client.encoding.override</code></a> system property to <code>UTF-8</code>, to no avail. My <code>LANG</code> and <code>LC_ALL</code> settings are <code>en_US.UTF-8</code>:</p>
<pre><code>% echo $LANG
en_US.UTF-8
% echo $LC_ALL
en_US.UTF-8
</code></pre>
<p>My system's "Primary Language Environment", as configured by SMIT, is "ISO8859-1". I don't really know the full impact of this setting, but I cannot change it. I suspect that if I <em>could</em> change it to "UTF8 English" then that <em>might</em> fix the problem, but since JFS2 stores filenames in Unicode and Java operates in Unicode internally, I feel like there should be a more general solution to the problem.</p>
<p>Is there another system property for J9 that I can set to force it to use UTF-8 filenames regardless of my SMIT setting?</p>
<p>AIX version is 5.2, Java version is IBM J9 (1.5.0), filesystem is JFS2:</p>
<pre><code>rs6000% uname -a
AIX rs6000 2 5 000A9B7C4C00
rs6000% java -version
java version "1.5.0"
Java(TM) 2 Runtime Environment, Standard Edition (build pap32dev-20091106a (SR11 ))
IBM J9 VM (build 2.3, J2RE 1.5.0 IBM J9 2.3 AIX ppc-32 j9vmap3223-20091104 (JIT enabled)
J9VM - 20091103_45935_bHdSMr
JIT - 20091016_1845_r8
GC - 20091026_AA)
JCL - 20091106
rs6000% mount|grep /home
/dev/hd1 /home jfs2 Jun 27 16:02 rw,log=/dev/hd8
</code></pre>
<p>Update: this still occurs on Java 6:</p>
<pre><code>% java -version
java version "1.6.0"
Java(TM) SE Runtime Environment (build pap3260sr11-20120806_01(SR11))
IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 AIX ppc-32 jvmap3260sr11-20120801_118201 (JIT enabled, AOT enabled)
J9VM - 20120801_118201
JIT - r9_20120608_24176ifx1
GC - 20120516_AA)
JCL - 20120713_01
</code></pre>
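<p>To make the suspected mismatch concrete, here is a minimal sketch of the byte-level round trip. It assumes the JVM really is decoding the on-disk UTF-8 bytes as ISO-8859-1, which the <code>Chars:</code> output suggests but which I have not confirmed. The class name <code>FilenameRecode</code> and both helper methods are hypothetical names of my own, not part of any IBM or JDK API.</p>

```java
import java.io.UnsupportedEncodingException;

public class FilenameRecode {
    // Hypothetical helper: take a name whose UTF-8 bytes were decoded as
    // ISO-8859-1 (one char per byte) and recover the intended Unicode string.
    static String latin1ToUtf8(String misdecoded) {
        try {
            return new String(misdecoded.getBytes("ISO-8859-1"), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Both charsets are required on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical helper: the inverse mapping. Produces the string a
    // Latin-1-decoding JVM would report for a UTF-8 encoded filename.
    static String utf8ToLatin1(String intended) {
        try {
            return new String(intended.getBytes("UTF-8"), "ISO-8859-1");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String intended = "oth\u00e9r";               // the name I expect
        String asReported = utf8ToLatin1(intended);   // what listFiles() appears to return

        System.out.print("Chars:");
        for (char c : asReported.toCharArray())
            System.out.print(" 0x" + Integer.toHexString((int) c));
        System.out.println();

        // Round trip back to the intended name.
        System.out.println(latin1ToUtf8(asReported).equals(intended));
    }
}
```

<p>Run on any JVM, this prints <code>Chars: 0x6f 0x74 0x68 0xc3 0xa9 0x72</code> followed by <code>true</code>, which matches the character dump above. Whether passing the <code>utf8ToLatin1</code> form to <code>new File(...)</code> would actually open the file on this J9 build is an assumption I have not verified, and in any case I would prefer a proper setting over re-encoding every name by hand.</p>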