
    <p>I suppose C# and Java produce equal byte arrays. If you have non-ASCII characters, it's not enough to add an additional 0. My example contains a few special characters:</p>
<pre><code>// The last character is U+1D11E, written here as its surrogate pair
var str = "Hell ö € Ω \uD834\uDD1E";
var bytes = [];
var charCode;

for (var i = 0; i &lt; str.length; ++i) {
    charCode = str.charCodeAt(i);
    bytes.push((charCode &amp; 0xFF00) &gt;&gt; 8); // high byte first (big-endian)
    bytes.push(charCode &amp; 0xFF);          // then low byte
}

alert(bytes.join(' '));
// 0 72 0 101 0 108 0 108 0 32 0 246 0 32 32 172 0 32 3 169 0 32 216 52 221 30
</code></pre>
    <p>I don't know if C# places a BOM (Byte Order Mark), but when using UTF-16, Java's <code>String.getBytes</code> prepends the following bytes: 254 255.</p>
<pre><code>String s = "Hell ö € Ω ";
// now add a character outside the BMP (Basic Multilingual Plane)
// we use the violin clef symbol (U+1D11E) MUSICAL SYMBOL G CLEF
s += new String(Character.toChars(0x1D11E));
// surrogate code units are: d834, dd1e, so one could also write "\ud834\udd1e"

byte[] bytes = s.getBytes("UTF-16");
for (byte aByte : bytes) {
    System.out.print((0xFF &amp; aByte) + " ");
}
// 254 255 0 72 0 101 0 108 0 108 0 32 0 246 0 32 32 172 0 32 3 169 0 32 216 52 221 30
</code></pre>
    <p><strong>Edit:</strong></p>
    <p>Added a special character (U+1D11E) MUSICAL SYMBOL G CLEF (outside the BMP, so it takes 4 bytes in UTF-16 rather than 2).</p>
    <p>Current JavaScript versions use "UCS-2" internally, so this symbol takes the space of two normal characters (its <code>length</code> is 2).</p>
    <p>I'm not sure, but when using <code>charCodeAt</code> it seems we get exactly the surrogate code units also used in UTF-16, so non-BMP characters are handled correctly.</p>
    <p>This problem is absolutely non-trivial. It may depend on the JavaScript version and engine in use. So if you want reliable solutions, you should have a look at:</p>
    <ul>
      <li><a href="https://github.com/koichik/node-codepoint/">https://github.com/koichik/node-codepoint/</a></li>
      <li><a href="http://mathiasbynens.be/notes/javascript-escapes">http://mathiasbynens.be/notes/javascript-escapes</a></li>
      <li>Mozilla Developer Network: charCodeAt</li>
      <li>BigEndian vs. LittleEndian</li>
    </ul>
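    <p>To compare the JavaScript result with the Java output directly, the loop above can be wrapped in a small helper that optionally emits the BOM and lets you choose the byte order. This is only a sketch: the function name <code>toUtf16Bytes</code> and its parameters <code>littleEndian</code> and <code>withBom</code> are made up for illustration and are not part of any library.</p>

```javascript
// Hypothetical helper (not from any library): encode a string as UTF-16 bytes.
// It relies on charCodeAt returning 16-bit code units, so characters outside
// the BMP arrive as two surrogate units and need no special handling here.
function toUtf16Bytes(str, littleEndian = false, withBom = false) {
  const bytes = [];

  // Split one 16-bit code unit into two bytes in the requested order.
  const pushUnit = (unit) => {
    const hi = (unit >> 8) & 0xFF;
    const lo = unit & 0xFF;
    if (littleEndian) {
      bytes.push(lo, hi);
    } else {
      bytes.push(hi, lo);
    }
  };

  // U+FEFF is the byte order mark; big-endian it serializes as 254 255.
  if (withBom) {
    pushUnit(0xFEFF);
  }

  for (let i = 0; i < str.length; ++i) {
    pushUnit(str.charCodeAt(i));
  }
  return bytes;
}

// Same text as in the answer, with U+1D11E written as a surrogate pair.
const sample = "Hell \u00F6 \u20AC \u03A9 \uD834\uDD1E";
console.log(toUtf16Bytes(sample, false, true).join(' '));
// 254 255 0 72 0 101 0 108 0 108 0 32 0 246 0 32 32 172 0 32 3 169 0 32 216 52 221 30
```

    <p>With <code>withBom</code> set and big-endian order, the result matches the byte sequence shown for Java's <code>getBytes("UTF-16")</code> above. Whether C# produces the same bytes depends on the encoding chosen there, so treat the little-endian switch as something to try when the two sides disagree.</p>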
 
