<p>When you write code like <code>string abc = "मेरा";</code>, the string is already Unicode (specifically, UTF-16), so you don't have to convert anything. To access individual characters, use normal indexing: e.g. <code>abc[1]</code> is <code>े</code> (DEVANAGARI VOWEL SIGN E).</p>
<p>If you want to see the numeric representations of those characters, just cast them to integers. For example (with <code>using System.Linq;</code> in scope)</p>
<pre><code>abc.Select(c =&gt; (int)c)
</code></pre>
<p>gives the sequence of numbers 2350, 2375, 2352, 2366. If you want to see the hexadecimal representation of those numbers, use <code>ToString("x4")</code>:</p>
<pre><code>abc.Select(c =&gt; ((int)c).ToString("x4"))
</code></pre>
<p>returns the sequence of strings "092e", "0947", "0930", "093e".</p>
<p>Note that when I said numeric representations, I actually meant the UTF-16 code units. For characters in the <a href="http://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane" rel="nofollow noreferrer">Basic Multilingual Plane</a> (BMP), a code unit is the same as the Unicode code point. The vast majority of characters in common use lie in the BMP, including the 4 Hindi characters presented here.</p>
<p>If you want to handle characters in other planes too, you could use code like the following.</p>
<pre><code>// Requires: using System; using System.Text;
// Encoding.UTF32 emits little-endian bytes; BitConverter uses the machine's
// byte order, so this assumes a little-endian machine.
byte[] bytes = Encoding.UTF32.GetBytes(abc);
int codePointCount = bytes.Length / 4; // UTF-32 uses 4 bytes per code point
int[] codePoints = new int[codePointCount];
for (int i = 0; i &lt; codePointCount; i++)
    codePoints[i] = BitConverter.ToInt32(bytes, i * 4);
</code></pre>
<p>Since UTF-32 encodes every (21-bit) code point directly as one 32-bit value, the resulting array holds the code points themselves. (Maybe there is a more straightforward solution, but I haven't found one.)</p>
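<p>A possibly more direct route, offered as a sketch rather than as the approach above: <code>char.ConvertToUtf32(string, int)</code> reads the code point that starts at a given index, and <code>char.IsSurrogatePair(string, int)</code> reports whether that code point occupies two <code>char</code> values. The <code>CodePoints</code> class and its <code>Enumerate</code> method are hypothetical names introduced for illustration, and the code assumes the string is well-formed UTF-16 (a lone surrogate makes <code>ConvertToUtf32</code> throw).</p>
<pre><code>using System;
using System.Collections.Generic;

static class CodePoints
{
    // Hypothetical helper: yields one Unicode code point per character,
    // combining each surrogate pair into a single value.
    public static IEnumerable&lt;int&gt; Enumerate(string s)
    {
        for (int i = 0; i &lt; s.Length; i++)
        {
            // Reads one char, or a high/low surrogate pair, starting at i.
            yield return char.ConvertToUtf32(s, i);

            // A pair spans two chars, so step over the low surrogate.
            if (char.IsSurrogatePair(s, i))
                i++;
        }
    }
}
</code></pre>
<p>For the string above, <code>CodePoints.Enumerate("मेरा")</code> yields 2350, 2375, 2352, 2366, the same values as the cast. For a character outside the BMP, such as <code>"\U0001F600"</code>, it yields the single value 128512 instead of two surrogate code units.</p>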
 
