<p>In C++0x, <code>char16_t</code> and <code>char32_t</code>, not <code>wchar_t</code>, will be used to store UTF-16 and UTF-32.</p>
<p>From the draft n2798:</p>
<blockquote>
<p><strong>22.2.1.4 Class template codecvt</strong></p>
<p>2 The class codecvt is for use when converting from one codeset to another, such as from wide characters to multibyte characters or between wide character encodings such as Unicode and EUC.</p>
<p>3 The specializations required in Table 76 (22.1.1.1.1) convert the implementation-defined native character set. <code>codecvt&lt;char, char, mbstate_t&gt;</code> implements a degenerate conversion; it does not convert at all. The specialization <code>codecvt&lt;char16_t, char, mbstate_t&gt;</code> converts between the UTF-16 and UTF-8 encodings schemes, and the specialization <code>codecvt &lt;char32_t, char, mbstate_t&gt;</code> converts between the UTF-32 and UTF-8 encodings schemes. <code>codecvt&lt;wchar_t,char,mbstate_t&gt;</code> converts between the native character sets for narrow and wide characters. Specializations on <code>mbstate_t</code> perform conversion between encodings known to the library implementor.</p>
<p>Other encodings can be converted by specializing on a user-defined stateT type. The stateT object can contain any state that is useful to communicate to or from the specialized do_in or do_out members.</p>
</blockquote>
<p>The <em>thing</em> about <code>wchar_t</code> is that it does not give you any guarantees about the encoding used. It is a type that can hold a wide character of the implementation's choosing. Period. If you are going to write software <em>now</em>, you have to live with this compromise. Fully C++0x-compliant compilers are still a long way off. You can give the VC2010 CTP and g++ compilers a try, for what it is worth. Moreover, <code>wchar_t</code> has different sizes on different platforms, which is another thing to watch out for (2 bytes on VS/Windows, 4 bytes on GCC/Mac and so on). Options like <code>-fshort-wchar</code> for GCC complicate the issue further.</p>
<p>The best solution therefore is to use an existing library. Chasing Unicode bugs around isn't the best possible use of effort or time. I'd suggest you take a look at:</p>
<ul>
<li>GNU <a href="http://www.gnu.org/software/libiconv/" rel="nofollow noreferrer">libiconv</a></li>
<li>IBM's <a href="http://www-01.ibm.com/software/globalization/icu/" rel="nofollow noreferrer">libicu</a></li>
</ul>
<p>More on C++0x Unicode string literals <a href="http://en.wikipedia.org/wiki/C%2B%2B0x#New_string_literals" rel="nofollow noreferrer">here</a>.</p>
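<p>To make the difference between the fixed-encoding types and <code>wchar_t</code> concrete, here is a minimal sketch, assuming a compiler that supports the <code>u""</code> and <code>U""</code> literals. The helper names (<code>utf16_units</code>, <code>utf32_units</code>, <code>wide_units</code>, <code>check_code_units</code>) are made up for this illustration and are not part of any library. It counts the code units needed for U+1D11E, a character outside the Basic Multilingual Plane.</p>

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <string>

// char16_t and char32_t must be wide enough for a UTF-16 / UTF-32 code unit.
// No comparable statement can be made about the encoding held by wchar_t.
static_assert(sizeof(char16_t) * CHAR_BIT >= 16, "char16_t holds a UTF-16 code unit");
static_assert(sizeof(char32_t) * CHAR_BIT >= 32, "char32_t holds a UTF-32 code unit");

// Hypothetical helpers: number of code units used for U+1D11E.
inline std::size_t utf16_units() {
    return std::u16string(u"\U0001D11E").size();  // surrogate pair in UTF-16
}

inline std::size_t utf32_units() {
    return std::u32string(U"\U0001D11E").size();  // one code unit in UTF-32
}

inline std::size_t wide_units() {
    // Implementation-dependent: typically 1 where wchar_t is 32 bits wide
    // and 2 where it is 16 bits wide (assumption about common platforms).
    return std::wstring(L"\U0001D11E").size();
}

// Hypothetical self-check that can be called from main().
inline void check_code_units() {
    assert(utf16_units() == 2);
    assert(utf32_units() == 1);
    assert(wide_units() == 1 || wide_units() == 2);
}
```

<p>The first two counts are the same on every conforming implementation, while the <code>wchar_t</code> count depends on the platform and on flags such as <code>-fshort-wchar</code>. That is the portability problem described above.</p>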
 
