    <p>Encoding in C++ is quite complicated. Here is my understanding of it.</p>
    <p>Every implementation has to support the characters of the <em>basic source character set</em>. These are the common characters listed in §2.2/1 (§2.3/1 in C++11), and they should each fit into one <code>char</code>. In addition, implementations have to support a way of naming other characters, called <code>universal-character-names</code>. These look like <code>\uffff</code> or <code>\Uffffffff</code> and can be used to refer to Unicode characters. A subset of them is usable in identifiers (listed in Annex E).</p>
    <p>This is all nice, but the mapping from the characters in the file to source characters (used at compile time) is implementation-defined. This mapping constitutes the encoding used. Here is what the standard says literally (C++98 version):</p>
    <blockquote> <p>Physical source file characters are mapped, in an implementation-defined manner, to the basic source character set (introducing new-line characters for end-of-line indicators) if necessary. Trigraph sequences (2.3) are replaced by corresponding single-character internal representations. Any source file character not in the basic source character set (2.2) is replaced by the universal-character-name that designates that character. (An implementation may use any internal encoding, so long as an actual extended character encountered in the source file, and the same extended character expressed in the source file as a universal-character-name (i.e. using the \uXXXX notation), are handled equivalently.)</p> </blockquote>
    <p>For gcc, you can change the source encoding using the option <code>-finput-charset=charset</code>. Additionally, you can change the execution character set used to represent values at runtime. The proper option for this is <code>-fexec-charset=charset</code> for <code>char</code> (it defaults to <code>utf-8</code>) and <code>-fwide-exec-charset=charset</code> for <code>wchar_t</code> (which defaults to either <code>utf-16</code> or <code>utf-32</code> depending on the size of <code>wchar_t</code>).</p>
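    <p>To see the execution character set at work, here is a minimal sketch. The helper names <code>hex_bytes</code>, <code>narrow_e_acute</code> and <code>wide_e_acute</code> are made up for this illustration, and the byte values in the comments assume gcc's default execution character sets. The code only uses universal-character-names, so it does not depend on how the source file itself is encoded.</p>

```cpp
#include <cassert>
#include <cwchar>
#include <string>

// Hypothetical helper: renders the bytes of a narrow string as lowercase hex,
// separated by spaces, so the execution encoding becomes visible.
std::string hex_bytes(const char* s) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (; *s != '\0'; ++s) {
        unsigned char b = static_cast<unsigned char>(*s);
        if (!out.empty()) out += ' ';
        out += digits[b >> 4];
        out += digits[b & 0x0F];
    }
    return out;
}

// U+00E9 (e with acute accent), named via a universal-character-name.
// In a narrow literal it is encoded in the narrow execution character set:
// two bytes (c3 a9) when that set is UTF-8, which is gcc's default.
const char* narrow_e_acute() { return "\u00e9"; }

// In a wide literal it is encoded in the wide execution character set:
// a single wchar_t with value 0xE9 under both UTF-16 and UTF-32.
const wchar_t* wide_e_acute() { return L"\u00e9"; }
```

    <p>Calling <code>hex_bytes(narrow_e_acute())</code> from <code>main</code> and printing the result shows which bytes the compiler chose. Rebuilding with something like <code>-fexec-charset=latin1</code> should change the narrow result to the single byte <code>e9</code>, assuming the charset name is one your iconv installation knows.</p>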
 
