<p>There are two aspects to this question.</p>
<ol>
<li><p>Distinguish UTF-8-encoded characters from ordinary ASCII characters.</p>
<p>UTF-8 encodes any code point higher than 127 as a series of two or more bytes. Values at 127 and lower remain untouched. The resultant bytes from the encoding are also higher than 127, so it is sufficient to check a byte's high bit to see whether it qualifies.</p></li>
<li><p>Display the encoded characters in hexadecimal.</p>
<p>C++ has <a href="http://dinkumware.com/manuals/?manual=compleat&amp;page=ios.html#hex" rel="nofollow noreferrer"><code>std::hex</code></a> to tell streams to format numeric values in hexadecimal. You can use <a href="http://dinkumware.com/manuals/?manual=compleat&amp;page=ios.html#showbase" rel="nofollow noreferrer"><code>std::showbase</code></a> to make the output look pretty. A <code>char</code> isn't treated as numeric, though; streams will just print the character. You'll have to force the value to another numeric type, such as <code>int</code>. Beware of sign-extension, though.</p></li>
</ol>
<p>Here's some code to demonstrate:</p>
<pre><code>#include &lt;iostream&gt;

void print_characters(char const* s)
{
  std::cout &lt;&lt; std::showbase &lt;&lt; std::hex;
  for (char const* pc = s; *pc; ++pc) {
    if (*pc &amp; 0x80)
      std::cout &lt;&lt; (*pc &amp; 0xff);
    else
      std::cout &lt;&lt; *pc;
    std::cout &lt;&lt; ' ';
  }
  std::cout &lt;&lt; std::endl;
}
</code></pre>
<p>You could call it like this:</p>
<pre><code>int main()
{
  char const* test = "ab\xef\xbb\xbfhu";
  print_characters(test);
  return 0;
}
</code></pre>
<p>Output on Solaris 10 with Sun C++ 5.8:</p>
<pre>
$ ./a.out
a b 0xef 0xbb 0xbf h u
</pre>
<p>The code detects UTF-8-encoded characters, but it makes no effort to decode them; you didn't mention needing to do that.</p>
<p>I used <code>*pc &amp; 0xff</code> to convert the expression to an integral type and to mask out the sign-extended bits. Without that, the output on my computer was <code>0xffffffbb</code>, for instance.</p>
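<p>As a minimal sketch of another way to avoid the sign-extension problem, the byte can be converted to <code>unsigned char</code> before it is widened to <code>int</code>. The helper name <code>format_characters</code> is hypothetical and not part of the code above; it returns the formatted text as a string instead of printing it, which makes the result easy to check.</p>

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper (an illustration, not part of the original answer):
// produces the same space-separated output as print_characters, but
// returns it as a string instead of writing to std::cout.
std::string format_characters(char const* s)
{
  assert(s != nullptr);
  std::ostringstream out;
  out << std::showbase << std::hex;
  for (char const* pc = s; *pc; ++pc) {
    // Converting to unsigned char first keeps the value in 0..255,
    // so widening to int cannot sign-extend it.
    unsigned char const byte = static_cast<unsigned char>(*pc);
    if (byte >= 0x80)
      out << static_cast<int>(byte);  // high bit set: print as hex number
    else
      out << *pc;                     // plain ASCII: print the character
    out << ' ';
  }
  return out.str();
}
```

<p>Like the original, this only flags bytes with the high bit set; it does not decode them into code points.</p>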