<p>If the question concerns just <code>'9'</code> (or any other of the Roman digits, <code>'0'</code> through <code>'9'</code>), subtracting <code>'0'</code> is the correct solution. If you're concerned with anything for which <code>iswdigit</code> returns non-zero, however, the issue may be far more complex. The standard says that <code>iswdigit</code> returns a non-zero value if its argument is "a decimal digit wide-character code [in the current locale]", which is vague and leaves it up to the locale to define exactly what is meant. In the "C" or "Posix" locale, the Posix standard, at least, guarantees that only the Roman digits zero through nine are considered decimal digits (if I understand it correctly), so in those locales just subtracting <code>'0'</code> should work.</p>
<p>Presumably, in a Unicode locale, this would be any character which has the general category <code>Nd</code>. There are a number of these. The safest solution would be simply to create something like (variables here with static lifetime):</p>
<pre><code>#include &lt;algorithm&gt;  // std::find
#include &lt;iterator&gt;   // std::begin, std::end

// One table of ten consecutive digits (zero through nine) per script.
wchar_t const* const digitTables[] =
{
    L"0123456789",
    L"\u0660\u0661\u0662\u0663\u0664\u0665\u0666\u0667\u0668\u0669",
    // ...
};

//! \return
//!     wch as a numeric digit, or -1 if it is not a digit
int asNumeric( wchar_t wch )
{
    int result = -1;
    for ( wchar_t const* const* p = std::begin( digitTables );
            p != std::end( digitTables ) &amp;&amp; result == -1;
            ++ p ) {
        wchar_t const* q = std::find( *p, *p + 10, wch );
        if ( q != *p + 10 ) {
            result = static_cast&lt;int&gt;( q - *p );
        }
    }
    return result;
}
</code></pre>
<p>If you go this way:</p>
<ol>
<li>you'll definitely want to download the <code>UnicodeData.txt</code> file from the Unicode consortium (the "<a href="http://www.unicode.org/ucd/" rel="noreferrer">Unicode Character Database</a>" page has links to both the Unicode data file and an explanation of the encodings used in it), and</li>
<li>possibly write a simple parser of this file to extract the information automatically (e.g. when there is a new version of Unicode); the file is designed for simple programmatic parsing.</li>
</ol>
<p>Finally, note that solutions based on <code>ostringstream</code> and <code>istringstream</code> (this includes <code>boost::lexical_cast</code>) will not work, since the conversions used in streams are defined to only use the Roman digits. On the other hand, it might be reasonable to restrict your code to just the Roman digits. In that case, the test becomes <code>if ( wch &gt;= L'0' &amp;&amp; wch &lt;= L'9' )</code>, and the conversion is done by simply subtracting <code>L'0'</code>, always supposing that the native encoding of wide character constants in your compiler is Unicode (the case, I'm pretty sure, of both VC++ and g++). Or just ensure that the locale is "C" (or "Posix", on a Unix machine).</p>
<p>EDIT: I forgot to mention: if you're doing any serious Unicode programming, you should look into <a href="http://site.icu-project.org/" rel="noreferrer">ICU</a>. Handling Unicode correctly is extremely non-trivial, and they've a lot of functionality already implemented.</p>