  1. POString class based on graphemes?
    <p>I'm wondering why we don't have string classes that represent a string as a sequence of Unicode grapheme clusters rather than code points or characters. It seems to me that in most applications it would be easier for programmers to access the components of a grapheme when necessary than to have to assemble graphemes from code points, which appears to be necessary even if only to avoid casually breaking a string "mid-grapheme" (at least in theory). Internally, such a class might use a variable-length encoding such as UTF-8 or UTF-16 (in this context even UTF-32 is variable length, since one grapheme can span several code points), or it might implement subclasses for all of them and optionally let the choice be configured at run time, so that different languages could use their optimal encodings. But if programmers could "see" grapheme units when inspecting a string, wouldn't string-handling code in general come closer to correctness, without much extra complexity?</p> <p>References:<br> <a href="http://unicode.org/faq/char_combmark.html#7" rel="nofollow noreferrer">Characters and Combining Marks</a><br> <a href="http://useless-factor.blogspot.com/2007/08/unicode-implementers-guide-part-4.html" rel="nofollow noreferrer">Unicode implementer's guide part 4: grapheme breaking</a><br> <a href="http://icu-project.org/apiref/icu4c/classUnicodeString.html" rel="nofollow noreferrer">UnicodeString Class Reference</a><br> <a href="https://stackoverflow.com/questions/2056866/enumerating-a-string-by-grapheme-instead-of-character">Enumerating a string by grapheme instead of character</a><br> <a href="https://stackoverflow.com/questions/3950588/strings-and-character-encoding-in-c">Strings and character encoding in C++</a> </p>
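To make the idea in the question concrete, here is a minimal sketch of what a grapheme-indexed string class might look like. It is written in Python purely for illustration, and the names `simple_clusters` and `GraphemeString` are hypothetical. It also makes a loud simplifying assumption: a "grapheme" here is just a base code point plus any combining marks that follow it. That is not the full Unicode grapheme cluster segmentation (it ignores Hangul syllables, emoji ZWJ sequences, regional indicators, and so on), so treat it as a toy model of the interface rather than a correct implementation.

```python
import unicodedata


def simple_clusters(text):
    """Group each base code point with the combining marks that follow it.

    Simplified assumption: a code point with a non-zero canonical combining
    class attaches to the preceding cluster. Real grapheme segmentation
    has many more rules than this.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch  # extend the current cluster
        else:
            clusters.append(ch)  # start a new cluster
    return clusters


class GraphemeString:
    """Hypothetical string type whose indexing unit is a (simplified) grapheme."""

    def __init__(self, text):
        self._clusters = simple_clusters(text)

    def __len__(self):
        # Length in grapheme units, not code points.
        return len(self._clusters)

    def __getitem__(self, index):
        # Slices return a new GraphemeString and can never cut mid-cluster.
        if isinstance(index, slice):
            return GraphemeString("".join(self._clusters[index]))
        return self._clusters[index]

    def __str__(self):
        return "".join(self._clusters)

    def code_points(self, index):
        """Expose the components of one grapheme when they are needed."""
        return [ord(c) for c in self._clusters[index]]


word = "cafe\u0301"            # 'e' followed by U+0301 COMBINING ACUTE ACCENT
g = GraphemeString(word)
print(len(word), len(g))       # 5 code points, 4 grapheme units
print(str(g[:3]))              # "caf": the slice boundary falls between clusters
print(g.code_points(3))        # [101, 769]
```

The point of the design is that the default view (length, indexing, slicing) works on whole grapheme units, while the code points inside a unit remain reachable through an explicit call, which is the trade-off the question argues for.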