<p>Character traits are an extremely important component of the streams and strings libraries because they allow the stream/string classes to separate the logic of <em>what characters are being stored</em> from the logic of <em>what manipulations should be performed on those characters.</em></p>
<p>To begin with, the default character traits class, <code>char_traits&lt;T&gt;</code>, is used extensively in the C++ standard. For example, <code>std::string</code> is not a class in its own right. Rather, there's a class template <code>std::basic_string</code> that looks like this:</p>
<pre><code>template &lt;typename charT,
          typename traits = char_traits&lt;charT&gt;,
          typename Allocator = allocator&lt;charT&gt; &gt;
class basic_string;
</code></pre>
<p>Then, <code>std::string</code> is defined as</p>
<pre><code>typedef basic_string&lt;char&gt; string;
</code></pre>
<p>Similarly, the standard streams are defined as</p>
<pre><code>template &lt;typename charT, typename traits = char_traits&lt;charT&gt; &gt;
class basic_istream;

typedef basic_istream&lt;char&gt; istream;
</code></pre>
<p>So why are these classes structured as they are? Why should we be using a weird traits class as a template argument?</p>
<p>The reason is that in some cases we might want to have a string just like <code>std::string</code>, but with some slightly different properties. One classic example of this is if you want to store strings in a way that ignores case. For example, I might want to make a string called <code>CaseInsensitiveString</code> such that I can have</p>
<pre><code>CaseInsensitiveString c1 = "HI!", c2 = "hi!";
if (c1 == c2) { // Always true
    cout &lt;&lt; "Strings are equal." &lt;&lt; endl;
}
</code></pre>
<p>That is, I can have a string type where two strings differing only in case compare equal.</p>
<p>Now, suppose that the standard library authors designed strings without using traits. 
This would mean that the standard library had an immensely powerful string class that was entirely useless in my situation. I couldn't reuse much of the code for this string class, since comparisons would always work against how I wanted them to work. But by using traits, it's actually possible to reuse the code that drives <code>std::string</code> to get a case-insensitive string.</p>
<p>If you pull up a copy of the C++ ISO standard and look at the definition of how the string's comparison operators work, you'll see that they're all defined in terms of the <code>compare</code> member function. This function is in turn defined by calling</p>
<pre><code>traits::compare(this-&gt;data(), str.data(), rlen)
</code></pre>
<p>where <code>str</code> is the string you're comparing to and <code>rlen</code> is the smaller of the two string lengths. (If that call returns zero, the result is decided by the two lengths alone.) This is actually quite interesting, because it means that the string's <code>compare</code> directly uses the <code>compare</code> function exported by the traits type specified as a template parameter! Consequently, if we define a new traits class whose <code>compare</code> compares characters case-insensitively, we can build a string class that behaves just like <code>std::string</code>, but treats things case-insensitively!</p>
<p>Here's an example. 
We inherit from <code>std::char_traits&lt;char&gt;</code> to get the default behavior for all the functions we don't write:</p>
<pre><code>#include &lt;cctype&gt;   // std::tolower
#include &lt;cstddef&gt;  // std::size_t
#include &lt;string&gt;   // std::char_traits, std::basic_string

class CaseInsensitiveTraits : public std::char_traits&lt;char&gt; {
public:
    // Character comparisons fold both sides to lowercase first.
    static bool lt(char one, char two) {
        return lower(one) &lt; lower(two);
    }
    static bool eq(char one, char two) {
        return lower(one) == lower(two);
    }
    // Lexicographically compare the first `length` characters.
    static int compare(const char* one, const char* two, std::size_t length) {
        for (std::size_t i = 0; i &lt; length; ++i) {
            if (lt(one[i], two[i])) return -1;
            if (lt(two[i], one[i])) return +1;
        }
        return 0;
    }
private:
    // The cast avoids undefined behavior in std::tolower for negative chars.
    static int lower(char c) {
        return std::tolower(static_cast&lt;unsigned char&gt;(c));
    }
};
</code></pre>
<p>(Notice I've also defined <code>eq</code> and <code>lt</code> here, which compare characters for equality and less-than, respectively, and then defined <code>compare</code> in terms of <code>lt</code>.)</p>
<p>Now that we have this traits class, we can define <code>CaseInsensitiveString</code> trivially as</p>
<pre><code>typedef std::basic_string&lt;char, CaseInsensitiveTraits&gt; CaseInsensitiveString;
</code></pre>
<p>And voila! We now have a string that compares case-insensitively!</p>
<p>Of course, there are other reasons besides this for using traits. For example, if you want to define a string that uses some underlying character type of a fixed size, then you can specialize <code>char_traits</code> on that type and then make strings from that type. In the Windows API, for example, there's a type <code>TCHAR</code> that is either a narrow or wide character depending on what macros you set during preprocessing. Since <code>char_traits</code> already has specializations for both <code>char</code> and <code>wchar_t</code>, you can make strings out of <code>TCHAR</code>s by writing</p>
<pre><code>typedef basic_string&lt;TCHAR&gt; tstring;
</code></pre>
<p>And now you have a string of <code>TCHAR</code>s.</p>
<p>In all of these examples, notice that we just defined some traits class (or used one that already existed) as a parameter to some template type in order to get a string for that type. 
The whole point of this is that the <code>basic_string</code> author only needs to specify how the traits are used. We can then substitute our own traits for the default to get strings with some nuance or quirk that is not part of the default string type.</p>
<p>Hope this helps!</p>
<p><strong>EDIT</strong>: As @phooji pointed out, this notion of traits is not just used by the STL, nor is it specific to C++. As a completely shameless self-promotion, a while back I wrote <a href="http://www.keithschwarz.com/interesting/code/?dir=ternary-search-tree">an implementation of a ternary search tree</a> (a type of radix tree <a href="http://en.wikipedia.org/wiki/Ternary_search_tree">described here</a>) that uses traits to store strings of any type, compared in whatever way the client wants. It might be an interesting read if you want to see an example of where this is used in practice.</p>
<p><strong>EDIT</strong>: In response to your claim that <code>std::string</code> doesn't use <code>traits::length</code>, it turns out that it does in a few places. Most notably, when you construct a <code>std::string</code> out of a <code>char*</code> C-style string, the length of the new string is obtained by calling <code>traits::length</code> on that string. It seems that <code>traits::length</code> is used mostly to deal with C-style sequences of characters, which are the "least common denominator" of strings in C++, while <code>std::string</code> is used to work with strings of arbitrary contents.</p>
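<p>To make the case-insensitive example above concrete, here is a minimal, self-contained sketch, assuming C++11 or later. The names <code>CiTraits</code> and <code>CiString</code> are hypothetical and chosen only for this illustration. Besides <code>eq</code>, <code>lt</code>, and <code>compare</code>, it also overrides <code>find</code>, on the assumption that the library may route single-character searches through <code>traits::find</code>; whether it does so is implementation-dependent.</p>

```cpp
#include <cctype>   // std::tolower
#include <cstddef>  // std::size_t
#include <string>   // std::char_traits, std::basic_string

// Hypothetical traits class: inherits the defaults from std::char_traits<char>
// and replaces only the operations that look at character values.
struct CiTraits : public std::char_traits<char> {
    // Fold to lowercase; the cast keeps std::tolower's argument non-negative.
    static int lower(char c) {
        return std::tolower(static_cast<unsigned char>(c));
    }
    static bool eq(char a, char b) { return lower(a) == lower(b); }
    static bool lt(char a, char b) { return lower(a) < lower(b); }

    // Lexicographic comparison of the first n characters, ignoring case.
    static int compare(const char* a, const char* b, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            if (lt(a[i], b[i])) return -1;
            if (lt(b[i], a[i])) return +1;
        }
        return 0;
    }

    // Case-insensitive search for c in the first n characters of s.
    static const char* find(const char* s, std::size_t n, char c) {
        for (std::size_t i = 0; i < n; ++i) {
            if (eq(s[i], c)) return s + i;
        }
        return nullptr;
    }
};

// Hypothetical alias: a string whose comparisons go through CiTraits.
typedef std::basic_string<char, CiTraits> CiString;
```

<p>With these definitions, <code>CiString("HI!") == CiString("hi!")</code> holds, and <code>CiString("apple") &lt; CiString("BANANA")</code> holds even though the same comparison on <code>std::string</code> goes the other way. One design consequence worth knowing: the standard stream inserter only accepts strings whose traits match the stream's, so printing a <code>CiString</code> to <code>std::cout</code> needs something like <code>std::cout &lt;&lt; s.c_str()</code>.</p>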
 
