<p>As a guess, Python strings are reference-counted immutable strings, so no strings are copied around in the Python code, while C++ <code>std::string</code> is a mutable value type and is copied at the smallest opportunity.</p>

<p>If the goal is fast splitting, then one would use constant-time substring operations, which means only <em>referring</em> to parts of the original string, as in Python (and Java, and C#&hellip;).</p>

<p>The C++ <code>std::string</code> class has one redeeming feature, though: it is <em>standard</em>, so it can be used to pass strings safely and portably around where efficiency is not a main consideration. But enough chat. Here is the code, and on my machine this is of course faster than Python, since Python's string handling is implemented in C, which is a subset of C++ (he he):</p>

<pre><code>#include &lt;iostream&gt;
#include &lt;string&gt;
#include &lt;sstream&gt;
#include &lt;time.h&gt;
#include &lt;vector&gt;

using namespace std;

// Non-owning reference to a run of characters inside another string.
class StringRef
{
private:
    char const*     begin_;
    int             size_;

public:
    int size() const { return size_; }
    char const* begin() const { return begin_; }
    char const* end() const { return begin_ + size_; }

    StringRef( char const* const begin, int const size )
        : begin_( begin )
        , size_( size )
    {}
};

vector&lt;StringRef&gt; split3( string const&amp; str, char delimiter = ' ' )
{
    vector&lt;StringRef&gt;   result;

    enum State { inSpace, inToken };

    State state = inSpace;
    char const*     pTokenBegin = 0;    // Init to satisfy compiler.
    for( auto it = str.begin(); it != str.end(); ++it )
    {
        State const newState = (*it == delimiter? inSpace : inToken);
        if( newState != state )
        {
            switch( newState )
            {
            case inSpace:
                result.push_back( StringRef( pTokenBegin, &amp;*it - pTokenBegin ) );
                break;
            case inToken:
                pTokenBegin = &amp;*it;
            }
        }
        state = newState;
    }
    if( state == inToken )
    {
        // str.end() must not be dereferenced, so compute the end pointer from data().
        char const* const pEnd = str.data() + str.size();
        result.push_back( StringRef( pTokenBegin, pEnd - pTokenBegin ) );
    }
    return result;
}

int main() {
    string input_line;
    vector&lt;string&gt; spline;
    long count = 0;
    int sec, lps;
    time_t start = time(NULL);

    cin.sync_with_stdio(false); //disable synchronous IO

    while(cin) {
        getline(cin, input_line);
        //spline.clear(); //empty the vector for the next line to parse

        //I'm trying one of the two implementations, per compilation, obviously:
        //  split1(spline, input_line);
        //split2(spline, input_line);

        vector&lt;StringRef&gt; const v = split3( input_line );
        count++;
    };

    count--; //subtract for final over-read
    sec = (int)( time(NULL) - start );
    cerr &lt;&lt; "C++   : Saw " &lt;&lt; count &lt;&lt; " lines in " &lt;&lt; sec &lt;&lt; " seconds." ;
    if (sec &gt; 0) {
        lps = count / sec;
        cerr &lt;&lt; "  Crunch speed: " &lt;&lt; lps &lt;&lt; endl;
    }
    else
        cerr &lt;&lt; endl;
    return 0;
}

//compiled with: g++ -Wall -O3 -o split1 split_1.cpp -std=c++0x
</code></pre>

<p>Disclaimer: I hope there aren't any bugs. I haven't tested the functionality, but only checked the speed. But I think, even if there is a bug or two, correcting that won't significantly affect the speed.</p>
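<p>For comparison, the same "refer, don't copy" idea can be sketched with <code>std::string_view</code>, assuming a C++17 compiler is available (the code above targets <code>-std=c++0x</code>, where it is not). The name <code>split_views</code> is hypothetical and not part of the code above; like <code>split3</code>, it collapses runs of delimiters and returns references into the original buffer rather than copies.</p>

```cpp
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper (an illustration, not from the answer above): splits
// `text` on `delimiter` and returns non-owning views into `text`. Runs of
// delimiters are collapsed, as in split3. Each view is valid only while the
// underlying character buffer is alive and unmodified.
std::vector<std::string_view> split_views(std::string_view text, char delimiter = ' ')
{
    std::vector<std::string_view> tokens;
    std::size_t pos = 0;
    while (pos < text.size()) {
        // Skip any run of delimiters before the next token.
        pos = text.find_first_not_of(delimiter, pos);
        if (pos == std::string_view::npos) break;

        // Find where the token ends (or stop at the end of the text).
        std::size_t end = text.find(delimiter, pos);
        if (end == std::string_view::npos) end = text.size();

        // substr on a string_view only adjusts a pointer and a length.
        tokens.push_back(text.substr(pos, end - pos));
        pos = end;
    }
    return tokens;
}
```

<p>As with <code>StringRef</code>, the returned views must not outlive the string they point into, so passing a temporary <code>std::string</code> to this function and keeping the result would leave dangling references.</p>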
 
