Fancy way to read a file in C++: strange performance issue
    <p>The usual way to read a file in C++ is this one:</p>
    <pre><code>std::ifstream file("file.txt", std::ios::binary | std::ios::ate);
std::vector&lt;char&gt; data(file.tellg());
file.seekg(0, std::ios::beg);
file.read(data.data(), data.size());
</code></pre>
    <p>Reading a 1.6 MB file this way is almost instant.</p>
    <p>But recently, I discovered <a href="http://cplusplus.com/reference/std/iterator/istream_iterator/" rel="noreferrer">std::istream_iterator</a> and wanted to try it in order to write a neat one-liner that reads the content of a file. Like this:</p>
    <pre><code>std::vector&lt;char&gt; data(std::istream_iterator&lt;char&gt;(std::ifstream("file.txt", std::ios::binary)), std::istream_iterator&lt;char&gt;());
</code></pre>
    <p>The code is nice, but <strong>very</strong> slow. It takes about 2 to 3 seconds to read the same 1.6 MB file. I understand that it may not be the best way to read a file, but why is it <strong>so</strong> slow?</p>
    <p>Reading a file the classical way goes like this (I'm talking only about the read function):</p>
    <ul>
    <li>the istream contains a <a href="http://cplusplus.com/reference/iostream/filebuf/" rel="noreferrer">filebuf</a>, which holds a block of data from the file</li>
    <li>the read function calls <a href="http://cplusplus.com/reference/iostream/streambuf/sgetn/" rel="noreferrer">sgetn</a> on the filebuf, which copies the chars one by one (no memcpy) from the internal buffer to "data"'s buffer</li>
    <li>when the data inside the filebuf has been entirely read, the filebuf reads the next block from the file</li>
    </ul>
    <p>When you read a file using istream_iterator, it goes like this:</p>
    <ul>
    <li>the vector calls *iterator to get the next char (this simply reads a variable), adds it to the end and increases its own size</li>
    <li>if the vector's allocated space is full (which does not happen often), a reallocation is performed</li>
    <li>then it calls ++iterator, which reads the next char from the stream (operator &gt;&gt; with a char parameter, which presumably just calls the filebuf's sbumpc function)</li>
    <li>finally it compares the iterator with the end iterator, which is done by comparing two pointers</li>
    </ul>
    <p>I must admit that the second way is not very efficient, but it's at least 200 times slower than the first way. How is that possible?</p>
    <p>I thought that the performance killer was the reallocations or the insert, but I tried preallocating the whole vector and calling std::copy, and it's just as slow.</p>
    <pre><code>// also very slow:
std::vector&lt;char&gt; data2(1730608);
std::copy(std::istream_iterator&lt;char&gt;(std::ifstream("file.txt", std::ios::binary)), std::istream_iterator&lt;char&gt;(), data2.begin());
</code></pre>
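    <p>To make the comparison above reproducible, here is a minimal sketch that wraps both approaches in functions and times them. The names read_bulk, read_with_istream_iterator and time_ms are hypothetical helpers, not part of the original code. The iterator version uses a named stream, because std::istream_iterator takes a std::istream&amp; and binding a temporary ifstream to it is not standard C++ (an assumption about why the one-liner compiles only on some compilers).</p>

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Bulk read: open at the end to learn the size, then fetch everything
// with a single unformatted read() call.
std::vector<char> read_bulk(const std::string& path) {
    std::ifstream file(path, std::ios::binary | std::ios::ate);
    if (!file) return {};
    const std::streamoff size = file.tellg();
    if (size <= 0) return {};
    std::vector<char> data(static_cast<std::size_t>(size));
    file.seekg(0, std::ios::beg);
    file.read(data.data(), static_cast<std::streamsize>(data.size()));
    data.resize(static_cast<std::size_t>(file.gcount()));
    return data;
}

// Iterator read: every ++iterator performs one formatted operator>>
// extraction of a char. Formatted extraction skips whitespace by default,
// so the result can be shorter than the file.
std::vector<char> read_with_istream_iterator(const std::string& path) {
    std::ifstream file(path, std::ios::binary);
    return std::vector<char>(std::istream_iterator<char>(file),
                             std::istream_iterator<char>());
}

// Hypothetical timing helper: runs reader(path) once, stores the number of
// bytes returned in bytes_out, and returns the elapsed time in milliseconds.
template <typename Reader>
double time_ms(Reader reader, const std::string& path, std::size_t& bytes_out) {
    const auto start = std::chrono::steady_clock::now();
    const std::vector<char> data = reader(path);
    const auto stop = std::chrono::steady_clock::now();
    bytes_out = data.size();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

    <p>Calling time_ms(read_bulk, "file.txt", n) and time_ms(read_with_istream_iterator, "file.txt", n) from main gives the two timings and byte counts side by side. If the byte counts differ, that is the whitespace skipping of the formatted extraction, which also hints that each ++iterator does more work than a plain sbumpc call.</p>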