Parsing text from CMemFile line by line
<p>I have a huge text file loaded into a <code>CMemFile</code> object and would like to parse it line by line (lines are separated by newline characters).<br> Originally it is a zip file on disk, and I unzip it into memory to parse it, hence the <code>CMemFile</code>.</p>
<p>One working way to read it line by line is this (<code>m_file</code> is a smart pointer to a <code>CMemFile</code>):</p>
<pre><code>CArchive archive(m_file.get(), CArchive::load);
CString line;
while(archive.ReadString(line))
{
    ProcessLine(string(line));
}
</code></pre>
<p>Since this takes a lot of time, I tried to write my own routine:</p>
<pre><code>const UINT READSIZE = 1024;
const char NEWLINE = '\n';
char readBuffer[READSIZE];
UINT bytesRead = 0;
char *posNewline = NULL;
ULONGLONG currentPosition = 0;
ULONGLONG newlinePositionInBuffer = 0;

do
{
    currentPosition = m_file-&gt;GetPosition();
    bytesRead = m_file-&gt;Read(&amp;readBuffer, READSIZE);
    if(bytesRead == 0) break; // EOF
    // End of the bytes actually read (may be less than READSIZE near EOF)
    const char* itEnd = readBuffer + bytesRead;
    posNewline = std::find(readBuffer, readBuffer + bytesRead, NEWLINE);
    if(posNewline != itEnd)
    {
        // found newline
        ProcessLine(string(readBuffer, posNewline));
        newlinePositionInBuffer = posNewline - readBuffer + 1; // +1 to skip \n
        m_file-&gt;Seek(currentPosition + newlinePositionInBuffer, CFile::begin);
    }
} while(true);
</code></pre>
<p>Measuring the performance showed that both methods take about the same time...</p>
<p><strong>Can you think of any performance improvements or a faster way to do the parsing?</strong></p>
<p>Thanks for any advice.</p>
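For comparison, here is a minimal sketch of a single-pass variant that avoids the repeated Read/Seek round trips. It assumes the unzipped data is already available as one contiguous buffer. How that pointer is obtained from the <code>CMemFile</code> (for example via <code>CMemFile::Detach</code>) is not shown. The names <code>ForEachLine</code> and <code>SplitLines</code> are hypothetical helpers, not part of MFC or of the code above, and stripping a trailing <code>\r</code> is an assumption about the input. Whether this is actually faster would have to be measured.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): walks a contiguous
// in-memory buffer once and calls `onLine` for every line, without any
// Seek/Read round trips.
template <typename Callback>
void ForEachLine(const char* data, std::size_t size, Callback onLine)
{
    const char* cur = data;
    const char* end = data + size;
    while (cur < end)
    {
        // memchr scans the remaining bytes for the next '\n'.
        const char* nl = static_cast<const char*>(
            std::memchr(cur, '\n', static_cast<std::size_t>(end - cur)));
        const char* lineEnd = nl ? nl : end;   // the last line may lack '\n'
        const char* trimmed = lineEnd;
        if (trimmed > cur && trimmed[-1] == '\r')
            --trimmed;                          // assumption: drop '\r' of "\r\n"
        onLine(std::string(cur, trimmed));
        cur = nl ? nl + 1 : end;                // continue after the newline
    }
}

// Convenience wrapper for illustration only: collects all lines in a vector.
std::vector<std::string> SplitLines(const std::string& text)
{
    std::vector<std::string> lines;
    ForEachLine(text.data(), text.size(),
                [&lines](const std::string& line) { lines.push_back(line); });
    return lines;
}
```

In the setting of the question, the callback passed to <code>ForEachLine</code> would simply forward each line to <code>ProcessLine</code>. Every byte is visited once, and a final line without a terminating newline is still reported.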