Correct way to serialize binary data in C++
<p>After having read the following <a href="https://stackoverflow.com/questions/12612488/aliasing-t-with-char-is-allowed-is-it-also-allowed-the-other-way-around">1</a> and <a href="https://stackoverflow.com/questions/13280771/is-this-use-of-stdarray-undefined-behavior?">2</a> Q/As, and having used the technique discussed below for many years on x86 architectures with GCC and MSVC without seeing any problems, I'm now very confused as to what is supposed to be the correct and, just as important, the most efficient way to serialize and then deserialize binary data using C++.</p>
<p>Given the following "wrong" code:</p>
<pre><code>#include &lt;fstream&gt;

int main()
{
   std::ifstream strm("file.bin", std::ios::binary);

   char buffer[sizeof(int)] = {0};

   strm.read(buffer, sizeof(int));

   int i = 0;

   // Experts seem to think doing the following is bad and
   // could crash entirely when run on ARM processors:
   i = *reinterpret_cast&lt;int*&gt;(buffer);

   return 0;
}
</code></pre>
<p>As I understand things, the reinterpret cast tells the compiler that it can treat the memory at buffer as an integer, so it is free to issue integer instructions that require or assume a certain alignment for the data in question. The only overhead would then be the extra reads and shifts performed when the CPU detects that the address on which it is trying to execute an alignment-oriented instruction is not actually aligned.</p>
<p>That said, the answers linked above seem to indicate that, as far as C++ is concerned, this is all undefined behavior.</p>
<p>Assuming that the location in buffer from which the cast will occur is not suitably aligned, is it true that the only solution to this problem is to copy the bytes one by one? Is there perhaps a more efficient technique?</p>
<p>Furthermore, over the years I've seen many situations where a struct made up entirely of PODs (using compiler-specific pragmas to remove padding) is cast to a char* and written to a file or socket, then later read back into a buffer, and the buffer is cast back to a pointer to the original struct. Ignoring potential endianness and float/double format issues between machines, is this kind of code also considered undefined behaviour?</p>
<p>The following is a more complex example:</p>
<pre><code>#include &lt;cstddef&gt;
#include &lt;fstream&gt;

int main()
{
   std::ifstream strm("file.bin", std::ios::binary);

   char storage[1000] = {0};

   const std::size_t size = sizeof(int) + sizeof(short) + sizeof(float) + sizeof(double);

   const std::size_t weird_offset = 3;

   // An array cannot be incremented, so walk the storage with a pointer.
   char* buffer = storage + weird_offset;

   strm.read(buffer, size);

   int    i = 0;
   short  s = 0;
   float  f = 0.0f;
   double d = 0.0;

   // Experts seem to think doing the following is bad and
   // could crash entirely when run on ARM processors:
   i = *reinterpret_cast&lt;int*&gt;(buffer);
   buffer += sizeof(int);

   s = *reinterpret_cast&lt;short*&gt;(buffer);
   buffer += sizeof(short);

   f = *reinterpret_cast&lt;float*&gt;(buffer);
   buffer += sizeof(float);

   d = *reinterpret_cast&lt;double*&gt;(buffer);
   buffer += sizeof(double);

   return 0;
}
</code></pre>
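<p>For comparison, here is a minimal sketch of the commonly suggested alternative to the casts above: copying the bytes into a properly aligned object with std::memcpy. The helper names load_unaligned and store_unaligned are hypothetical and not part of any library. The sketch assumes the types are trivially copyable and that the reader and writer share the same endianness and type sizes. Optimizing compilers typically turn a fixed-size memcpy like this into a single load or store where the target allows it, though that is an implementation detail rather than a guarantee.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical helper (an assumption, not from the original post):
// copies sizeof(T) bytes from src + offset into an aligned T,
// then advances offset past the bytes that were consumed.
template <typename T>
T load_unaligned(const char* src, std::size_t size, std::size_t& offset)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy-based loading requires a trivially copyable type");
    assert(offset + sizeof(T) <= size); // stay inside the buffer

    T value;
    std::memcpy(&value, src + offset, sizeof(T)); // no alignment requirement on src
    offset += sizeof(T);
    return value;
}

// Hypothetical counterpart: writes the object representation of value
// at dst + offset, then advances offset past the bytes that were written.
template <typename T>
void store_unaligned(char* dst, std::size_t size, std::size_t& offset, const T& value)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy-based storing requires a trivially copyable type");
    assert(offset + sizeof(T) <= size); // stay inside the buffer

    std::memcpy(dst + offset, &value, sizeof(T));
    offset += sizeof(T);
}
```

<p>With these helpers, the second example would start with an offset of 3 and call load_unaligned once each for int, short, float and double, instead of dereferencing a cast pointer.</p>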