<h1>Correct, portable way to interpret buffer as a struct</h1>
<p>The context of my problem is network programming. Say I want to send messages over the network between two programs. For simplicity, let's say messages look like the struct below, and byte order is not a concern. I want to find a correct, portable, and efficient way to define these messages as C structures. I know of four approaches to this: explicit casting, casting through a union, copying, and marshaling.</p>
<pre><code>struct message {
    uint16_t logical_id;
    uint16_t command;
};
</code></pre>
<h2>Explicit Casting:</h2>
<pre><code>void send_message(struct message *msg) {
    uint8_t *bytes = (uint8_t *) msg;
    /* call to write/send/sendto here */
}

void receive_message(uint8_t *bytes, size_t len) {
    assert(len &gt;= sizeof(struct message));
    struct message *msg = (struct message *) bytes;

    /* And now use the message */
    if (msg-&gt;command == SELF_DESTRUCT) {
        /* ... */
    }
}
</code></pre>
<p>My understanding is that <code>send_message</code> does not violate aliasing rules, because a byte/char pointer may alias any type. However, the converse is not true, and so <code>receive_message</code> violates aliasing rules and thus has undefined behavior.</p>
<h2>Casting Through a Union:</h2>
<pre><code>union message_u {
    struct message m;
    uint8_t bytes[sizeof(struct message)];
};

void receive_message_union(uint8_t *bytes, size_t len) {
    assert(len &gt;= sizeof(struct message));
    union message_u *msgu = (union message_u *) bytes;

    /* And now use the message */
    if (msgu-&gt;m.command == SELF_DESTRUCT) {
        /* ... */
    }
}
</code></pre>
<p>However, this seems to violate the idea that a union only contains one of its members at any given time. Additionally, this seems like it could lead to alignment issues if the source buffer isn't aligned on a word/half-word boundary.</p>
<h2>Copying:</h2>
<pre><code>void receive_message_copy(uint8_t *bytes, size_t len) {
    assert(len &gt;= sizeof(struct message));
    struct message msg;
    memcpy(&amp;msg, bytes, sizeof msg);

    /* And now use the message */
    if (msg.command == SELF_DESTRUCT) {
        /* ... */
    }
}
</code></pre>
<p>This seems guaranteed to produce the correct result, but of course I would greatly prefer not to have to copy the data.</p>
<h2>Marshaling</h2>
<pre><code>void send_message(struct message *msg) {
    uint8_t bytes[4];
    bytes[0] = msg-&gt;logical_id &gt;&gt; 8;
    bytes[1] = msg-&gt;logical_id &amp; 0xff;
    bytes[2] = msg-&gt;command &gt;&gt; 8;
    bytes[3] = msg-&gt;command &amp; 0xff;
    /* call to write/send/sendto here */
}

void receive_message_marshal(uint8_t *bytes, size_t len) {
    /* No longer relying on the size of the struct being meaningful */
    assert(len &gt;= 4);
    struct message msg;
    msg.logical_id = (bytes[0] &lt;&lt; 8) | bytes[1]; /* Big-endian */
    msg.command = (bytes[2] &lt;&lt; 8) | bytes[3];

    /* And now use the message */
    if (msg.command == SELF_DESTRUCT) {
        /* ... */
    }
}
</code></pre>
<p>This still has to copy, but it is now decoupled from the representation of the struct. However, we now need to be explicit about the position and size of each member, and endianness is a much more obvious issue.</p>
<h1>Related info:</h1>
<p><a href="https://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule">What is the strict aliasing rule?</a></p>
<p><a href="https://stackoverflow.com/questions/17007146/aliasing-array-with-pointer-to-struct-without-violating-the-standard">Aliasing array with pointer-to-struct without violating the standard</a></p>
<p><a href="https://stackoverflow.com/questions/262379/when-is-char-safe-for-strict-pointer-aliasing?rq=1">When is char* safe for strict pointer aliasing?</a></p>
<p><a href="http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html" rel="noreferrer">http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html</a></p>
<h1>Real World Example</h1>
<p>I've been looking for examples of networking code to see how this situation is handled elsewhere. The <a href="http://savannah.nongnu.org/projects/lwip/" rel="noreferrer">lightweight IP (lwIP)</a> stack has a few similar cases. In the <a href="http://git.savannah.gnu.org/cgit/lwip.git/tree/src/core/udp.c" rel="noreferrer">udp.c</a> file lies the following code:</p>
<pre><code>/**
 * Process an incoming UDP datagram.
 *
 * Given an incoming UDP datagram (as a chain of pbufs) this function
 * finds a corresponding UDP PCB and hands over the pbuf to the pcbs
 * recv function. If no pcb is found or the datagram is incorrect, the
 * pbuf is freed.
 *
 * @param p pbuf to be demultiplexed to a UDP PCB (p-&gt;payload pointing to the UDP header)
 * @param inp network interface on which the datagram was received.
 *
 */
void
udp_input(struct pbuf *p, struct netif *inp)
{
  struct udp_hdr *udphdr;
  /* ... */
  udphdr = (struct udp_hdr *)p-&gt;payload;
  /* ... */
}
</code></pre>
<p>where <code>struct udp_hdr</code> is a packed representation of a UDP header and <code>p-&gt;payload</code> is of type <code>void *</code>. Going on my understanding and <a href="https://stackoverflow.com/a/15745083/1993996">this</a> answer, this is <strong>definitely</strong> [edit- not] breaking strict-aliasing and thus has undefined behavior.</p>
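<p>For reference, the copying and marshaling approaches described above can be put together as one self-contained C file. This is only a sketch: the names <code>message_encode</code>, <code>message_decode</code>, <code>message_copy</code>, and <code>MESSAGE_WIRE_SIZE</code> are hypothetical, and the 4-byte big-endian wire layout is an assumption carried over from the marshaling snippet. Errors are reported through a return value instead of <code>assert</code> so that a short packet does not abort the program.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same fields as the struct in the question. */
struct message {
    uint16_t logical_id;
    uint16_t command;
};

/* Assumed wire format: two 16-bit big-endian fields, no padding. */
enum { MESSAGE_WIRE_SIZE = 4 };

/* Marshal field by field; independent of host byte order and struct padding. */
void message_encode(const struct message *msg, uint8_t out[MESSAGE_WIRE_SIZE])
{
    assert(msg != NULL && out != NULL);
    out[0] = (uint8_t)(msg->logical_id >> 8);
    out[1] = (uint8_t)(msg->logical_id & 0xff);
    out[2] = (uint8_t)(msg->command >> 8);
    out[3] = (uint8_t)(msg->command & 0xff);
}

/* Unmarshal from the assumed wire format.
 * Returns 0 on success, -1 if the buffer is too short. */
int message_decode(const uint8_t *bytes, size_t len, struct message *msg)
{
    assert(bytes != NULL && msg != NULL);
    if (len < MESSAGE_WIRE_SIZE)
        return -1;
    msg->logical_id = (uint16_t)((bytes[0] << 8) | bytes[1]);
    msg->command    = (uint16_t)((bytes[2] << 8) | bytes[3]);
    return 0;
}

/* Copying approach: the bytes must already be in the host's struct layout.
 * memcpy has no alignment or aliasing requirements on the source buffer.
 * Returns 0 on success, -1 if the buffer is too short. */
int message_copy(const uint8_t *bytes, size_t len, struct message *msg)
{
    assert(bytes != NULL && msg != NULL);
    if (len < sizeof *msg)
        return -1;
    memcpy(msg, bytes, sizeof *msg);
    return 0;
}
```

<p>On the cost of the copy: optimizing compilers commonly lower a small fixed-size <code>memcpy</code> like the one in <code>message_copy</code> to plain loads and stores, so it is worth measuring before assuming the copy is expensive.</p>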