Mapping non-contiguous blocks from a file into contiguous memory addresses
<p>I am interested in the prospect of using memory-mapped IO, preferably exploiting the facilities in boost::interprocess for cross-platform support, to map non-contiguous system-page-size blocks in a file into a contiguous address space in memory.</p>
<p>A simplified concrete scenario:</p>
<p>I have a number of 'plain-old-data' structures, each of a fixed length (less than the system page size). These structures are concatenated into a (very long) stream, with the type &amp; location of each structure determined by the values of the structures that precede it in the stream. I'm aiming to minimize latency and maximize throughput in a demanding concurrent environment.</p>
<p>I can read this data very effectively by memory-mapping it in blocks of at least twice the system page size... and establishing a new mapping immediately after reading a structure that extends beyond the penultimate system-page boundary. This allows the code that interacts with the plain-old-data structures to be blissfully unaware that these structures are memory mapped... and, for example, it could compare two different structures using memcmp() directly without having to care about page boundaries.</p>
<p>Where things get interesting is with respect to updating these data streams... while they're being (concurrently) read. The strategy I'd like to use is inspired by 'Copy On Write' at system-page-size granularity... essentially writing 'overlay pages', allowing one process to read the old data while another reads the updated data.</p>
<p>While managing which overlay pages to use, and when, isn't necessarily trivial... that's not my main concern. My main concern is that I may have a structure spanning pages 4 and 5, then update a structure wholly contained in page 5... writing the new page in location 6... leaving page 5 to be 'garbage collected' when it is determined to be no longer reachable. This means that, if I map page 4 into location M, I need to map page 6 into memory location M+page_size... in order to be able to reliably process structures that cross page boundaries using existing (non-memory-mapping-aware) functions.</p>
<p>I'm trying to establish the best strategy, and I'm hampered by documentation I feel is incomplete. Essentially, I need to decouple allocation of address space from memory mapping into that address space. With mmap(), I'm aware that I can use MAP_FIXED if I wish to explicitly control the mapping location... but I'm unclear how I should reserve address space in order to do this safely. Can I map /dev/zero for two pages without MAP_FIXED, then use MAP_FIXED twice to map two pages into that allocated space at explicit VM addresses? If so, should I call munmap() three times too? Will it leak resources and/or have any other untoward overhead? To make the issue even more complex, I'd like comparable behaviour on Windows... is there any way to do this? Are there neat solutions if I were to compromise my cross-platform ambitions?</p>
<p>--</p>
<p>Thanks for your answer, Mahmoud... I've read, and think I've understood, that code... I've compiled it under Linux and it behaves as you suggest.</p>
<p>My main concerns are with line 62 - using MAP_FIXED. It makes some assumptions about mmap() which I've been unable to confirm from the documentation I can find. You're mapping the 'update' page into the same address space as mmap() returned initially - I assume that this is 'correct', i.e. not something that just happens to work on Linux? I'd also need to assume that it works cross-platform for file mappings as well as anonymous mappings.</p>
<p>The sample definitely moves me forwards... documenting that what I ultimately need is probably achievable with mmap(), on Linux at least. What I'd really like is a pointer to documentation that shows that the MAP_FIXED line will work as the sample demonstrates... and, ideally, a transformation from the Linux/Unix-specific mmap() to a platform-independent (boost::interprocess) approach.</p>
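<p>For illustration, the reserve-then-overlay idea asked about above might be sketched as follows. This is a minimal sketch assuming a POSIX system that provides MAP_ANONYMOUS (used here in place of mapping /dev/zero); the helper names reserve_region(), map_file_page() and demo_noncontiguous() are hypothetical and are not taken from the answer referred to in the question. It uses file pages 0 and 2 of a three-page temporary file to stand in for "page 4" and "page 6".</p>

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS and fileno() on glibc */
#endif
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Reserve len bytes of contiguous address space. PROT_NONE means nothing is
 * accessible yet; an anonymous mapping stands in for mapping /dev/zero. */
static void *reserve_region(size_t len)
{
    void *p = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Map one page of fd (file page index file_page) over slot `slot` of the
 * reservation. MAP_FIXED replaces whatever was mapped at that address, which
 * is only safe because the range belongs to our own reservation. */
static int map_file_page(void *base, size_t slot, int fd, size_t file_page,
                         size_t page)
{
    assert(base != NULL);
    void *want = (char *)base + slot * page;
    void *got = mmap(want, page, PROT_READ, MAP_SHARED | MAP_FIXED, fd,
                     (off_t)(file_page * page));
    return got == want ? 0 : -1;
}

/* Hypothetical demo: build a 3-page temp file filled with 'A', 'B', 'C', then
 * view file pages 0 and 2 back to back. Returns 0 on success, -1 otherwise. */
static int demo_noncontiguous(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    FILE *f = tmpfile();
    if (f == NULL)
        return -1;
    int fd = fileno(f);
    if (ftruncate(fd, (off_t)(3 * page)) != 0) {
        fclose(f);
        return -1;
    }

    /* Fill the file through a temporary writable mapping. */
    char *w = mmap(NULL, 3 * page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (w == MAP_FAILED) {
        fclose(f);
        return -1;
    }
    memset(w, 'A', page);
    memset(w + page, 'B', page);
    memset(w + 2 * page, 'C', page);
    munmap(w, 3 * page);

    int rc = -1;
    char *base = reserve_region(2 * page);
    if (base != NULL) {
        if (map_file_page(base, 0, fd, 0, page) == 0 &&
            map_file_page(base, 1, fd, 2, page) == 0) {
            /* The last byte of file page 0 is directly followed in memory
             * by the first byte of file page 2. */
            rc = (base[page - 1] == 'A' && base[page] == 'C') ? 0 : -1;
        }
        /* One munmap over the whole range releases both fixed mappings. */
        munmap(base, 2 * page);
    }
    fclose(f);
    return rc;
}
```

<p>Calling demo_noncontiguous() from main() and checking for a return value of 0 exercises the sketch. Because munmap() operates on an address range rather than on individual mmap() calls, a single call covering the reserved range is enough here. The sketch covers POSIX mmap() only and says nothing about Windows or boost::interprocess.</p>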
 
