
Using memory mapped files for in-program temporary arrays?
I'm currently writing a program which shall be able to handle out-of-core data, i.e. I'm processing files from 1 MB up to 50 GB in size (and possibly larger in the future).

I have read several tutorials on memory-mapped files and am now using memory-mapped files for managing data I/O, i.e. reading and writing data from/to the hard drive.

Now I also process the data, and for that I need some temporary arrays of the same size as the data itself. My question is whether I should also use memory-mapped files for those, or whether I should let the OS manage the memory without explicitly defining memory-mapped files. The problem is as follows:

I'm working on multiple platforms, but always on 64-bit systems. In theory, the 64-bit virtual address space is definitely sufficient for my needs. However, on Windows the maximum virtual address space seems to be limited by the operating system, i.e. a user can configure whether paging is allowed and how large the virtual memory may grow. Also, I read somewhere that the maximum virtual memory on 64-bit Windows isn't 2^64 bytes but somewhere around 2^40, which would still be sufficient for me but seems a rather odd limitation. Furthermore, Windows has some strange limitations such as arrays with a maximum of 2^31 elements, independent of the element type. I don't know how all of this is handled on Linux, but I think it's treated similarly; probably the maximum allowed virtual memory = RAM + swap partition size? So there are a lot of things to struggle with if I want the system to handle data exceeding the RAM size. I don't even know whether I can use the entire 64-bit virtual address space from C++. In a short test I got a compiler error when trying to create an array of more than 2^31 elements, but I think it is easy to go beyond that by using std::vector and the like (I've pasted that test at the end of this question).

On the other hand, with a memory-mapped file, all my memory writes will eventually end up written to the HDD. Especially for data which is smaller than my physical RAM, this is presumably a fairly big bottleneck. Or does it postpone writing until it has to because RAM is exhausted? The advantages of memory-mapped files come up in inter-process communication via shared memory, or in persistence across runs, i.e. I start the application, write something, quit it, and later restart it and efficiently read into RAM only the data I need. Since I need to process the entire data set within a single run of a single process, neither advantage applies in my case.

Note: a streaming approach as an alternative solution to my problem is not really feasible, as I heavily depend on random access to the data.

What I would ideally like is a way to process all models independent of their size and of OS-imposed limits: keep everything that fits in RAM, and only when the physical limit is exceeded fall back to memory-mapped files or other mechanisms (if there are any) to page out the excess, ideally managed by the operating system (I've sketched what I have in mind at the very end).

To conclude: what is the best approach to handle this temporarily existing data? If it can be done without memory-mapped files and platform-independently, can you give me a code snippet or similar and explain how it works around these OS limitations?
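For reference, here is roughly the short test I mentioned above. I believe the error I hit is MSVC's per-object size limit for plain arrays (error C2148, "total size of array must not exceed 0x7fffffff bytes"), not a limit of the address space itself, but I'm not sure:

    #include <cstddef>
    #include <vector>

    int main()
    {
        // This kind of declaration is what failed for me: MSVC rejects
        // single objects larger than 0x7fffffff bytes at compile time.
        // char big_array[std::size_t(1) << 32];   // error C2148 on MSVC

        // Heap allocation is not subject to that compile-time object-size
        // limit; it only has to fit into the process's virtual address space.
        // Note: the vector value-initializes its elements, so all ~2 GiB of
        // pages are actually touched here.
        std::vector<char> big((std::size_t(1) << 31) + 1);
        big[std::size_t(1) << 31] = 42;   // indexing past 2^31 works fine
        return 0;
    }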
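And this is a sketch of the direction I'm considering for the temporary arrays: a plain anonymous mapping (swap-backed rather than file-backed), so the OS keeps pages in RAM as long as it can and only pages them out under memory pressure. I'm assuming mmap with MAP_ANONYMOUS on POSIX systems and VirtualAlloc on Windows; reserve_anonymous and release_anonymous are just my own helper names:

    #include <cstddef>
    #include <cstdio>

    #if defined(_WIN32)
      #include <windows.h>
    #else
      #include <sys/mman.h>
    #endif

    void* reserve_anonymous(std::size_t bytes)
    {
    #if defined(_WIN32)
        // Commit charge counts against RAM + pagefile (so the user-configured
        // pagefile limit matters here), but physical pages are only assigned
        // on first touch.
        return VirtualAlloc(nullptr, bytes, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
    #else
        // Anonymous private mappings are swap-backed: pages live in RAM while
        // possible and are paged to swap only under memory pressure.
        void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        return p == MAP_FAILED ? nullptr : p;
    #endif
    }

    void release_anonymous(void* p, std::size_t bytes)
    {
    #if defined(_WIN32)
        (void)bytes;
        VirtualFree(p, 0, MEM_RELEASE);   // dwSize must be 0 with MEM_RELEASE
    #else
        munmap(p, bytes);
    #endif
    }

    int main()
    {
        const std::size_t bytes = std::size_t(8) << 30;   // 8 GiB of virtual space
        void* p = reserve_anonymous(bytes);
        if (!p) { std::puts("reservation failed"); return 1; }

        // Touch one byte per GiB: only the touched pages become resident;
        // the rest of the 8 GiB stays purely virtual.
        auto* a = static_cast<unsigned char*>(p);
        for (std::size_t off = 0; off < bytes; off += std::size_t(1) << 30)
            a[off] = 1;

        release_anonymous(p, bytes);
        return 0;
    }

Is something like this the right way to get the "use RAM while it fits, page out only when it doesn't" behavior, or does it run into the same OS limits I described above?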