Is there a fast in-memory queue I can use that swaps items as it reaches a certain size?
    <p>I've been using C/C++/CUDA for less than a week and am not familiar with all the options available in terms of libraries (sorry if my question is too wacky or impossible). Here's my problem: I have a process that takes data, analyzes it, and then does one of three things: (1) saves the results, (2) discards the results, or (3) breaks the data down and sends it back to be processed.</p>
    <p>Option (3) often creates a lot of data, and I very quickly exceed the memory available to me (my server has 16 GB). I got around that by setting up a queue server (RabbitMQ) that I send work to and receive work from; it swaps the queue to disk once it reaches a certain size in memory. This worked perfectly when I used small servers with faster NICs to transfer the data, but lately I have been learning C/C++ and converting my code from Java to run on a GPU, which has made the queues a big bottleneck. The bottleneck was obviously the network I/O: profiling on cheap systems showed high CPU usage, and similarly on old GPUs, but newer, faster CPUs/GPUs are not being utilized as much, and network I/O is steady at 300-400/mbs. So I decided to eliminate the network entirely and run the queue server locally on the same server. That made it faster, but I suspect it could be faster still if I used a solution that didn't rely on external network services (even ones running locally). It may not work, but I want to experiment.</p>
    <p>So my question is: is there anything I can use like a queue, where I can remove entries as I read them, but which also swaps the queue to disk once it reaches a certain size (while keeping the in-memory queue always full, so I don't have to wait on reads from disk)? When learning about CUDA, I see many examples of researchers running analysis on huge datasets. Any ideas on how they keep data coming in at the fastest rate the system can process? (I imagine they aren't bound by disk or network; otherwise faster GPUs wouldn't give them orders-of-magnitude increases in performance.)</p>
    <p>Does anything like this exist?</p>
    <p>P.S. If it helps, so far I have experimented with RabbitMQ (too slow for my situation), Apollo MQ (good, but still network based), Redis (really liked it, but it cannot exceed physical memory), and mmap(), and I've also compressed my data to get better throughput. I know the general solutions, but I'm wondering if there's something native to C/C++ or CUDA, or a library I can use (<em>ideally, I would have a queue in CUDA global memory that swapped to host memory, which in turn swapped to disk, so the GPUs would always be at full speed</em>, but that may be wishful thinking). If there's anything else you can think of, let me know and I'd enjoy experimenting with it (if it helps, I develop on a Mac and run on Linux).</p>
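    <p>To make the behaviour I'm after concrete, here is a minimal single-threaded sketch in C++ of the host-side part only. It is not an existing library: the class name <code>SpillQueue</code>, its methods, and the length-prefixed record format of the spill file are all hypothetical and just for illustration. It keeps up to a fixed number of items in memory, appends any overflow to a file, and tops the in-memory part back up after every read.</p>

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <deque>
#include <fstream>
#include <string>
#include <utility>

// Hypothetical FIFO work queue (illustration only, not a real library).
// Holds at most `max_in_memory` items in RAM; overflow is appended to a
// spill file as [8-byte length][payload] records.
class SpillQueue {
public:
    SpillQueue(std::size_t max_in_memory, std::string path)
        : cap_(max_in_memory), path_(std::move(path)),
          file_(path_, std::ios::in | std::ios::out |
                       std::ios::binary | std::ios::trunc) {
        assert(cap_ > 0);  // assumption: at least one item stays in memory
    }

    ~SpillQueue() {
        file_.close();
        std::remove(path_.c_str());  // the spill file is only scratch space
    }

    void push(std::string item) {
        // Once anything is on disk, newer items must queue up behind it,
        // otherwise FIFO order would be broken.
        if (on_disk_ == 0 && mem_.size() < cap_) {
            mem_.push_back(std::move(item));
            return;
        }
        std::uint64_t len = item.size();
        file_.seekp(write_pos_);
        file_.write(reinterpret_cast<const char*>(&len), sizeof len);
        file_.write(item.data(), static_cast<std::streamsize>(len));
        write_pos_ = file_.tellp();
        ++on_disk_;
    }

    // Moves the oldest item into `out`; returns false if the queue is empty.
    bool pop(std::string& out) {
        if (mem_.empty()) return false;
        out = std::move(mem_.front());
        mem_.pop_front();
        refill();  // keep the in-memory part as full as possible
        return true;
    }

    std::size_t in_memory() const { return mem_.size(); }
    std::size_t on_disk() const { return on_disk_; }

private:
    // Reads spilled records back, oldest first, until memory is full again.
    void refill() {
        while (on_disk_ > 0 && mem_.size() < cap_) {
            std::uint64_t len = 0;
            file_.seekg(read_pos_);
            file_.read(reinterpret_cast<char*>(&len), sizeof len);
            std::string item(static_cast<std::size_t>(len), '\0');
            file_.read(&item[0], static_cast<std::streamsize>(len));
            read_pos_ = file_.tellg();
            mem_.push_back(std::move(item));
            --on_disk_;
        }
    }

    std::size_t cap_;
    std::string path_;   // declared before file_ so it is initialized first
    std::fstream file_;
    std::deque<std::string> mem_;
    std::size_t on_disk_ = 0;
    std::streampos read_pos_ = 0;
    std::streampos write_pos_ = 0;
};
```

    <p>In this sketch the refill happens synchronously inside <code>pop()</code>, so a read can still wait on the disk, and the spill file only ever grows. What I'm really looking for would presumably refill from a background thread and reclaim disk space, and would sit underneath the GPU transfers rather than replace them.</p>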