GPU-accelerated sort (~1GB) and merge sort (~100GB)
<p>I'm looking for a C++ library that does GPU-accelerated sorting (around 1GB of data) and merge sorting (say, around 100GB of data; the size does not matter, because merging is a streaming algorithm). The license has to be LGPL, BSD, or similar. I greatly prefer OpenCL because of its portability (but I am also interested in links to CUDA libraries). I would appreciate links to papers and blog posts on this subject.</p>
<h3> Some background (please correct me if I am wrong): </h3>
<p>A 2-way merge sort of 1GB (that is, 128,000,000 8-byte elements) will consume approximately log<sub>2</sub>(128,000,000)&middot;1GB = 27GB of memory bandwidth, which takes around 1 second on a modern CPU with a <em>sequential</em> memory bandwidth of ~30GB/s. (Any non-merge sort seems to take much longer, because non-sequential memory access is 10-100 times slower.)</p>
<p>Although I am not familiar with modern GPUs, I suspect that a merge sort of 1GB will take 0.2 seconds or even less, because typical GPU memory bandwidth is around 150GB/s, as on the AMD/ATI 58xx series (see, for example, <a href="http://en.wikipedia.org/wiki/Comparison_of_AMD_graphics_processing_units#Evergreen_.28HD_5xxx.29_series" rel="nofollow noreferrer">http://en.wikipedia.org/wiki/Comparison_of_AMD_graphics_processing_units#Evergreen_.28HD_5xxx.29_series</a>).</p>
<p>That is at least a 5x speedup. (The time to transfer 1GB over 16x PCI-E 2.0 is around 0.125s, but it seems possible to run the PCI-E transfers in parallel with the sorting; however, this may require 2GB or 3GB of video-card memory instead of 1GB.)</p>
<p>I suspect an even greater speedup is possible with a more-than-2-way merge sort or with some other sorting algorithm better suited to GPUs.</p>
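<p>To make the arithmetic above easy to check or to redo with other numbers, here is a minimal sketch of the same back-of-the-envelope estimate. The function names <code>merge_sort_seconds</code> and <code>transfer_seconds</code> are hypothetical helpers written for this illustration, not part of any library. The sketch assumes the sort is purely bandwidth-bound and, as in the estimate above, charges one sweep over the data per merge pass; a real pass both reads and writes every element, so the actual traffic is likely closer to double.</p>

```cpp
#include <cmath>

// Hypothetical helper: estimated seconds for a bandwidth-bound k-way merge sort.
// Assumption (same as the estimate in the question): each merge pass is charged
// one sweep over the whole data set, and there are ceil(log_k(n)) passes.
double merge_sort_seconds(double elements, double bytes_per_element,
                          double bandwidth_bytes_per_s, double fan_out = 2.0) {
    const double passes = std::ceil(std::log(elements) / std::log(fan_out));
    const double traffic_bytes = passes * elements * bytes_per_element;
    return traffic_bytes / bandwidth_bytes_per_s;
}

// Hypothetical helper: seconds to move `bytes` over a link with the given bandwidth.
double transfer_seconds(double bytes, double bandwidth_bytes_per_s) {
    return bytes / bandwidth_bytes_per_s;
}
```

<p>With 128,000,000 elements of 8 bytes each, <code>merge_sort_seconds(128e6, 8.0, 30e9)</code> gives about 0.92 s for the CPU case and <code>merge_sort_seconds(128e6, 8.0, 150e9)</code> gives about 0.18 s for the GPU case, which matches the figures above. Passing a fan-out of 16 as the fourth argument cuts the pass count from 27 to 7, which is the reason a more-than-2-way merge looks attractive. <code>transfer_seconds(128e6 * 8.0, 8e9)</code> gives about 0.13 s for the PCI-E 2.0 x16 copy, assuming 8GB/s.</p>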