Java Large Files Disk IO Performance
<p>I have two files (2 GB each) on my hard disk and want to compare them with each other:</p>
<ul>
<li>Copying the original files with Windows Explorer takes approx. 2-4 minutes (that is reading and writing, on the same physical and logical disk).</li>
<li>Reading with <code>java.io.FileInputStream</code> twice and comparing the byte arrays byte by byte takes 20+ minutes.</li>
<li>The <code>java.io.BufferedInputStream</code> buffer is 64 KB; the files are read in chunks and then compared.</li>
<li><p>Comparison is done in a tight loop like</p>
<pre><code>// numRead[i] holds the number of bytes read into buffer[i]
int len = Math.min(numRead[0], numRead[1]);
for (int k = 0; k &lt; len; k++) {
    if (buffer[1][k] != buffer[0][k]) {
        // first differing byte decides the result
        return buffer[0][k] - buffer[1][k];
    }
}
</code></pre></li>
</ul>
<p>What can I do to speed this up? Is NIO supposed to be faster than plain streams? Is Java unable to use DMA/SATA technologies, falling back on slow OS API calls instead?</p>
<p><strong>EDIT:</strong><br> Thanks for the answers. I did some experiments based on them. As Andreas showed</p>
<blockquote> <p>streams or <code>nio</code> approaches do not differ much.<br> More important is the correct buffer size.</p> </blockquote>
<p>This is confirmed by my own experiments. As the files are read in big chunks, even additional buffers (<code>BufferedInputStream</code>) do not help. Optimising the comparison is possible, and I got the best results with 32-fold unrolling, but the time spent in comparison is small compared to the disk reads, so the speedup is small. Looks like there is nothing I can do ;-(</p>
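<p>The chunked comparison described above might be sketched as follows. This is only an illustration, assuming Java 9 or later (for <code>InputStream.readNBytes</code> and <code>Arrays.mismatch</code>). The class name <code>FileCompare</code>, the method <code>firstMismatch</code> and the 8 MB <code>CHUNK</code> size are hypothetical choices, not taken from the post. Unlike the loop in the question, it reports the offset of the first differing byte rather than the byte difference.</p>

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

final class FileCompare {
    // Assumed chunk size; the post only says that large chunks matter.
    static final int CHUNK = 8 * 1024 * 1024;

    /** Returns the offset of the first differing byte, or -1 if the files are identical. */
    static long firstMismatch(Path a, Path b) throws IOException {
        try (InputStream inA = Files.newInputStream(a);
             InputStream inB = Files.newInputStream(b)) {
            byte[] bufA = new byte[CHUNK];
            byte[] bufB = new byte[CHUNK];
            long offset = 0;
            while (true) {
                // readNBytes keeps reading until the buffer is full or EOF is hit,
                // so both buffers stay aligned on the same file offset.
                int nA = inA.readNBytes(bufA, 0, CHUNK);
                int nB = inB.readNBytes(bufB, 0, CHUNK);
                int n = Math.min(nA, nB);
                // Library comparison of the common prefix of both chunks.
                int i = Arrays.mismatch(bufA, 0, n, bufB, 0, n);
                if (i >= 0) {
                    return offset + i;
                }
                if (nA != nB) {
                    return offset + n; // one file is a strict prefix of the other
                }
                if (nA < CHUNK) {
                    return -1; // both files ended at the same point
                }
                offset += n;
            }
        }
    }
}
```

<p>Calling <code>FileCompare.firstMismatch(pathA, pathB)</code> on two equal files returns <code>-1</code>. On Java 12 or later, <code>java.nio.file.Files.mismatch(Path, Path)</code> offers the same kind of result out of the box. Either way the disk reads are likely to dominate, as the edit to the question observes.</p>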
 
