How should I handle decompression of a large (70MB uncompressed) byte stream without overflowing heap?
<p>I'm working on implementing GZIP compression for interactions between some of our systems. The systems are written in both Java and C#, so GZIP streams were used on both sides since they have standard library support.</p>

<p>On the C# side, everything works up to and including our biggest test files (70MB uncompressed), however we run into issues with Java running out of heap space. We've tried increasing the heap size to capacity for the IDE, but the issue is still not resolved.</p>

<p>I've taken some steps to try and optimize the Java code, but nothing seems to keep the data from piling up in the heap. Is there a good way to handle this? Below is a subset of my current (working on smaller streams) solution.</p>

<p><em>EDIT: Following code modified with recommendations from @MarkoTopolnik. With changes, 17 million characters are read before crash.</em></p>

<pre><code>public static String decompress(byte[] compressed, int size)
{
    GZIPInputStream decompresser;
    BufferedReader reader;
    char buf[] = new char[(size &lt; 2048) ? size : 2048];
    Writer ret = new StringWriter( buf.length );

    decompresser = new GZIPInputStream( new ByteArrayInputStream( compressed ), buf.length );
    reader = new BufferedReader( new InputStreamReader( decompresser, "UTF-8" ) );

    int charsRead;
    while( (charsRead = reader.read( buf, 0, buf.length )) != -1 )
    {
        ret.write( buf, 0, charsRead );
    }
    decompresser.close();
    reader.close();

    return ret.toString();
}
</code></pre>

<p><strike>The code dies after hitting a little over 7.6 million chars in the <code>ArrayList</code> and the stack trace indicates that the <code>ArrayList.add()</code> call is the cause (fails after triggering the internal array to be expanded).</strike></p>

<p>With the edited code above, a call to <code>AbstractStringBuilder.expandCapacity()</code> is what kills the program.</p>

<p>Is there a less memory-expensive way to implement a dynamic array or some completely different approach I can use to get a String from the decompressed stream? Any suggestions would be greatly appreciated!</p>
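<p>One possible direction, offered as a hedged sketch rather than a confirmed fix: avoid building the whole result as a single <code>String</code> and copy the decoded text to its destination in fixed-size chunks, so heap use stays near the buffer size regardless of input length. The class name <code>StreamingGunzip</code>, the method names <code>decompressTo</code> and <code>gzip</code>, and the 8192-char buffer are all hypothetical choices for illustration. The sketch assumes the consumer can accept a <code>Writer</code> (a file, socket, or parser) instead of a <code>String</code>.</p>

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class StreamingGunzip {

    /**
     * Decompresses GZIP data from {@code in}, decodes it as UTF-8, and copies
     * it to {@code out} one buffer at a time. Returns the number of chars copied.
     * Closes {@code in} (via the reader) but leaves {@code out} open.
     */
    public static long decompressTo(InputStream in, Writer out) throws IOException {
        char[] buf = new char[8192]; // hypothetical buffer size, not tuned
        long total = 0;
        try (Reader reader = new InputStreamReader(
                new GZIPInputStream(in), StandardCharsets.UTF_8)) {
            int n;
            while ((n = reader.read(buf, 0, buf.length)) != -1) {
                out.write(buf, 0, n); // only one chunk is held at any time
                total += n;
            }
        }
        out.flush();
        return total;
    }

    /** Demo helper (hypothetical): GZIP-compresses a string as UTF-8 bytes. */
    public static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] compressed = gzip("hello, gzip");
        // System.out stands in for a file or socket writer here.
        Writer sink = new OutputStreamWriter(System.out, StandardCharsets.UTF_8);
        long chars = decompressTo(new ByteArrayInputStream(compressed), sink);
        System.out.println();
        System.out.println("chars copied: " + chars);
    }
}
```

<p>If a single <code>String</code> really is required, the heap has to hold at least the full character data, and a growing <code>StringWriter</code> briefly holds both the old and the new backing array on each expansion. In that case, pre-sizing the writer from a known uncompressed length may reduce the peak, though it does not remove the underlying cost.</p>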