<p>Why bother calculating how many items you can hold? How about letting Java tell you when you've used up all your memory, catching the error, and continuing? For example:</p>
<pre><code>import java.io.*;
import java.util.*;

// Prepare the output medium now so we don't need to worry about having
// enough memory once the TreeSet has been filled.
BufferedWriter writer = new BufferedWriter(new FileWriter("output"));
// parseTuple must return Comparable objects so the TreeSet can order them.
Set&lt;Object&gt; set = new TreeSet&lt;Object&gt;();
int linesRead = 0;
{
    BufferedReader reader = new BufferedReader(new FileReader("input"));
    try {
        String line = reader.readLine();
        while (line != null) {
            set.add(parseTuple(line));
            linesRead += 1;
            line = reader.readLine();
        }
        // end of file reached
        linesRead = -1;
    } catch (OutOfMemoryError e) {
        // while loop broken
    } finally {
        reader.close();
    }
    // since reader and line were declared in a block, their resources
    // will now be released
}

// output the TreeSet to file, one tuple per line
for (Object o : set) {
    writer.write(o.toString());
    writer.newLine();
}
writer.close();

// use linesRead to find the position in the file for the next pass,
// or continue on to the next file, depending on the value of linesRead
</code></pre>
<p>If you still have trouble with memory, just make the reader's buffer extra large so as to reserve more memory.</p>
<p>The default buffer size of a BufferedReader is 8192 characters, which is 16 KB at two bytes per character. So when you finish reading you will release upwards of 16 KB of memory. After this your additional memory needs will be minimal. You need enough memory to create an iterator for the set; let's be generous and assume 200 bytes. You will also need memory to store the string output of your tuples (but only temporarily). You say the tuples contain about 200 characters. Let's double that to account for separators: 400 characters, which is 800 bytes. So all you really need is about 1 KB of additional memory, and you're fine because you've just released 16 KB.</p>
<p>You don't need to worry about the memory used to store the string output of your tuples because those strings are short-lived and only referred to within the output for loop. Note that the Writer copies the contents into its buffer and then discards the string. Thus, the next time the garbage collector runs, that memory can be reclaimed.</p>
<p>I've checked: an OOME in <code>add</code> will not leave a TreeSet in an inconsistent state, because the memory allocation for a new <code>Entry</code> (the internal implementation for storing a key/value pair) happens before the internal representation is modified.</p>
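<p>As a rough sketch of the "next pass" and "extra large buffer" ideas above: the helper below skips the lines consumed by earlier passes, then fills the set until end of input or an OutOfMemoryError. The names <code>ResumePass</code>, <code>readPass</code> and <code>RESERVE_CHARS</code> are hypothetical, the reserve size is an arbitrary assumption, and raw lines stand in for the result of <code>parseTuple</code>.</p>

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.Set;

class ResumePass {
    // Assumed reserve: a reader buffer far larger than the default, so that
    // closing the reader frees a known amount of memory for the output phase.
    static final int RESERVE_CHARS = 1 << 20;

    /**
     * Skips the first alreadyRead lines, then adds each remaining line to set
     * until end of input or an OutOfMemoryError.
     * Returns -1 if end of input was reached, otherwise the number of lines
     * consumed so far (the value to pass as alreadyRead on the next pass).
     */
    static int readPass(Reader source, int alreadyRead, Set<String> set)
            throws IOException {
        BufferedReader reader = new BufferedReader(source, RESERVE_CHARS);
        int linesRead = 0;
        try {
            // Skip lines that earlier passes already wrote out.
            while (linesRead < alreadyRead && reader.readLine() != null) {
                linesRead++;
            }
            String line;
            while ((line = reader.readLine()) != null) {
                set.add(line);      // stand-in for set.add(parseTuple(line))
                linesRead++;        // counted only after a successful add
            }
            return -1;              // end of input reached
        } catch (OutOfMemoryError e) {
            return linesRead;       // resume from here on the next pass
        } finally {
            reader.close();         // releases the large buffer
        }
    }
}
```

<p>A caller would write out and clear the set after each pass, then call <code>readPass</code> again with the returned count until it returns -1. Because a line is counted only after it has been added, a line that triggers the error is read again on the next pass rather than lost.</p>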
 
