<p>I seem to recall that strings in Java were stored as the actual characters along with a start and length.</p>
<p>This means that a substring can share the same characters as its parent string (since they're immutable) and only has to maintain a separate start and length.</p>
<p>So I'm not entirely certain what your memory issues are with the Java strings.</p>
<hr>
<p>Regarding that article posted in your edit, it seems a bit of a non-issue to me.</p>
<p>Unless you're in the habit of making huge strings, then taking a small substring of them and leaving those lying around, this will have near-zero impact on memory.</p>
<p>Even if you had a 10M string and you made 400 substrings, you're only using that 10M for the underlying char array - it's not making 400 copies of the character data. The only memory impact is the start/length bit of each substring object.</p>
<p>The author seems to be complaining that they read a huge string into memory then only wanted a bit of it, but the entire thing was kept - my suggestion would be that they might want to rethink how they process their data :-)</p>
<p>To call this a Java bug is a huge stretch as well. A bug is something that doesn't work to specification. This was a <em>deliberate</em> design decision to improve performance. Running out of memory because you don't understand how things work is not a bug, IMNSHO. And it's definitely <em>not</em> a memory leak.</p>
<hr>
<p>There was one <em>possible</em> good suggestion in the comments to that article, that the GC could more aggressively recover bits of unused strings by compressing them.</p>
<p>This is <em>not</em> something you'd want to do on a first pass GC since it would be relatively expensive. However, as a last resort, once every other GC operation has failed to reclaim enough space, you could do it.</p>
<p>Unfortunately it would almost certainly mean that the underlying <code>char</code> array would need to keep a record of all the string objects that referenced it, so it could both figure out what bits were unused <em>and</em> modify all the string object start and length fields.</p>
<p>This in itself may introduce unacceptable performance impacts and, on top of that, if memory is so short that this is a problem, you may not even be able to allocate enough space for a smaller version of the string.</p>
<p>I think, if the memory's running out, I'd probably prefer <em>not</em> to be maintaining this char-array-to-string mapping just to make this level of GC possible; I would rather that memory be used for my strings.</p>
<hr>
<p>Since there is a perfectly acceptable workaround, and good coders should know about the foibles of their language of choice, I suspect the author is right - it <em>won't</em> be fixed.</p>
<p>Not because the Java developers are too lazy, but because it's not a problem.</p>
<p>You're free to implement your <em>own</em> string methods which match the C# ones (which don't share the underlying data except in certain limited scenarios). This will fix your memory problems but at the cost of a performance hit, since you have to copy the data every time you call substring. As with most things in IT (and life), it's a trade-off.</p>
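<p>To make the trade-off concrete, here is a minimal sketch of the shared-array layout described above. It deliberately uses a hypothetical <code>SharedText</code> class rather than poking at <code>java.lang.String</code> internals, which differ between JDK versions. The names <code>SharedText</code>, <code>compact</code>, <code>retainedChars</code> and <code>sharesArrayWith</code> are all made up for illustration, and the 10M/400 figures are just the ones from the example above.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical immutable text type whose substrings share one backing array. */
final class SharedText {
    private final char[] chars; // backing array, possibly shared with other instances
    private final int start;    // index of the first char belonging to this instance
    private final int length;   // number of chars belonging to this instance

    SharedText(String s) {
        this(s.toCharArray(), 0, s.length());
    }

    private SharedText(char[] chars, int start, int length) {
        this.chars = chars;
        this.start = start;
        this.length = length;
    }

    /** O(1): reuses the backing array, so the whole array stays reachable. */
    SharedText substring(int begin, int end) {
        if (begin < 0 || end > length || begin > end) {
            throw new IndexOutOfBoundsException("begin=" + begin + ", end=" + end);
        }
        return new SharedText(chars, start + begin, end - begin);
    }

    /** O(n) workaround: copies only the visible chars so the big array can be collected. */
    SharedText compact() {
        return new SharedText(Arrays.copyOfRange(chars, start, start + length), 0, length);
    }

    int length() { return length; }

    /** Size of the backing array this instance keeps alive. */
    int retainedChars() { return chars.length; }

    boolean sharesArrayWith(SharedText other) { return chars == other.chars; }

    @Override
    public String toString() { return new String(chars, start, length); }
}

class SubstringSharingDemo {
    public static void main(String[] args) {
        // One large text, as in the "10M string" example.
        char[] data = new char[10_000_000];
        Arrays.fill(data, 'x');
        SharedText big = new SharedText(new String(data));

        // 400 small substrings: each adds only a start/length pair, not a copy.
        List<SharedText> slices = new ArrayList<>();
        for (int i = 0; i < 400; i++) {
            slices.add(big.substring(i * 10, i * 10 + 10));
        }
        boolean allShared = slices.stream().allMatch(s -> s.sharesArrayWith(big));
        System.out.println("all slices share the big array: " + allShared);
        System.out.println("chars retained by one slice: " + slices.get(0).retainedChars());

        // The workaround: copy the slice so it no longer pins the big array.
        SharedText small = slices.get(0).compact();
        System.out.println("chars retained after compact(): " + small.retainedChars());
    }
}
```

<p>The cheap <code>substring</code> is the design the answer defends; <code>compact</code> is the copy-on-demand workaround, paid for only when a small slice is meant to outlive its large parent.</p>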
 
