
Why does the Sun JVM continue to consume ever more RSS memory even when the heap, etc. sizes are stable?
<p>Over the past year I've made huge improvements in my application's Java heap usage--a solid 66% reduction. In pursuit of that, I've been monitoring various metrics, such as Java heap size, CPU, Java non-heap, etc. via SNMP.</p>

<p>Recently, I've been monitoring how much real memory (RSS, resident set) is used by the JVM, and am somewhat surprised. The real memory consumed by the JVM seems totally independent of my application's heap size, non-heap, eden space, thread count, etc.</p>

<p><strong>Heap Size as measured by Java SNMP</strong> <a href="http://lanai.dietpizza.ch/images/jvm-heap-used.png" rel="nofollow noreferrer">Java Heap Used Graph http://lanai.dietpizza.ch/images/jvm-heap-used.png</a></p>

<p><strong>Real Memory in KB (e.g. 1 million KB = 1 GB)</strong> <a href="http://lanai.dietpizza.ch/images/jvm-rss.png" rel="nofollow noreferrer">Java RSS Graph http://lanai.dietpizza.ch/images/jvm-rss.png</a></p>

<p><em>(The three dips in the heap graph correspond to application updates/restarts.)</em></p>

<p>This is a problem for me because all that extra memory the JVM is consuming is 'stealing' memory that could be used by the OS for file caching. In fact, once the RSS value reaches ~2.5-3 GB, I start to see slower response times and higher CPU utilization from my application, mostly due to IO wait. At some point, paging to the swap partition kicks in. This is all very undesirable.</p>

<p><strong>So, my questions:</strong></p>
<ul>
<li><strong>Why is this happening? What is going on <em>"under the hood"</em>?</strong></li>
<li><strong>What can I do to keep the JVM's real memory consumption in check?</strong></li>
</ul>

<p>The gory details:</p>
<ul>
<li>RHEL4 64-bit (Linux - 2.6.9-78.0.5.ELsmp #1 SMP Wed Sep 24 ... 2008 x86_64 ... GNU/Linux)</li>
<li>Java 6 (build 1.6.0_07-b06)</li>
<li>Tomcat 6</li>
<li>Application (on-demand HTTP video streaming)
<ul>
<li>High I/O via java.nio FileChannels</li>
<li>Hundreds to low thousands of threads</li>
<li>Low database use</li>
<li>Spring, Hibernate</li>
</ul></li>
</ul>

<p>Relevant JVM parameters:</p>
<pre><code>-Xms128m -Xmx640m
-XX:+UseConcMarkSweepGC
-XX:+AlwaysActAsServerClassMachine
-XX:+CMSIncrementalMode
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintGCApplicationStoppedTime
-XX:+CMSLoopWarn
-XX:+HeapDumpOnOutOfMemoryError
</code></pre>

<p>How I measure RSS:</p>
<pre><code>ps x -o command,rss | grep java | grep latest | cut -b 17-
</code></pre>

<p>This goes into a text file and is read into an RRD database by the monitoring system at regular intervals. Note that ps outputs kilobytes.</p>

<hr>

<h1>The Problem &amp; Solution<em>s</em>:</h1>

<p>While in the end it was <strong><a href="https://stackoverflow.com/users/97799/atorras">ATorras</a></strong>'s answer that proved ultimately correct, it was <strong><a href="https://stackoverflow.com/users/42126/kdgregory">kdgregory</a></strong> who guided me down the correct diagnostic path with the use of <code>pmap</code>. (Go vote up both their answers!) Here is what was happening:</p>

<p><strong><em>Things I know for sure:</em></strong></p>
<ol>
<li>My application records and displays data with <a href="http://oldwww.jrobin.org/" rel="nofollow noreferrer">JRobin 1.4</a>, something I coded into my app over three years ago.</li>
<li>The busiest instance of the application currently creates
<ol>
<li>over 1000 new JRobin database files (at about 1.3 MB each) within an hour of starting up, and</li>
<li>~100+ more each day after start-up.</li>
</ol></li>
<li>The app updates these JRobin database objects once every 15 s, if there is something to write.</li>
<li>In the default configuration, JRobin
<ol>
<li>uses a <code>java.nio</code>-based file access back-end. This back-end maps the files themselves into <code>MappedByteBuffer</code>s (MBBs).</li>
<li>once every five minutes has a daemon thread call <code>MappedByteBuffer.force()</code> on every underlying JRobin database MBB.</li>
</ol></li>
<li><code>pmap</code> listed
<ol>
<li>6500 mappings,</li>
<li>5500 of which were 1.3 MB JRobin database files, which works out to ~7.1 GB.</li>
</ol></li>
</ol>

<p>That last point was my <em>"Eureka!"</em> moment.</p>

<p><strong><em>My corrective actions:</em></strong></p>
<ol>
<li>Consider updating to the latest JRobinLite (1.5.2), which is apparently better.</li>
<li>Implement proper resource handling on JRobin databases. At the moment, once my application creates a database it never releases it, even after the database is no longer actively used.</li>
<li>Experiment with moving the <code>MappedByteBuffer.force()</code> calls to database update events rather than a periodic timer. Will the problem magically go away?</li>
<li><strong>Immediately</strong>, change the JRobin back-end to the java.io implementation--a one-line change. This will be slower, but it is possibly not an issue. Here is a graph showing the immediate impact of this change.</li>
</ol>

<p><a href="http://lanai.dietpizza.ch/images/stackoverflow-rss-problem-fixed.png" rel="nofollow noreferrer">Java RSS memory used graph http://lanai.dietpizza.ch/images/stackoverflow-rss-problem-fixed.png</a></p>

<p><strong><em>Questions that I may or may not have time to figure out:</em></strong></p>
<ul>
<li>What is going on inside the JVM with <code>MappedByteBuffer.force()</code>? If nothing has changed, does it still write the entire file? Part of the file? Does it load it first?</li>
<li>Is a certain amount of each MBB always in RSS at all times? (RSS was roughly half the total allocated MBB sizes. Coincidence? I suspect not.)</li>
<li>If I move the <code>MappedByteBuffer.force()</code> calls to database update events rather than a periodic timer, will the problem magically go away?</li>
<li>Why was the RSS slope so regular? It does not correlate to any of the application load metrics.</li>
</ul>
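<p>For reference, here is a minimal, self-contained sketch (not JRobin's actual code; the file name and sizes are made up) of what an NIO memory-mapped back-end boils down to. The key behaviour: <code>map()</code> only reserves virtual address space, pages become resident as they are touched, and <code>force()</code> flushes dirty pages without ever unmapping or releasing them.</p>

<pre><code>import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MappedFileSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical file; a busy instance would have thousands of these.
        RandomAccessFile raf = new RandomAccessFile("example.rrd", "rw");
        FileChannel channel = raf.getChannel();
        try {
            long size = 1300L * 1024;   // roughly one 1.3 MB JRobin database file
            raf.setLength(size);

            // map() only reserves virtual address space; physical pages are faulted
            // in lazily as the buffer is read or written, and those resident pages
            // are what show up in the process's RSS (and in pmap output).
            MappedByteBuffer mbb = channel.map(FileChannel.MapMode.READ_WRITE, 0, size);

            // An "update": touching a page makes it resident (and dirty).
            mbb.putLong(0, System.currentTimeMillis());

            // Roughly what a periodic flush does: force() writes the dirty pages
            // back to disk, but it does not unmap the file or release the resident
            // pages, so RSS does not shrink afterwards.
            mbb.force();
        } finally {
            // Closing the channel does not unmap the buffer either; the mapping
            // stays valid until the MappedByteBuffer itself is garbage collected.
            channel.close();
            raf.close();
        }
    }
}
</code></pre>

<p>Multiply that by the ~5500 mapped database files that <code>pmap</code> reported and the resident portions of those mappings account for the RSS growth seen above, independent of heap size.</p>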
Comments:
1. Thanks for the additional info -- it does clarify that you're in an unusual situation. I think *ATorras* is on the right track, so I won't be making more edits to my answer (which may be useful to people in less-unusual situations). Unfortunately, unless you're not closing your channels, I suspect that the only solution is to scale horizontally. Or, I suppose, add more physical memory, but that will eventually run out as well.
2. Initially I thought ATorras was on the correct track too, but then it hit me that I would expect a correlation between server activity and the trajectory of the RSS size. There is none. In fact, it is amazingly steady. *"Things that make you go hmmmm...."*
3. OK, one more suggestion: take a daily (or twice-daily) *pmap* of the process, from restart forward, and look for differences. These files will be huge, and most of the output will be "anon" blocks representing memory-mapped segments. I'd expect those "anon" blocks to be grouped by size: 1/2 meg for thread stacks, and some other value for file channels. The diffs will at least give you an idea what's consuming your virtual map, and that should lead to what's staying resident.
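A rough sketch of automating that snapshot idea, in Java for consistency with the rest of this post: it sums the resident size of each mapping from /proc/[pid]/smaps so successive runs can be saved and diffed. The class name and the ".rrd" filter are illustrative assumptions, and it requires a kernel new enough to expose per-mapping RSS in smaps.

<pre><code>import java.io.BufferedReader;
import java.io.FileReader;

// Sum how much of each mapping is actually resident by parsing
// /proc/[pid]/smaps (Linux-specific). Save the output periodically
// and diff the snapshots, as suggested above.
public class SmapsSummary {
    public static void main(String[] args) throws Exception {
        String pid = args.length > 0 ? args[0] : "self";
        BufferedReader in = new BufferedReader(new FileReader("/proc/" + pid + "/smaps"));
        String mapping = "";
        long totalRssKb = 0;
        String line;
        while ((line = in.readLine()) != null) {
            if (line.matches("^[0-9a-f]+-[0-9a-f]+ .*")) {
                mapping = line;                       // header line of a mapping
            } else if (line.startsWith("Rss:")) {
                long kb = Long.parseLong(line.replaceAll("[^0-9]", ""));
                totalRssKb += kb;
                // Print only the mappings of interest here: resident, file-backed,
                // and (by assumption) a JRobin database file.
                if (kb > 0 &amp;&amp; mapping.endsWith(".rrd")) {
                    System.out.println(kb + " kB resident  " + mapping);
                }
            }
        }
        in.close();
        System.out.println("Total RSS from mappings: " + totalRssKb + " kB");
    }
}
</code></pre>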