Troubleshooting unbounded Java Resident Set Size (RSS) growth
<p>I have a standalone Java application that runs with:</p>
<pre><code>-Xmx1024m -Xms1024m -XX:MaxPermSize=256m -XX:PermSize=256m </code></pre>
<p>Over time it hogs more and more memory, starts to swap (and slows down), and has eventually died several times (no OOM error or dump; it just died, with nothing in /var/log/messages).</p>
<p>What I've tried so far:</p>
<ol>
<li>Heap dumps: live objects take 200-300 MB of the 1 GB heap, so the heap looks fine.</li>
<li>The number of live threads is fairly constant (~60-70), so thread stacks look fine.</li>
<li>JMX stops answering at some point (maybe it still answers, but more slowly than the client timeout).</li>
<li>Turning off swap: it dies faster.</li>
<li>strace: everything seems to slow down a bit, the app still hasn't died, and I'm not sure what to look for in the output.</li>
<li>Checking top: VIRT grows to 5.5 GB, RSS to 3.7 GB.</li>
<li><p>Checking vmstat (obviously we start to swap):</p>
<pre><code>--------------------------procs -----------memory------------ ---swap-- ---io---- --system-- -----cpu------
Sun Jul 22 16:10:26 2012:  r  b    swpd    free  buff   cache   si   so   bi   bo   in    cs us sy id wa st
Sun Jul 22 16:48:41 2012:  0  0  138652 2502504 40360  706592    1    0  169   21 1047   206 20  1 74  4  0
. . .
Sun Jul 22 18:10:59 2012:  0  0  138648   24816 58600 1609212    0    0  124  669  913 24436 43 22 34  2  0
Sun Jul 22 19:10:22 2012: 33  1  138644   33304  4960 1107480    0    0  100  536  810 19536 44 22 23 10  0
Sun Jul 22 20:10:28 2012: 54  1  213916   26928  2864  578832    3  360  100  710  639 12702 43 16 30 11  0
Sun Jul 22 21:10:43 2012:  0  0  629256   26116  2992  467808   84  176  278 1320 1293 24243 50 19 29  3  0
Sun Jul 22 22:10:55 2012:  4  0  772168   29136  1240  165900  203   94  435 1188 1278 21851 48 16 33  2  0
Sun Jul 22 23:10:57 2012:  0  1 2429536   26280  1880  169816 6875 6471 7081 6878 2146  8447 18 37  1 45  0
</code></pre></li>
<li><p>sar also shows steady growth in %system, which I attribute to swapping:</p>
<pre><code>15:40:02  CPU   %user   %nice  %system  %iowait  %steal   %idle
17:40:01  all   51.00    0.00     7.81     3.04    0.00   38.15
19:40:01  all   48.43    0.00    18.89     2.07    0.00   30.60
20:40:01  all   43.93    0.00    15.84     5.54    0.00   34.70
21:40:01  all   46.14    0.00    15.44     6.57    0.00   31.85
22:40:01  all   44.25    0.00    20.94     5.43    0.00   29.39
23:40:01  all   18.24    0.00    52.13    21.17    0.00    8.46
12:40:02  all   22.03    0.00    41.70    15.46    0.00   20.81
</code></pre></li>
<li><p>Checking pmap gives the following largest contributors:</p>
<pre><code>000000005416c000 1505760K rwx-- [ anon ]
00000000b0000000 1310720K rwx-- [ anon ]
00002aaab9001000 2079748K rwx-- [ anon ]
</code></pre></li>
<li><p>Trying to correlate the addresses from pmap with what strace dumped gave me no matches.</p></li>
<li><p>Adding more memory is not practical (it would just make the problem appear later).</p></li>
<li>Switching JVMs is not possible (the environment is not under our control).</li>
</ol>
<p>And the question is: <strong>What else can I try to track down the cause of the problem or work around it?</strong></p>
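<p>To make the gap between the checks above concrete (heap looks fine, RSS keeps growing), here is a minimal sketch that compares the process RSS with what the JVM itself reports as committed heap and non-heap memory. It is not part of the application; the class name <code>RssGap</code> and both helper methods are hypothetical. It assumes Linux (it reads <code>/proc/self/status</code>) and Java 7 or later, and it would have to run inside the affected JVM, for example from a periodic task, to be meaningful. The JVM's non-heap figure covers its own memory pools such as the permanent generation and code cache, not thread stacks, direct buffers, or other native allocations, so a large remainder only hints that the growth is outside the managed pools.</p>

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class RssGap {

    // Extracts the VmRSS value (in kB) from the text of /proc/<pid>/status.
    // Returns -1 if no VmRSS line is present.
    static long parseVmRssKb(String statusText) {
        for (String line : statusText.split("\n")) {
            if (line.startsWith("VmRSS:")) {
                // The line looks like "VmRSS:   3879731 kB".
                String[] parts = line.substring("VmRSS:".length()).trim().split("\\s+");
                return Long.parseLong(parts[0]);
            }
        }
        return -1;
    }

    // RSS (kB) minus the memory the JVM reports as committed in its own pools.
    static long unaccountedKb(long rssKb, long heapCommittedBytes, long nonHeapCommittedBytes) {
        return rssKb - (heapCommittedBytes + nonHeapCommittedBytes) / 1024;
    }

    public static void main(String[] args) throws IOException {
        // Linux-only assumption: /proc/self/status describes the current process.
        String status = new String(
                Files.readAllBytes(Paths.get("/proc/self/status")), StandardCharsets.UTF_8);

        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        long heap = mem.getHeapMemoryUsage().getCommitted();
        long nonHeap = mem.getNonHeapMemoryUsage().getCommitted();
        long rssKb = parseVmRssKb(status);

        System.out.printf("rss=%d kB, heap committed=%d kB, non-heap committed=%d kB, unaccounted=%d kB%n",
                rssKb, heap / 1024, nonHeap / 1024, unaccountedKb(rssKb, heap, nonHeap));
    }
}
```

<p>Logging this line at intervals would show whether the unaccounted part grows while the heap and non-heap figures stay flat, which would match the pmap output listing large anonymous mappings beyond the configured heap.</p>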