<h1>Java Refuses to Start - Could not reserve enough space for object heap</h1>
<h2>Background</h2>
<p>We have a pool of approximately 20 Linux blades. Some are running SuSE, some are running RedHat. ALL share NAS space, which contains the following 3 folders:</p>
<ul>
<li>/NAS/app/java - a symlink that points to an installation of a Java JDK. Currently version 1.5.0_10</li>
<li>/NAS/app/lib - a symlink that points to a version of our application.</li>
<li>/NAS/data - directory where our output is written</li>
</ul>
<p>All our machines have 2 processors (hyperthreaded) with 4gb of physical memory and 4gb of swap space. We limit the number of 'jobs' each machine can process at a given time to 6 (this number likely needs to change, but that does not enter into the current problem, so please ignore it for the time being).</p>
<p>Some of our jobs set a max heap size of 512mb; others set a max heap size of 2048mb. Again, we realize we could exceed our available memory if 6 jobs started on the same machine with the heap size set to 2048mb, but to our knowledge this has not yet occurred.</p>
<h2>The Problem</h2>
<p>Once in a while a job will fail immediately with the following message:</p>
<pre><code>Error occurred during initialization of VM
Could not reserve enough space for object heap
Could not create the Java virtual machine.
</code></pre>
<p>We used to chalk this up to too many jobs running at the same time on the same machine. The problem happened infrequently enough (<em>MAYBE</em> once a month) that we'd just restart the job and everything would be fine.</p>
<p>The problem has recently gotten much worse. All of our jobs that request a max heap size of 2048m now fail immediately almost every time and need to be restarted several times before completing.</p>
<p>We've gone out to individual machines and tried executing the jobs manually, with the same result.</p>
<h2>Debugging</h2>
<p>It turns out that the problem only exists for our SuSE boxes. 
The reason it has been happening more frequently is because we've been adding more machines, and the new ones are SuSE.</p>
<p>'cat /proc/version' on the SuSE boxes gives us:</p>
<pre><code>Linux version 2.6.5-7.244-bigsmp (geeko@buildhost) (gcc version 3.3.3 (SuSE Linux)) #1 SMP Mon Dec 12 18:32:25 UTC 2005
</code></pre>
<p>'cat /proc/version' on the RedHat boxes gives us:</p>
<pre><code>Linux version 2.4.21-32.0.1.ELsmp (bhcompile@bugs.build.redhat.com) (gcc version 3.2.3 20030502 (Red Hat Linux 3.2.3-52)) #1 SMP Tue May 17 17:52:23 EDT 2005
</code></pre>
<p>'uname -a' gives us the following on BOTH types of machines:</p>
<pre><code>UTC 2005 i686 i686 i386 GNU/Linux
</code></pre>
<p>No jobs are running on the machine, and no other processes are utilizing much memory. All of the processes currently running <em>might</em> be using 100mb total.</p>
<p>'top' currently shows the following:</p>
<pre><code>Mem:   4146528k total,  3536360k used,   610168k free,   132136k buffers
Swap:  4194288k total,        0k used,  4194288k free,  3283908k cached
</code></pre>
<p>'vmstat' currently shows the following:</p>
<pre><code>procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff   cache   si   so    bi    bo   in    cs us sy  id wa
 0  0      0 610292 132136 3283908    0    0     0     2   26    15  0  0 100  0
</code></pre>
<p>If we kick off a job with the following command line (max heap of 1850mb) it starts fine:</p>
<pre><code>java/bin/java -Xmx1850M -cp helloworld.jar HelloWorld
Hello World
</code></pre>
<p>If we bump up the max heap size to 1875mb it fails:</p>
<pre><code>java/bin/java -Xmx1875M -cp helloworld.jar HelloWorld
Error occurred during initialization of VM
Could not reserve enough space for object heap
Could not create the Java virtual machine.
</code></pre>
<p>It's quite clear that the memory currently being used is for buffering/caching, and that's why so little is being displayed as 'free'. 
What isn't clear is why there is a magical 1850mb line where anything higher means Java can't start.</p> <p>Any explanations would be greatly appreciated.</p>
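<p>For anyone who wants to reproduce the threshold, here is a minimal sketch of a class that reports the maximum heap the running JVM believes it has. <code>HeapReport</code> and its helper methods are hypothetical names made up for illustration and are not part of our application; the only library call used is <code>Runtime.maxMemory()</code>. The reported figure may differ slightly from the <code>-Xmx</code> value passed on the command line.</p>

```java
// Hypothetical helper for illustration only; not part of the application described above.
public class HeapReport {

    // Convert a byte count to whole megabytes (1 MB = 1024 * 1024 bytes).
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // Build the one-line report for a given maximum heap size in bytes.
    static String describe(long maxBytes) {
        return "Max heap available to this JVM: " + toMegabytes(maxBytes) + " MB";
    }

    public static void main(String[] args) {
        // Runtime.maxMemory() returns the maximum amount of memory
        // the JVM will attempt to use, in bytes.
        System.out.println(describe(Runtime.getRuntime().maxMemory()));
    }
}
```

<p>Assuming the compiled class sits in the current directory, it could be started the same way as the HelloWorld test, for example <code>java/bin/java -Xmx1850M -cp . HeapReport</code>. If the JVM cannot reserve the heap, it should fail with the same initialization error before <code>main</code> ever runs.</p>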