
    <p><strong>Solution</strong></p>
    <p>Run the JVM with the <code>-XX:+UseCondCardMark</code> flag, which is available only from JDK 7 onward. This solves the problem.</p>
    <p><strong>Explanation</strong></p>
    <p>Most managed-heap environments use card tables to record the areas of memory into which writes occurred. Such an area is marked as <em>dirty</em> in the card table once a write to it occurs. The garbage collector needs this information: references in non-dirty memory areas don't have to be scanned. A card is a contiguous block of memory, typically 512 bytes. A card table typically has 1 byte for each card; if this byte is set, the card is dirty. This means that 64 bytes of the card table cover 64 * 512 bytes (32 KB) of memory. And the typical cache line size today is 64 bytes.</p>
    <p>So each time a write to an object field occurs, the byte of the corresponding card in the card table must be set as dirty. A useful optimization in single-threaded programs is to do this by simply marking the relevant byte, that is, by doing a write every time. The alternative, first checking whether the byte is set and writing only if it isn't, requires an additional read and a conditional jump, which is slightly slower.</p>
    <p>However, this optimization can be catastrophic when multiple processors write to memory, because writes to neighbouring cards require writes to neighbouring bytes in the card table. The memory areas being written to (the entries in the array above) are not in the same cache line, which would be the usual cause of memory contention. The real cause is that the dirty bytes being written are in the same cache line.</p>
    <p>What the above flag does is implement the card-table dirty-byte write by first checking whether the byte is already set, and setting it only if it isn't. This way the memory contention happens only during the first write to that card; after that, only reads of that cache line occur. Since the cache line is only read, it can be replicated across multiple processors and they don't have to synchronize to read it.</p>
    <p>I've observed that this flag increases the running time by some 15-20% in the 1-thread case.</p>
    <p>The <code>-XX:+UseCondCardMark</code> flag is explained in this <a href="http://blogs.oracle.com/dave/entry/false_sharing_induced_by_card" rel="noreferrer">blog post</a> and this <a href="http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7029167" rel="noreferrer">bug report</a>.</p>
    <p>The relevant concurrency mailing list discussion: <a href="http://old.nabble.com/Array-allocation-and-access-on-the-JVM-to33203471.html" rel="noreferrer">Array allocation and access on the JVM</a>.</p>
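To make the arithmetic and the two marking strategies concrete, here is a minimal sketch in plain Java. It is only a model, not the JVM's actual write barrier: the class `CardTableSketch`, its method names, and the value used for `DIRTY` are made up for illustration, and the 512-byte card and 64-byte cache line are the typical sizes mentioned above, not guaranteed values. The `stores` counter stands in for the writes that would invalidate a shared cache line.

```java
// Illustrative model only. The names and the DIRTY encoding are assumptions,
// not the JVM's real card-marking code.
final class CardTableSketch {
    static final int CARD_SHIFT = 9;         // 2^9 = 512-byte cards (typical size)
    static final int CACHE_LINE_BYTES = 64;  // typical cache line; 1 card-table byte per card
    static final byte DIRTY = 1;             // hypothetical encoding of "dirty"

    final byte[] cards;  // one byte per card
    long stores;         // number of actual writes into the card table

    CardTableSketch(int heapBytes) {
        // Round up so a partially filled last card still gets a byte.
        cards = new byte[(heapBytes + (1 << CARD_SHIFT) - 1) >>> CARD_SHIFT];
    }

    // Which card covers the given offset into the (modelled) heap.
    static int cardIndex(long heapOffset) {
        return (int) (heapOffset >>> CARD_SHIFT);
    }

    // Which cache line of the card table holds that card's byte.
    static int cacheLineOfCard(int cardIndex) {
        return cardIndex / CACHE_LINE_BYTES;
    }

    // Default strategy: always write the byte, even if it is already dirty.
    void markUnconditional(long heapOffset) {
        cards[cardIndex(heapOffset)] = DIRTY;
        stores++;
    }

    // Conditional strategy: read first, write only if the card is still clean.
    void markConditional(long heapOffset) {
        int i = cardIndex(heapOffset);
        if (cards[i] != DIRTY) {
            cards[i] = DIRTY;
            stores++;
        }
    }

    public static void main(String[] args) {
        // Two offsets almost 32 KB apart still share one card-table cache line.
        System.out.println(cacheLineOfCard(cardIndex(0)) == cacheLineOfCard(cardIndex(63L * 512)));

        CardTableSketch always = new CardTableSketch(1 << 20);
        CardTableSketch conditional = new CardTableSketch(1 << 20);
        for (int k = 0; k < 1000; k++) {
            always.markUnconditional(4096);
            conditional.markConditional(4096);
        }
        System.out.println("unconditional stores: " + always.stores);
        System.out.println("conditional stores:   " + conditional.stores);
    }
}
```

In this model, repeated writes to the same card cost one card-table store with the conditional strategy and one store per write with the unconditional one. On real hardware, each of those extra stores can force other cores to reload the shared cache line, which is the contention described above.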