StackOverflow2013

<p>Java <code>char</code>s are 2 bytes (16 bits unsigned) in size. So if you want 2MB you need about one million characters. There are two obvious issues with your code:</p>
<ol>
<li>Repeatedly calling <code>length()</code> is unnecessary. Add any character to a Java <code>String</code> and its length goes up by 1, regardless of what the character is. Perhaps you're confusing this with the size in bytes; <code>length()</code> counts characters, not bytes; and</li>
<li>You have huge memory fragmentation issues with that code.</li>
</ol>
<p>To further explain (2), the String concatenation operator (<code>+</code>) in Java causes a new <code>String</code> to be created because Java <code>String</code>s are immutable. So:</p>
<pre><code>String a = "a";
a += "b";
</code></pre>
<p>actually means:</p>
<pre><code>String a = "a";
a = a + "b"; // allocates a new String; the old "a" becomes garbage
</code></pre>
<p>This sometimes confuses former C++ programmers as strings work differently in C++.</p>
<p>So your code is actually allocating a million strings for a message size of one million. Only the last one is kept. The others are garbage that will be cleaned up, but there is no need to create them in the first place.</p>
<p>A better version is:</p>
<pre><code>private static String createDataSize(int msgSize) {
  // Preallocate the builder so it never has to grow.
  StringBuilder sb = new StringBuilder(msgSize);
  for (int i = 0; i &lt; msgSize; i++) {
    sb.append('a');
  }
  return sb.toString();
}
</code></pre>
<p>The key differences are:</p>
<ol>
<li>A <code>StringBuilder</code> is mutable so doesn't need to be reallocated with each change; and</li>
<li>The <code>StringBuilder</code> is preallocated to the right size in this code sample.</li>
</ol>
<p><strong>Note:</strong> the astute may have noticed I've done:</p>
<pre><code>sb.append('a');
</code></pre>
<p>rather than:</p>
<pre><code>sb.append("a");
</code></pre>
<p><code>'a'</code> of course is a single character, <code>"a"</code> is a <code>String</code>. You could use either in this case.</p>
<p>However, it's not that simple because it depends on how the bytes are encoded.
Typically, unless you specify otherwise, it'll use UTF-8, which is a variable-width encoding. So one million characters might be anywhere from 1MB to 4MB in size depending on how you end up encoding them, and your question doesn't contain details of that.</p>
<p>If you need data of a specific size and that data doesn't matter, my advice would be to simply use a <code>byte</code> array of the right size.</p>
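<p>As a rough sketch of that last suggestion, and of why the encoding matters, something like the following could be used. The class name <code>PayloadSize</code> and the helpers <code>createPayload</code> and <code>utf8Size</code> are made up for illustration and are not from the question; the fill byte is arbitrary, on the assumption that the content of the data doesn't matter.</p>

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class PayloadSize {

    // Hypothetical helper: returns exactly sizeInBytes bytes, each set to 'a'.
    // No String is involved, so no character encoding affects the size.
    static byte[] createPayload(int sizeInBytes) {
        byte[] payload = new byte[sizeInBytes];
        Arrays.fill(payload, (byte) 'a');
        return payload;
    }

    // Hypothetical helper: number of bytes a String occupies once encoded as UTF-8.
    static int utf8Size(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Exactly 2 MiB of data, independent of any encoding.
        byte[] payload = createPayload(2 * 1024 * 1024);
        System.out.println("payload bytes: " + payload.length);

        // Same length() in chars, different sizes in bytes once encoded.
        String ascii = "aaaa";
        String euros = "\u20AC\u20AC\u20AC\u20AC"; // four euro signs
        System.out.println(ascii.length() + " chars -> " + utf8Size(ascii) + " UTF-8 bytes");
        System.out.println(euros.length() + " chars -> " + utf8Size(euros) + " UTF-8 bytes");
        System.out.println(ascii.length() + " chars -> "
                + ascii.getBytes(StandardCharsets.UTF_16BE).length + " UTF-16BE bytes");
    }
}
```

<p>Passing an explicit <code>Charset</code> to <code>getBytes</code> avoids depending on the platform default, which is the part the question leaves unspecified.</p>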
