StackOverflow2013

Note that there are some explanatory texts on larger screens.

plurals

PO
text
Body
copied!<p>There is no way to do it in JavaScript natively.</p> <p>If you know the character encoding, you can calculate it yourself though.</p> <p><code>encodeURIComponent</code> assumes UTF-8 as the character encoding, so if you need that encoding, you can do,</p> <pre><code>function lengthInUtf8Bytes(str) { // Matches only the 10.. bytes that are non-initial characters in a multi-byte sequence. var m = encodeURIComponent(str).match(/%[89ABab]/g); return str.length + (m ? m.length : 0); } </code></pre> <p>This should work because of the way UTF-8 encodes multi-byte sequences. The first encoded byte always starts with either a high bit of zero for a single byte sequence, or a byte whose first hex digit is C, D, E, or F. The second and subsequent bytes are the ones whose first two bits are 10. Those are the extra bytes you want to count in UTF-8.</p> <p>The table in <a href="http://en.wikipedia.org/wiki/UTF-8" rel="noreferrer">wikipedia</a> makes it clearer</p> <pre><code>Bits Last code point Byte 1 Byte 2 Byte 3 7 U+007F 0xxxxxxx 11 U+07FF 110xxxxx 10xxxxxx 16 U+FFFF 1110xxxx 10xxxxxx 10xxxxxx ... </code></pre> <p>If instead you need to understand the page encoding, you can use this trick:</p> <pre><code>function lengthInPageEncoding(s) { var a = document.createElement('A'); a.href = '#' + s; var sEncoded = a.href; sEncoded = sEncoded.substring(sEncoded.indexOf('#') + 1); var m = sEncoded.match(/%[0-9a-f]{2}/g); return sEncoded.length - (m ? m.length * 2 : 0); } </code></pre>

Querying!

Guidance

An individual column

Larger individual text columns get their own page to allow for proper reading.

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload