<p>For any element <code>array[i]</code>, you can determine whether <code>i</code> points to a single-byte character, the start of a 2-byte character, or the middle of a 2-byte character using the following test:</p>
<p>Starting at <code>array[i-1]</code> and moving backwards, count the number of contiguous bytes whose MSB is 1.</p>
<p>If <code>array[i]</code> is preceded by an ODD number of 1's, then <code>array[i]</code> is the middle of a 2-byte character.</p>
<p>If <code>array[i]</code> is preceded by an EVEN number of 1's, then if <code>MSB(array[i])</code> is 0, <code>array[i]</code> is a single-byte character; otherwise <code>array[i]</code> is the start of a 2-byte character.</p>
<p>Since we're trying to delete the character just before <code>array[i]</code>, once you have determined whether <code>array[i]</code> is the start or the middle of a character, you have to run the same test on <code>array[i-x]</code>, where <code>x</code> is 1 if <code>array[i]</code> points to the start of a character and 2 if it points to the middle of one.</p>
<hr>
<p>Edit (What happens when arr[0] is 1-byte, and arr[1] is 2-byte?):</p>
<p>Firstly, more detail about the search for contiguous 1's: when counting contiguous 1's, the loop stops if we reach array[0] or if MSB(array[j-1])==0.</p>
<pre><code>odd = 0
j = i
while( j &amp;&amp; MSB(arr[j-1]) )
    j -= 1
    odd ^= 1    // binary XOR
</code></pre>
<p>When the loop completes, odd will be 1 if there is an odd number of contiguous 1's, and odd will be 0 if there are zero or an even number of contiguous 1's.</p>
<hr>
<p>If we have an array with a 1-byte character in arr[0] and a 2-byte character in arr[1] and arr[2], then i can only take the values 0, 1, or 2.</p>
<ul>
<li>i=0: The loop never runs because i==0. We consider there to be an <strong>EVEN</strong> number of preceding 1's because odd==0. The MSB of arr[i] is 0, so <strong>arr[i] is the start of a 1-byte character</strong>.</li>
<li>i=1: The loop never runs because MSB(arr[i-1]) is 0. We consider there to be an <strong>EVEN</strong> number of preceding 1's because odd==0. The MSB of arr[i] is 1, so <strong>arr[i] is the start of a 2-byte character</strong>.</li>
<li>i=2: The loop runs one time. We find an <strong>ODD</strong> number of preceding 1's. Because there is an odd number of preceding 1's, <strong>arr[i] is the middle of a 2-byte character</strong>.</li>
</ul>
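<p>The test and the deletion step above can be sketched in Python as follows. This is only an illustration under an assumed encoding: a byte with MSB 0 at a character boundary is a 1-byte character, and a byte with MSB 1 at a character boundary starts a 2-byte character whose second byte may hold any value. The names <code>msb</code>, <code>classify</code> and <code>delete_previous</code> are made up for this sketch, and the input is assumed to be well formed with <code>0 &lt;= i &lt; len(arr)</code>.</p>

```python
def msb(byte: int) -> int:
    """Return the most significant bit (0 or 1) of an 8-bit value."""
    return (byte >> 7) & 1


def classify(arr, i: int) -> str:
    """Classify arr[i] as 'single', 'start' or 'middle' (hypothetical helper)."""
    # Walk backwards over the run of MSB==1 bytes that ends at arr[i-1],
    # keeping only the parity of its length.
    odd = 0
    j = i
    while j > 0 and msb(arr[j - 1]):
        j -= 1
        odd ^= 1
    if odd:
        return "middle"  # odd run: arr[i] is the second byte of a 2-byte character
    return "start" if msb(arr[i]) else "single"


def delete_previous(arr: bytearray, i: int) -> int:
    """Delete the character before the one containing arr[i].

    Returns the number of bytes removed (0 if no character precedes it).
    """
    # x is 1 if arr[i] starts a character, 2 if it is the middle of one.
    x = 2 if classify(arr, i) == "middle" else 1
    last = i - x  # index of the last byte of the character to delete
    if last < 0:
        return 0
    # Run the same test on arr[i-x]: 'middle' means a 2-byte character.
    width = 2 if classify(arr, last) == "middle" else 1
    del arr[last - width + 1 : last + 1]
    return width


if __name__ == "__main__":
    data = bytearray([0x41, 0x81, 0x82, 0x42])  # 1-byte, 2-byte, 1-byte
    print([classify(data, k) for k in range(len(data))])
    print(delete_previous(data, 3), list(data))
```

<p>Because only the parity of the run is tracked, the backward scan needs no extra storage. In the worst case it walks back to <code>arr[0]</code>, which happens when every preceding byte has its MSB set.</p>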
 
