Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <blockquote> <p>In my example code below, will every <code>$hash</code> be unique?</p> </blockquote> <p><em>Almost.</em> (Which, I guess, means "no, but in a way that's easily fixed".) Your function consists of a sequence of independent steps; the overall function is bijective (reversible) if and only if every single one of those steps is. (Do you see why?)</p> <p>Now, each step has one of the following forms:</p> <pre><code> $key = ($key ^ CONSTANT) ^ ($key &gt;&gt; NUM_BITS); $key = ($key ^ CONSTANT) ^ ($key &lt;&lt; NUM_BITS); </code></pre> <p>with <code>NUM_BITS != 0</code>.</p> <p>We can actually treat these as variants of a single form, by viewing the former as <em>almost</em> equivalent to this:</p> <pre><code> $key = invert_order_of_bits($key); # clearly bijective $constant = invert_order_of_bits(CONSTANT); $key = ($key ^ $constant) ^ ($key &lt;&lt; NUM_BITS); $key = invert_order_of_bits($key); # clearly bijective </code></pre> <p>So all we need is to show that this:</p> <pre><code> $key = ($key ^ CONSTANT) ^ ($key &lt;&lt; NUM_BITS); </code></pre> <p>is bijective. Now, XOR is commutative and associative, so the above is equivalent to this:</p> <pre><code> $key = $key ^ ($key &lt;&lt; NUM_BITS); $key = $key ^ CONSTANT; </code></pre> <p>and <code>(x ^ y) ^ y == x ^ (y ^ y) == x ^ 0 == x</code>, so clearly XOR-ing with a constant value is reversible (by re-XOR-ing with the same value); so all we have to show is that this is bijective:</p> <pre><code> $key = $key ^ ($key &lt;&lt; NUM_BITS); </code></pre> <p>whenever <code>NUM_BITS != 0</code>.</p> <p>Now, I'm not writing a rigorous proof, so I'll just give a <em>single</em> reasoned-out example of how to reverse this. Suppose that <code>$key ^ ($key &lt;&lt; 9)</code> is</p> <pre><code>0010 1010 1101 1110 0010 0101 0000 1100 </code></pre> <p>How do we obtain <code>$key</code>? Well, we know that the last nine bits of <code>$key &lt;&lt; 9</code> are all zeroes, so we know that the last nine bits of <code>$key ^ ($key &lt;&lt; 9)</code> are the same as the last nine bits of <code>$key</code>. So <code>$key</code> looks like</p> <pre><code>bbbb bbbb bbbb bbbb bbbb bbb1 0000 1100 </code></pre> <p>so <code>$key &lt;&lt; 9</code> looks like</p> <pre><code>bbbb bbbb bbbb bb10 0001 1000 0000 0000 </code></pre> <p>so <code>$key</code> looks like</p> <pre><code>bbbb bbbb bbbb bb00 0011 1101 0000 1100 </code></pre> <p>(by XOR-ing <code>$key ^ ($key &lt;&lt; 9)</code> with <code>$key &lt;&lt; 9</code>), so <code>$key &lt;&lt; 9</code> looks like</p> <pre><code>bbbb b000 0111 1010 0001 1000 0000 0000 </code></pre> <p>so <code>$key</code> looks like</p> <pre><code>bbbb b010 1010 0100 0011 1101 0000 1100 </code></pre> <p>so <code>$key &lt;&lt; 9</code> looks like</p> <pre><code>0101 1000 0111 1010 0001 1000 0000 0000 </code></pre> <p>so <code>$key</code> looks like</p> <pre><code>0111 0010 1010 0100 0011 1101 0000 1100 </code></pre> <p>So . . . why do I say "almost" rather than "yes"? Why is your hash-function not <em>perfectly</em> bijective? It's because in PHP, the bitwise shift operators <code>&gt;&gt;</code> and <code>&lt;&lt;</code> are not <em>quite</em> symmetric, and while <code>$key = $key ^ ($key &lt;&lt; NUM_BITS)</code> is completely reversible, <code>$key = $key ^ ($key &gt;&gt; NUM_BITS)</code> is not. (Above, when I wrote that the two types of steps were "<em>almost</em> equivalent", I really <em>meant</em> that "almost". It makes a difference!) You see, whereas <code>&lt;&lt;</code> treats the sign bit just like any other bit, and shifts it out of existence (bringing in a zero-bit on the right), <code>&gt;&gt;</code> treats the sign bit specially, and "extends" it: the bit that it brings in on the left is equal to the sign bit. (N.B. Your question mentions "unsigned 32bit" values, but PHP doesn't actually support that; its bitwise operations are always on <em>signed</em> integers.)</p> <p>Due to this sign extension, if <code>$key</code> starts with a <code>0</code>, then <code>$key &gt;&gt; NUM_BITS</code> starts with a <code>0</code>, and if <code>$key</code> starts with a <code>1</code>, then <code>$key &gt;&gt; NUM_BITS</code> also starts with a <code>1</code>. In either case, <code>$key ^ ($key &gt;&gt; NUM_BITS)</code> will start with a <code>0</code>. You've lost exactly one bit of entropy. If you give me <code>$key ^ ($key &gt;&gt; 9)</code>, and don't tell me whether <code>$key</code> is negative, then the best I can do is compute two possible values for <code>$key</code>: one negative, one positive-or-zero.</p> <p>You perform two steps that use right-shift instead of left-shift, so you lose two bits of entropy. (I'm hand-waving slightly &mdash; all I've actually demonstrated is that you lose <em>at least</em> one bit and <em>at most</em> two bits &mdash; but I'm confident that, due to the nature of the steps between those right-shift steps, you do actually lose two full bits.) For any given output value, there are exactly four distinct input-values that could yield it. So it's not unique, but it's <em>almost</em> unique; and it's easily fixed, by either:</p> <ul> <li>changing the two right-shift steps to use left-shifts instead; or</li> <li>moving both of the right-shift steps to the start of the function, before any left-shift steps, and saying that outputs are unique for inputs between 0 and 2<sup>31</sup>&minus;1 rather than inputs between 0 and 2<sup>32</sup>&minus;1.</li> </ul>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload