Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p>Jawa posted this idea originally; here is my attempt.</p> <p><code>^</code> is the XOR function. It compares 2 binary numbers bit-by-bit and returns 0 if both bits are the same, and 1 otherwise.</p> <pre><code> 0 1 0 0 0 1 0 1 0 1 1 1 (number 1) ^ 0 1 1 1 0 1 0 1 1 0 1 1 (number 2) = 0 0 1 1 0 0 0 0 1 1 0 0 (result) </code></pre> <p>How this applies to your problem:</p> <pre><code> // In binary... 1111 ^ 0111 = 1000 // (1 bit out of 4 didn't match: 75% match) 1111 ^ 0000 = 1111 // (4 bits out of 4 didn't match: 0% match) // The same examples, except now in decimal... 15 ^ 7 = 8 (1000 in binary) // (1 bit out of 4 didn't match: 75% match) 15 ^ 0 = 15 (1111 in binary) // (4 bits out of 4 didn't match: 0% match) </code></pre> <p>How we can count these bits in MySQL:</p> <pre><code>BIT_COUNT(b'0111') = 3 // Bit count of binary '0111' BIT_COUNT(7) = 3 // Bit count of decimal 7 (= 0111 in binary) BIT_COUNT(b'1111' ^ b'0111') = 1 // (1 bit out of 4 didn't match: 75% match) </code></pre> <p>So to get the <em>similarity</em>...</p> <pre><code>// First we focus on calculating mismatch. (BIT_COUNT(b'1111' ^ b'0111') / YOUR_TOTAL_BITS) = 0.25 (25% mismatch) (BIT_COUNT(b'1111' ^ b'1111') / YOUR_TOTAL_BITS) = 0 (0% mismatch; 100% match) // Now, getting the proportion of matched bits is easy 1 - (BIT_COUNT(b'1111' ^ b'0111') / YOUR_TOTAL_BITS) = 0.75 (75% match) 1 - (BIT_COUNT(b'1111' ^ b'1111') / YOUR_TOTAL_BITS) = 1.00 (100% match) </code></pre> <p>If we could just make your <code>about_member</code> field store data as bits (and be represented by an integer), we could do all of this easily! Instead of <code>1-2-1-1-1</code>, use <code>0-1-0-0-0</code>, but without the dashes.</p> <p>Here's how PHP can help us:</p> <pre><code>bindec('01000') == 8; bindec('00001') == 1; decbin(8) == '01000'; decbin(1) == '00001'; </code></pre> <p>And finally, here's the implementation:</p> <pre><code>// Setting a member's about_member property... $about_member = '01100101'; $about_member_int = bindec($about_member); $query = "INSERT INTO members (name,about_member) VALUES ($name,$about_member_int)"; // Getting matches... $total_bits = 8; // The maximum length the member_about field can be (8 in this example) $my_member_about = '00101100'; $my_member_about_int = bindec($my_member_about_int); $query = " SELECT *, (1 - (BIT_COUNT(member_about ^ $my_member_about_int) / $total_bits)) match FROM members ORDER BY match DESC LIMIT 10"; </code></pre> <p>This last query will have selected the 10 members most similar to me!</p> <p><strong>Now, to recap, in layman's terms,</strong></p> <p>We use binary because it makes things easier; the binary number is like a long line of light switches. We want to save our "light switch configuration" as well as find members that have the most similar configurations.</p> <p>The <code>^</code> operator, given 2 light switch configurations, does a comparison for us. The result is again a series of switches; a switch will be <code>ON</code> if the 2 original switches were in different positions, and <code>OFF</code> if they were in the same position.</p> <p><code>BIT_COUNT</code> tells us how many switches are <code>ON</code>--giving us a count of how many switches were different. <code>YOUR_TOTAL_BITS</code> is the total number of switches.</p> <p>But binary numbers are still just numbers... and so a string of 1's and 0's really just represents a number like 133 or 94. But it's a lot harder to visualize our "light switch configuration" if we use decimal numbers. That's where PHP's <code>decbin</code> and <code>bindec</code> come in.</p> <p><a href="http://en.wikipedia.org/wiki/Binary_numeral_system" rel="nofollow noreferrer">Learn more about the binary numeral system.</a></p> <p>Hope this helps!</p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload