<p>Unfortunately, doing this in PHP is prohibitively expensive (high CPU and memory utilization). However, you can certainly apply the algorithm to small data sets.</p>
<p>To expand on how you can create a server meltdown: a couple of built-in PHP functions will determine the "distance" between strings: <strong>levenshtein</strong> and <strong>similar_text</strong>.</p>
<p>Dummy data (pretend they're news headlines):</p>
<pre>$titles = &lt;&lt;&lt;EOF
Apple
Apples
Orange
Oranges
Banana
EOF;

$titles = explode("\n", $titles);</pre>
<p>At this point, <strong>$titles</strong> should just be an array of strings. Now, create a matrix and compare each headline against EVERY other headline for similarity. In other words, for 5 headlines you will get a 5 x 5 matrix (25 entries). The number of comparisons grows with the square of the number of headlines, and that is where the CPU and memory go.</p>
<p>That's why this method (via PHP) can't be applied to thousands of entries. But if you wanted to:</p>
<pre>$matches = array();
foreach( $titles as $title )
{
    $matches[$title] = array();
    foreach( $titles as $compare_to )
    {
        // edit distance between the two headlines (0 means identical)
        $matches[$title][$compare_to] = levenshtein( $compare_to, $title );
    }
    // sort each row so the closest matches come first
    asort( $matches[$title], SORT_NUMERIC );
}</pre>
<p>At this point what you basically have is a matrix of "text distances." Conceptually (this is not the real data structure) it looks like the table below. Note the 0 values along the diagonal: that is where the matching loop compares a headline with itself, so the distance is zero.</p>
<pre>
          Apple  Apples  Orange  Oranges  Banana
Apple       0      1       5       6        6
Apples      1      0       6       5        6
Orange      5      6       0       1        5
Oranges     6      5       1       0        5
Banana      6      6       5       5        0
</pre>
<p>The actual $matches array looks sort of like this (truncated):</p>
<pre>Array
(
    [Apple] => Array
        (
            [Apple] => 0
            [Apples] => 1
            [Orange] => 5
            [Banana] => 6
            [Oranges] => 6
        )

    [Apples] => Array
        (
        ...
</pre>
<p>Anyhow, it's up to you to determine (by experimentation) a numerical distance cutoff that catches most of the matches you care about, and then apply it. Otherwise, read up on Sphinx search and use it, since it does have PHP libraries.</p>
<p>Orange you glad you asked about this?</p>
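<p>The cutoff step described above could be sketched roughly like this. The function name <strong>similar_titles</strong> and the cutoff value of 2 are assumptions made up for this illustration (they are not part of the answer above), and the sketch assumes every title is unique, since titles are used as array keys.</p>

```php
<?php
// Hypothetical helper (an illustration, not part of the original answer):
// for each title, keep only the other titles whose Levenshtein distance
// is at or below $cutoff. Assumes all titles are unique.
function similar_titles(array $titles, int $cutoff): array
{
    $groups = [];
    foreach ($titles as $title) {
        $groups[$title] = [];
        foreach ($titles as $compare_to) {
            if ($compare_to === $title) {
                continue; // skip the zero-distance match of a title with itself
            }
            $distance = levenshtein($title, $compare_to);
            if ($distance <= $cutoff) {
                $groups[$title][$compare_to] = $distance;
            }
        }
        // closest matches first
        asort($groups[$title], SORT_NUMERIC);
    }
    return $groups;
}

$titles = ['Apple', 'Apples', 'Orange', 'Oranges', 'Banana'];

// A cutoff of 2 is an assumption that happens to suit this toy data;
// find a value for real headlines by experiment.
print_r(similar_titles($titles, 2));
```

<p>With the dummy data this pairs Apple with Apples and Orange with Oranges, and leaves Banana with no matches. It is still the same all-against-all comparison, so it only makes sense for small data sets.</p>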