How can I tokenize a string in MySQL?
    <p><strong>My project is importing a sizable collection (500K+ rows) of data from flat Excel files</strong>, which are created manually by a team of people. The problem is that it all needs to be normalized for client searching. For example, the company field contains multiple spellings of the same company and includes branches, such as "IBM", "IBM Inc." and "IBM Japan". Additionally, I have product names that are alphanumeric, such as "A46-Rhizonme Pentahol", which <strong>SOUNDEX alone cannot handle</strong>.</p> <p>I can solve the issue in the long term by having all data entered through a web form with an <strong>AJAX auto-suggest</strong>. Until then, however, I still need to deal with the massive collection of existing data. This brings me to what I believe is a good process, based on what I've read here:</p> <p><a href="http://msdn.microsoft.com/en-us/magazine/cc163731.aspx" rel="nofollow">http://msdn.microsoft.com/en-us/magazine/cc163731.aspx</a></p> <p>Steps to create a custom Fuzzy Logic Lookup and Fuzzy Logic Grouping:</p> <ol> <li>tokenize strings into keywords</li> <li>calculate keyword TF-IDF (term frequency - inverse document frequency)</li> <li>calculate Levenshtein distance between keywords</li> <li>calculate Soundex on available alpha strings</li> <li>determine context of keywords</li> <li>place keywords, based on context, into separate DB tables, such as "Companies", "Products", "Ingredients"</li> </ol> <p>I've been Googling, searching StackOverflow, and reading MySQL.com discussions about this issue to try to find a prebuilt solution. Any ideas?</p>
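To make the first three steps concrete, here is a minimal sketch in Python, assuming the rows are first exported from MySQL and processed in application code rather than inside the database. The function names (`tokenize`, `tf_idf`, `levenshtein`) and the tokenization rule (lowercase, keep hyphenated alphanumeric tokens whole) are illustrative assumptions, not part of the process described in the linked article.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric keywords.

    Assumption: a hyphen joining two alphanumeric runs stays inside the
    token, so a product code such as "A46-Rhizonme" is kept whole.
    """
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def tf_idf(documents):
    """Return one {token: weight} dict per input string.

    tf  = occurrences of the token in the row / tokens in the row
    idf = log(number of rows / number of rows containing the token)
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each token once per row
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty rows
        weights.append({
            tok: (cnt / total) * math.log(n_docs / doc_freq[tok])
            for tok, cnt in counts.items()
        })
    return weights


def levenshtein(a, b):
    """Edit distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


companies = ["IBM", "IBM Inc.", "IBM Japan"]
weights = tf_idf(companies)
```

With these three sample rows, "ibm" appears in every row, so its weight is zero, while "inc" and "japan" get positive weights. That is the property that lets the common company name be separated from branch qualifiers. The Levenshtein distance can then be used to cluster near-identical keywords (misspellings) before the Soundex and context steps.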
 
