How to index artist/band names using Lucene/Solr out of my MySQL server
<p>I'm interested in indexing full names of artists/bands using Lucene/Solr out of my MySQL server.</p>
<p>I have a DB table called 'entity_aliases' which holds many variations of the names of bands/artists in my system. The table looks like this:</p>
<pre><code>entity_aliases int(11) auto inc. PK
entity_type enum(artist, band)
entity_id int(11)
entity_alias varchar(100) + full text search index.
</code></pre>
<p>Example entity_alias (field) values:</p>
<pre><code>Beyoncé
Beyoncé Giselle Knowles
Giselle Knowles
...
</code></pre>
<p><strong>General explanation about the type of queries I'd like to perform:</strong></p>
<p>My service needs to provide information about artists/bands. In order to do so, my clients need to provide me with the entity name.</p>
<p>My clients sometimes provide an entity name with typos, or a name that is not found exactly in the DB (in our case "Beyonce Knowles"; also note the European "é").</p>
<p>So the demands are:</p>
<ol>
<li>I'm using sharded MySQL, so 'entity_aliases' is also sharded. The solution needs to index more than one MySQL server.</li>
<li>It needs to support 80M names.</li>
<li>Nice to have: ignore/overcome minor typos or European characters (fuzzy search).</li>
<li>It needs to be supported by PHP (CakePHP).</li>
<li>Entity names probably won't exceed 20-25 chars.</li>
<li>The query itself is very simple: I provide a "name" and in return I'd like to get a list of similar entities (entity_id and entity_type) and, if possible, a score.</li>
<li>I need to index entities on the fly, and changes to the index should take effect immediately.</li>
</ol>
<p><strong>Things I'd like to know:</strong></p>
<ol>
<li>Is this doable using Lucene/Solr?</li>
<li>Is there any better solution that I should consider?</li>
<li>What should my schema look like?</li>
</ol>
<p>Thanks!</p>
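To make the matching behaviour I'm after concrete, here is a minimal in-memory sketch in Python. It is only an illustration of the desired input and output, not a Solr configuration and not something that would scale to 80M names. The function names `fold` and `similar`, the 0.6 threshold, and the "Radiohead" row are made up for the example; the similarity score comes from the standard library's `difflib`, which is just a stand-in for whatever scoring a real search engine would provide.

```python
import unicodedata
from difflib import SequenceMatcher


def fold(name: str) -> str:
    """Strip combining accents and normalise case, e.g. "Beyoncé" -> "beyonce"."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_accents.casefold()


def similar(query, aliases, min_score=0.6):
    """Return (score, entity_id, entity_type, alias) rows, best match first.

    `aliases` is an iterable of (entity_id, entity_type, entity_alias) tuples,
    mirroring the columns of the entity_aliases table. The threshold is an
    arbitrary value chosen for this illustration.
    """
    folded_query = fold(query)
    scored = []
    for entity_id, entity_type, alias in aliases:
        score = SequenceMatcher(None, folded_query, fold(alias)).ratio()
        if score >= min_score:
            scored.append((score, entity_id, entity_type, alias))
    return sorted(scored, key=lambda row: row[0], reverse=True)


# Sample rows: the first three come from the example values above,
# the last one is a hypothetical unrelated entity.
ALIASES = [
    (1, "artist", "Beyoncé"),
    (1, "artist", "Beyoncé Giselle Knowles"),
    (1, "artist", "Giselle Knowles"),
    (2, "band", "Radiohead"),
]

results = similar("Beyonce Knowles", ALIASES)
for score, entity_id, entity_type, alias in results:
    print(f"{score:.2f}  {entity_id}  {entity_type}  {alias}")
```

The point of the sketch is the shape of the query: a possibly misspelled, unaccented name goes in, and a scored list of (entity_id, entity_type) candidates comes out.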