StackOverflow2013

<p>Rushik, here are a few ideas:</p>
<ul>
<li>Consider using <a href="http://lucene.apache.org/solr/" rel="nofollow">Solr</a>. It is much easier to get started with than bare Lucene.</li>
<li>Build a Lucene/Solr index of the file. It appears that one document per customer is enough, if you use a multi-valued field or two different fields for addresses.</li>
<li>Do you have a unique id per person? To use Solr, you need one. In Lucene, you can get away without a unique id.</li>
<li>Store the country code as a "keyword". If you only require an exact match for date of birth, you may do the same. For range queries, you will need another representation.</li>
<li>I assume your customer list is smaller than the file. A possible policy would be to index the changes in the file daily (here a unique id is really handy; otherwise you need to delete by query, which may miss the mark). Then you can optimize the index, and after that run a search for your updated customer list.</li>
<li>What you describe is a <a href="http://lucene.apache.org/java/3_0_1/api/core/org/apache/lucene/search/BooleanQuery.html" rel="nofollow">BooleanQuery</a> whose clauses are fuzzy queries for the first and last names and term queries for the other fields. You can create the query programmatically or by using the <a href="http://lucene.apache.org/java/3_0_2/queryparsersyntax.html" rel="nofollow">query parser</a>.</li>
<li>Consider using soundex for names as described <a href="http://sujitpal.blogspot.com/2007/12/spelling-checker-with-lucene.html" rel="nofollow">here</a>.</li>
</ul>
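<p>To make the soundex suggestion concrete, here is a minimal sketch of American Soundex in plain Java with no Lucene dependency. The class name <code>NameSoundex</code> and the method <code>encode</code> are hypothetical names chosen for illustration; this is not the code from the linked post, and a real setup would more likely rely on an existing phonetic encoder. One possible use, as an assumption, is to store the encoded value in an extra keyword field next to the raw name and match on it with a term query.</p>

```java
// Hypothetical helper: a plain American Soundex encoder for ASCII names.
final class NameSoundex {
    // Soundex digit for each letter A..Z; '0' marks letters that get no digit
    // (vowels plus H, W and Y).
    private static final String CODES = "01230120022455012623010202";

    private NameSoundex() {}

    static String encode(String name) {
        StringBuilder out = new StringBuilder(4);
        char lastCode = '0';
        for (int i = 0; i < name.length() && out.length() < 4; i++) {
            char c = Character.toUpperCase(name.charAt(i));
            if (c < 'A' || c > 'Z') {
                continue; // ignore spaces, punctuation and non-ASCII characters
            }
            char code = CODES.charAt(c - 'A');
            if (out.length() == 0) {
                // The first letter is kept as-is, but its code still takes
                // part in the duplicate check for the next letter.
                out.append(c);
                lastCode = code;
            } else if (c == 'H' || c == 'W') {
                // H and W do not separate two letters with the same code,
                // so lastCode is left unchanged.
            } else {
                if (code != '0' && code != lastCode) {
                    out.append(code);
                }
                // A vowel resets lastCode, so the same code may repeat after it.
                lastCode = code;
            }
        }
        if (out.length() == 0) {
            return ""; // no encodable letters in the input
        }
        while (out.length() < 4) {
            out.append('0'); // pad short codes with zeros
        }
        return out.toString();
    }
}
```

<p>With this sketch, similar-sounding spellings such as "Robert" and "Rupert" encode to the same four-character key, which is why a phonetic field can catch matches that an exact term query on the name would miss.</p>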
