StackOverflow2013

<p>Exact phrase search does not work because the query slop parameter is 0 by default. A phrase query such as "Hello World" only matches terms at sequential positions, which the grams emitted by EdgeNGramFilter do not preserve. I wish EdgeNGramFilter had a parameter to control output positioning; this looks like an old <a href="http://web.archiveorange.com/archive/v/AAfXfzBA1VbRPUX6hW2p" rel="nofollow noreferrer">question</a>.</p>
<p>By setting the qs parameter to a very high value (more than the maximum distance between ngrams) you can get phrases back. This only partially solves the problem: phrases are matched, but not exactly, since permutations are found as well. For example, a search for "contrat informatique" would match text like "...contract abandoned. Informatique..."</p>
<p><img src="https://i.stack.imgur.com/ufi6l.png" alt="enter image description here"></p>
<p>To support <strong>exact</strong> phrase queries I ended up using <a href="https://stackoverflow.com/questions/1627689/how-to-search-for-text-fragments-in-a-database">separate fields for ngrams</a>.</p>
<p><strong>Steps required:</strong></p>
<p>Define separate field types to index regular values and grams:</p>
<pre><code><fieldType name="text" class="solr.TextField" omitNorms="false">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<fieldType name="ngrams" class="solr.TextField" omitNorms="false">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
</code></pre>
<p>Tell Solr to <a href="http://wiki.apache.org/solr/SchemaXml#Copy_Fields" rel="nofollow noreferrer">copy fields</a> when indexing.</p>
<p>You can define a separate ngrams copy for each field:</p>
<pre><code><field name="contact_ngrams" type="ngrams" indexed="true" stored="false"/>
<field name="product_ngrams" type="ngrams" indexed="true" stored="false"/>

<copyField source="contact_text" dest="contact_ngrams"/>
<copyField source="product_text" dest="product_ngrams"/>
</code></pre>
<p>Or you can put all ngrams into one field:</p>
<pre><code><field name="heap_ngrams" type="ngrams" indexed="true" stored="false"/>

<copyField source="*_text" dest="heap_ngrams"/>
</code></pre>
<p>Note that you will not be able to boost the source fields separately in this case.</p>
<p>The last step is to specify the ngrams fields and their boosts in the query. One way is to do this in your application. Another way is to add them as "appends" params in solrconfig.xml:</p>
<pre><code><lst name="appends">
  <str name="qf">heap_ngrams</str>
</lst>
</code></pre>
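<p>As a rough sketch of the per-field variant, boosts could be attached to each ngrams field in the same "appends" section. The handler name <code>/select</code>, the choice of the <code>dismax</code> parser and all boost values are assumptions made for illustration, not part of the setup above; only the field names come from the schema shown earlier:</p>
<pre><code><requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Assumption: a dismax-style parser, so that qf (and qs) apply -->
    <str name="defType">dismax</str>
    <!-- Hypothetical boosts: whole-word matches weigh more than grams -->
    <str name="qf">contact_text^10 product_text^5</str>
  </lst>
  <lst name="appends">
    <!-- Added on top of the qf taken from the request or the defaults -->
    <str name="qf">contact_ngrams^2 product_ngrams</str>
  </lst>
</requestHandler>
</code></pre>
<p>Boosting the regular fields above their ngrams copies should rank whole-word matches ahead of prefix matches; the actual values would need tuning against real queries.</p>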
