Solr: exact phrase query with an EdgeNGramFilterFactory
<p>In Solr (3.3), is it possible to make a field searchable letter by letter through an <code>EdgeNGramFilterFactory</code> and also sensitive to phrase queries?</p>
<p>For example, I'm looking for a field that, if it contains "contrat informatique", will be found when the user types any of the following:</p>
<ul>
<li>contrat</li>
<li>informatique</li>
<li>contr</li>
<li>informa</li>
<li>"contrat informatique"</li>
<li>"contrat info"</li>
</ul>
<p>Currently, I have something like this:</p>
<pre><code>&lt;fieldtype name="terms" class="solr.TextField"&gt;
  &lt;analyzer type="index"&gt;
    &lt;charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/&gt;
    &lt;filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/&gt;
    &lt;tokenizer class="solr.LowerCaseTokenizerFactory"/&gt;
    &lt;filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/&gt;
  &lt;/analyzer&gt;
  &lt;analyzer type="query"&gt;
    &lt;charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/&gt;
    &lt;filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/&gt;
    &lt;tokenizer class="solr.LowerCaseTokenizerFactory"/&gt;
  &lt;/analyzer&gt;
&lt;/fieldtype&gt;
</code></pre>
<p>...but it fails on phrase queries.</p>
<p>When I look at the schema analyzer in the Solr admin, I find that "contrat informatique" generates the following tokens:</p>
<pre><code>[...] contr contra contrat in inf info infor inform [...]
</code></pre>
<p>So the query works with "contrat in" (consecutive tokens), but not with "contrat inf" (because these two tokens are not adjacent).</p>
<p>I'm pretty sure any kind of stemming can work with phrase queries, but I cannot find the right tokenizer or filter to use before the <code>EdgeNGramFilterFactory</code>.</p>
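To make the symptom concrete, here is a minimal Python sketch of what I think is happening. It is not Solr code: the helper names (edge_ngrams, index_positions, phrase_matches) are made up for illustration, and it assumes that every gram gets its own consecutive position in the index, which is what the analyzer output above suggests. The query side is not n-grammed, so a phrase only matches when its words sit at adjacent positions.

```python
def edge_ngrams(word, min_gram=2, max_gram=15):
    """Return the front edge n-grams of a word, shortest first."""
    upper = min(len(word), max_gram)
    return [word[:n] for n in range(min_gram, upper + 1)]


def index_positions(text, min_gram=2, max_gram=15):
    """Map each gram to its positions, one new position per gram.

    ASSUMPTION: each gram is emitted at its own position, mirroring
    the token stream shown in the question.
    """
    positions = {}
    pos = 0
    for word in text.lower().split():
        for gram in edge_ngrams(word, min_gram, max_gram):
            positions.setdefault(gram, []).append(pos)
            pos += 1
    return positions


def phrase_matches(index, phrase):
    """True if the phrase terms occur at consecutive positions."""
    terms = phrase.lower().split()
    if not terms:
        return False
    return any(
        all(start + offset in index.get(term, [])
            for offset, term in enumerate(terms))
        for start in index.get(terms[0], [])
    )


index = index_positions("contrat informatique")

print(phrase_matches(index, "contr"))        # single-term prefix
print(phrase_matches(index, "contrat in"))   # "in" directly follows "contrat"
print(phrase_matches(index, "contrat inf"))  # "inf" is two positions away
```

Under that assumption, "contrat" lands at position 5, "in" at 6 and "inf" at 7, so only "contrat in" satisfies the adjacency check, which matches what I see in Solr.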