Accented characters are not indexed in Sphinx
<p>I have problems searching for words that contain accented characters. I use Sphinx 2.1.1 on Linux, with MsSQL 2005 via ODBC (FreeTDS).</p>
<p>Here is my sphinx.conf:</p>
<pre><code>source parentSource
{
    type = odbc
    ...
}

index parentIndex
{
    morphology = stem_en
    charset_type = utf-8
    charset_table = 0..9, a..z, A..Z-&gt;a..z, ... (mapping taken from http://sphinxsearch.com/wiki/doku.php?id=charset_tables for common, A-Z)
    ...
}
</code></pre>
<p>After changing the config, I reindexed all indexes and restarted searchd. When I search for "Muller", I get results that contain only "Muller". When I search for "Müller", I also get only "Muller" results. But there are also "Müller" records in the database, and they are not indexed properly. The mapping for ü (U+00FC-&gt;u) is present in the config. As far as I understand, after I added accented characters to charset_table, they are converted when I search, but not when the content is indexed.</p>
<p>When I run indexer with the --buildstops option, I find the following entry in the output file: "mller". And yes, when I search for "mller", I get the "Müller" results (but no "Muller", of course).</p>
<p>What do I need to do so that a search for either "Muller" or "Müller" returns both the "Muller" and the "Müller" records?</p>
<p>PS: the collation used for the column (and for the whole database) is SQL_LATIN1_GENERAL_CP1_CI_AS. I changed the column type from varchar to nvarchar, but it didn't help. "Müller" records are displayed properly on the site (without ???) and when I run indexer with --dump-rows.</p>
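<p>One possible reading of the "mller" entry, offered as a sketch and not as a confirmed diagnosis: the row data may reach the indexer in a single-byte encoding (for example CP1252, which would match the SQL_LATIN1_GENERAL_CP1_CI_AS collation) while the index expects UTF-8. The Python below is not Sphinx code. The names <code>FOLD</code>, <code>fold</code> and <code>index_keyword</code> are hypothetical, and the assumption that invalid bytes are silently dropped is only a guess at the indexer's behaviour. It shows how the same folding rule would yield "mller" for CP1252 input and "muller" for UTF-8 input.</p>

```python
# Illustration only: a toy model of "decode as UTF-8, then fold characters".
# Nothing here is Sphinx's real implementation.

# Hypothetical stand-in for the charset_table rule U+00FC->u.
FOLD = {"\u00fc": "u"}


def fold(text: str) -> str:
    """Lowercase the text and apply the folding table character by character."""
    return "".join(FOLD.get(ch, ch) for ch in text.lower())


def index_keyword(raw: bytes) -> str:
    """Treat incoming bytes as UTF-8, dropping invalid bytes (an assumption)."""
    text = raw.decode("utf-8", errors="ignore")
    return fold(text)


# The same name as it would arrive in two different encodings.
cp1252_bytes = "M\u00fcller".encode("cp1252")  # ü is the single byte 0xFC
utf8_bytes = "M\u00fcller".encode("utf-8")     # ü is the two bytes 0xC3 0xBC

print(index_keyword(cp1252_bytes))  # 0xFC is not valid UTF-8, so it is dropped
print(index_keyword(utf8_bytes))    # ü decodes correctly and is folded to u
```

<p>If this is what is happening, the charset_table mapping is fine and the thing to check is the encoding of the data delivered over ODBC, for example the <code>client charset</code> setting in freetds.conf.</p>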
 
