What is an effective way to search world-wide location names with ElasticSearch?
<p>I have location information provided by <a href="http://www.geonames.org/" rel="nofollow noreferrer">GeoNames.org</a> parsed into a relational database. Using this information, I am attempting to build an ElasticSearch index that contains populated place (city) names, administrative division (state, province, etc.) names, country names and country codes. My goal is to provide a location search that is similar to Google Maps':</p>
<p><img src="https://i.stack.imgur.com/7BVxa.png" alt="Google Maps"></p>
<p>I don't need the cool bold highlighting, but I do need the search to return similar results in a similar way. I've tried creating a mapping with a single location field consisting of the entire location name (e.g., "Round Rock, TX, United States"), and I've also tried having five separate fields, one for each piece of a location. I've tried keyword and prefix queries and edgeNGram analyzers, but I have been unsuccessful in finding the correct configuration to get this working properly.</p>
<p>What kinds of analyzers (both index and search) should I be looking at to accomplish my goals? This search doesn't have to be as polished as Google's, but I'd like it to be at least similar.</p>
<p>I do want to support partial-name matches, which is why I've been fiddling with edgeNGram. For example, a search of "round r" should match Round Rock, TX, United States. Also, I would prefer that results whose populated place (city) names begin with the exact search term be ranked higher than other results. For example, a search of "round ro" should match Round Rock, TX, United States before Round, Some Province, RO (Romania). I hope I've made this clear enough.</p>
<p>Here is my current index configuration (this is an anonymous type in C# that is later serialized to JSON and passed to the ElasticSearch API):</p>
<pre><code>settings = new
{
    index = new
    {
        number_of_shards = 1,
        number_of_replicas = 0,
        refresh_interval = -1,
        analysis = new
        {
            analyzer = new
            {
                edgengram_index_analyzer = new
                {
                    type = "custom",
                    tokenizer = "index_tokenizer",
                    filter = new[] { "lowercase", "asciifolding" },
                    char_filter = new[] { "no_commas_char_filter" },
                    stopwords = new object[0]
                },
                search_analyzer = new
                {
                    type = "custom",
                    tokenizer = "standard",
                    filter = new[] { "lowercase", "asciifolding" },
                    char_filter = new[] { "no_commas_char_filter" },
                    stopwords = new object[0]
                }
            },
            tokenizer = new
            {
                index_tokenizer = new { type = "edgeNGram", min_gram = 1, max_gram = 100 }
            },
            char_filter = new
            {
                no_commas_char_filter = new { type = "mapping", mappings = new[] { ",=&gt;" } }
            }
        }
    }
},
mappings = new
{
    location = new
    {
        _all = new { enabled = false },
        properties = new
        {
            populatedPlace = new { index_analyzer = "edgengram_index_analyzer", type = "string" },
            administrativeDivision = new { index_analyzer = "edgengram_index_analyzer", type = "string" },
            administrativeDivisionAbbreviation = new { index_analyzer = "edgengram_index_analyzer", type = "string" },
            country = new { index_analyzer = "edgengram_index_analyzer", type = "string" },
            countryCode = new { index_analyzer = "edgengram_index_analyzer", type = "string" },
            population = new { type = "long" }
        }
    }
}
</code></pre>
 
