Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p>This is a largely simplified version of a function I use in an app that built around 3 years ago. Adapted to the question at hand.</p> <ul> <li><p>Finds locations in the perimeter of a point using a <em>box</em>. One could do this with a circle to get more accurate results, but this is only meant to be an approximation to begin with.</p></li> <li><p>Ignores the fact that the world is not flat. My application was only meant for a local region, a few 100 kilometers across. And the search perimeter only spans a few kilometers across. Making the world flat is good enough for the purpose. (Todo: A better approximation for the ratio lat/lon depending on the geolocation might help.)</p></li> <li><p>Operates with geocodes like you get from Google maps.</p></li> <li><p>Works with standard PostgreSQL <strong>without extension</strong> (no PostGis required), tested on PostgreSQL 9.1 and 9.2.</p></li> </ul> <p>Without index, one would have to calculate the distance for every row in the base table and filter the closest ones. Extremely expensive with big tables.</p> <p><strong>Edit:</strong><br> I rechecked and the current implementation allows a GisT index on points (Postgres 9.1 or later). Simplified the code accordingly.</p> <p><strike>The <strong>major trick</strong> is to use a functional <strong>GiST index of <em>boxes</em></strong>, even though the column is just a point. This makes it possible to use the existing <a href="http://www.postgresql.org/docs/current/interactive/gist.html" rel="nofollow noreferrer">GiST implementation</a>.</strike></p> <p>With such a (very fast) search, we can get all locations inside a box. The remaining problem: we know the number of rows, but we do not know the size of the box they are in. That's like knowing part of the answer, but not the question.</p> <p>I use a similar <strong>reverse-lookup</strong> approach to the one described in more detail in <a href="https://dba.stackexchange.com/questions/18300/can-spatial-index-help-a-range-order-by-limit-query/22500#22500">this related answer on dba.SE</a>. (Only, I am not using partial indexes here - might actually work, too).</p> <p>Iterate through an array of pre-defined search-steps, from very small up to "just big enough to hold at least enough locations". Means we have to run a couple of (very fast) queries to get to the size for the search box.</p> <p>Then search the base table with this box and calculate actual distance for only the few rows returned from the index. There will usually be some surplus since we found the box holding <em>at least</em> enough locations. By taking the closest ones, we effectively round the corners of the box. You could force this effect by making the box a notch bigger (multiply <code>radius</code> in the function by sqrt(2) to get completely <em>accurate</em> results, but I wouldn't go all out, since this is approximating to begin with).</p> <p>This would be even faster and simpler with an <a href="http://www.postgresql.org/docs/current/interactive/spgist.html" rel="nofollow noreferrer">SP GiST</a> index, available in the latest version of PostgreSQL. But I don't know if that's possible yet. We'd need an actual implementation for the data type and I didn't have the time to dive into it. If you find a way, promise to report back!</p> <p>Given this simplified table with some example values (<code>adr</code> .. address):</p> <pre><code>CREATE TABLE adr(adr_id int, adr text, geocode point); INSERT INTO adr (adr_id, adr, geocode) VALUES (1, 'adr1', '(48.20117,16.294)'), (2, 'adr2', '(48.19834,16.302)'), (3, 'adr3', '(48.19755,16.299)'), (4, 'adr4', '(48.19727,16.303)'), (5, 'adr5', '(48.19796,16.304)'), (6, 'adr6', '(48.19791,16.302)'), (7, 'adr7', '(48.19813,16.304)'), (8, 'adr8', '(48.19735,16.299)'), (9, 'adr9', '(48.19746,16.297)'); </code></pre> <p>The index looks like this:</p> <pre><code>CREATE INDEX adr_geocode_gist_idx ON adr USING gist (geocode); </code></pre> <p><a href="http://www.sqlfiddle.com/#!1/08d06/1" rel="nofollow noreferrer"><strong>-> SQLfiddle</strong></a></p> <p>You'll have to adjust the home area, the steps and the scaling factor to your needs. As long as you search in boxes of a few kilometers around a point, a flat earth is a good enough approximation.</p> <p>You need to understand plpgsql well to work with this. I feel I have done quite enough here.</p> <pre><code>CREATE OR REPLACE FUNCTION f_find_around(_lat double precision, _lon double precision, _limit bigint = 50) RETURNS TABLE(adr_id int, adr text, distance int) AS $func$ DECLARE _homearea CONSTANT box := '(49.05,17.15),(46.35,9.45)'::box; -- box around legal area -- 100m = 0.0008892 250m, 340m, 450m, 700m,1000m,1500m,2000m,3000m,4500m,7000m _steps CONSTANT real[] := '{0.0022,0.003,0.004,0.006,0.009,0.013,0.018,0.027,0.040,0.062}'; -- find optimum _steps by experimenting geo2m CONSTANT integer := 73500; -- ratio geocode(lon) to meter (found by trial &amp; error with google maps) lat2lon CONSTANT real := 1.53; -- ratio lon/lat (lat is worth more; found by trial &amp; error with google maps in (Vienna) _radius real; -- final search radius _area box; -- box to search in _count bigint := 0; -- count rows _point point := point($1,$2); -- center of search _scalepoint point := point($1 * lat2lon, $2); -- lat scaled to adjust BEGIN -- Optimize _radius IF (_point &lt;@ _homearea) THEN FOREACH _radius IN ARRAY _steps LOOP SELECT INTO _count count(*) FROM adr a WHERE a.geocode &lt;@ box(point($1 - _radius, $2 - _radius * lat2lon) , point($1 + _radius, $2 + _radius * lat2lon)); EXIT WHEN _count &gt;= _limit; END LOOP; END IF; IF _count = 0 THEN -- nothing found or not in legal area EXIT; ELSE IF _radius IS NULL THEN _radius := _steps[array_upper(_steps,1)]; -- max. _radius END IF; _area := box(point($1 - _radius, $2 - _radius * lat2lon) , point($1 + _radius, $2 + _radius * lat2lon)); END IF; RETURN QUERY SELECT a.adr_id ,a.adr ,((point (a.geocode[0] * lat2lon, a.geocode[1]) &lt;-&gt; _scalepoint) * geo2m)::int4 AS distance FROM adr a WHERE a.geocode &lt;@ _area ORDER BY distance, a.adr, a.adr_id LIMIT _limit; END $func$ LANGUAGE plpgsql; </code></pre> <p>Call:</p> <pre><code>SELECT * FROM f_find_around (48.2, 16.3, 20); </code></pre> <p>Returns a list of <code>$3</code> locations, if there are enough in the defined maximum search area.<br> Sorted by actual distance.</p> <h3>Further improvements</h3> <p>Build a function like:</p> <pre><code>CREATE OR REPLACE FUNCTION f_geo2m(double precision, double precision) RETURNS point AS $BODY$ SELECT point($1 * 111200, $2 * 111400 * cos(radians($1))); $BODY$ LANGUAGE sql IMMUTABLE; COMMENT ON FUNCTION f_geo2m(double precision, double precision) IS 'Project geocode to approximate metric coordinates. SELECT f_geo2m(48.20872, 16.37263) --'; </code></pre> <p>The (literally) global constants <code>111200</code> and <code>111400</code> are optimized for my area (Austria) from the <a href="http://en.wikipedia.org/wiki/Longitude#Length_of_a_degree_of_longitude" rel="nofollow noreferrer">Length of a degree of longitude</a> and <a href="http://en.wikipedia.org/wiki/Latitude#The_length_of_a_degree_of_latitude" rel="nofollow noreferrer">The length of a degree of latitude</a>, but basically just work all over the world.</p> <p>Use it to add a scaled geocode to the base table, ideally a <em>generated column</em> like outlined in this answer:<br> <a href="https://stackoverflow.com/questions/15169410/how-do-you-do-date-math-that-ignores-the-year/15179731#15179731">How do you do date math that ignores the year?</a><br> Refer to <strong>3. Black magic version</strong> where I walk you through the process.<br> Then you can simplify the function some more: Scale input values once and remove redundant calculations.</p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload