Note that there are some explanatory texts on larger screens.

plurals
  1. POCalculate distance between Zip Codes... AND users.
    primarykey
    data
    text
    <p><strong>This is more of a challenge question than something I urgently need, so don't spend all day on it guys.</strong></p> <p>I built a dating site (long gone) back in 2000 or so, and one of the challenges was calculating the distance between users so we could present your "matches" within an X mile radius. To just state the problem, given the following database schema (roughly):</p> <p>USER TABLE UserId UserName ZipCode</p> <p>ZIPCODE TABLE ZipCode Latitude Longitude</p> <p>With USER and ZIPCODE being joined on USER.ZipCode = ZIPCODE.ZipCode.</p> <p>What approach would you take to answer the following question: What other users live in Zip Codes that are within X miles of a given user's Zip Code.</p> <p>We used the <a href="http://www.census.gov/geo/www/gazetteer/places2k.html" rel="nofollow noreferrer">2000 census data</a>, which has tables for zip codes and their approximate lattitude and longitude.</p> <p>We also used the <a href="http://en.wikipedia.org/wiki/Haversine_formula" rel="nofollow noreferrer">Haversine Formula</a> to calculate distances between any two points on a sphere... pretty simple math really.</p> <p>The question, at least for us, being the 19 year old college students we were, really became how to efficiently calculate and/store distances from all members to all other members. One approach (the one we used) would be to import all the data and calculate the distance FROM every zip code TO every other zip code. Then you'd store and index the results. Something like:</p> <pre><code>SELECT User.UserId FROM ZipCode AS MyZipCode INNER JOIN ZipDistance ON MyZipCode.ZipCode = ZipDistance.MyZipCode INNER JOIN ZipCode AS TheirZipCode ON ZipDistance.OtherZipCode = TheirZipCode.ZipCode INNER JOIN User AS User ON TheirZipCode.ZipCode = User.ZipCode WHERE ( MyZipCode.ZipCode = 75044 ) AND ( ZipDistance.Distance &lt; 50 ) </code></pre> <p>The problem, of course, is that the ZipDistance table is going to have a LOT of rows in it. It isn't completely unworkable, but it is really big. Also it requires complete pre-work on the whole data set, which is also not unmanageable, but not necessarily desireable.</p> <p>Anyway, I was wondering what approach some of you gurus might take on something like this. Also, I think this is a common issue programmers have to tackle from time to time, especially if you consider problems that are just algorithmically similar. I'm interested in a thorough solution that includes at least HINTS on all the pieces to do this really quickly end efficiently. Thanks!</p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload