Note that there are some explanatory texts on larger screens.

plurals
  1. POSQL join with a range of values (int ranges, date ranges, whatever)
    text
    copied!<p>I have two tables, the first is a big table (millions of rows), with the most interesting column being an integer I'll just call "key." I believe this solution would be identical for date or datetime ranges as well though.</p> <p>The second table is much smaller (thousands of rows) with a bunch of attributes that are interesting to me which are defined over a range of keys. It has the following structure:</p> <p>key_lower_bound : int key_upper_bound : int interesting_value1 : float interesting_value2 : int interesting_value3 : varchar(50) ...</p> <p>I want to look up all of the values in the first table and "join" them with the second table based on whether the key in the first table falls inside the interval [key_lower_bound, key_upper_bound).</p> <p>This is sort of like a sparse inner product or sparse dot product mathematically, but it's a little weird since there are these ranges involved in the second table. Still, if I were to write this up in code it would be an O(|first table| + |second table|) algorithm. I would keep a pointer into both (sorted) lists and walk through them each in order to determine if each key in the first table belonged in the range of the second table. The trick is that I am not iterating through the second list each time I examine a key in the first table because both lists are sorted.</p> <p>When I construct the most obivous SQL query (involving checking that key is > key_lower_bound and &lt; key_upper_bound) it takes WAY too long.</p> <p>There is some kind of quadratic behavior going on with that naive query because I think the query engine is doing each compare against each row in the second table, when in reality, if the second table is sorted by key_lower_bounds this shouldn't be necessary. So I'm getting a O(|first table| x |second table|) kind of behavior instead of the desired O(|first table| + |second table|) behavior.</p> <p>Is it possible to get a linear SQL query to do this?</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload