Note that there are some explanatory texts on larger screens.

plurals
  1. POPostgresql OR conditions with an empty subquery
    text
    copied!<p>How can I optimize a query whose WHERE conditions include a check for user_id = X OR user_id IN (some subquery that might return no results)</p> <p>In my example below, queries 1 and 2 are both extremely fast (&lt; 1 ms), but query 3, which is simply an OR of the conditions in queries 1 and 2, is much slower (50 ms)</p> <p>Can somebody please explain why query 3 is so slow, and in general what types of query optimization strategies should I be pursuing to avoid this problem? I realize the subquery in my example could easily be eliminated, but in real life sometimes subqueries seem like the least complicated way to get the data I want.</p> <p>relevant code and data:</p> <p>posts data <a href="https://dl.dropbox.com/u/4597000/StackOverflow/sanitized_posts.csv" rel="nofollow">https://dl.dropbox.com/u/4597000/StackOverflow/sanitized_posts.csv</a></p> <p>users data <a href="https://dl.dropbox.com/u/4597000/StackOverflow/sanitized_users.csv" rel="nofollow">https://dl.dropbox.com/u/4597000/StackOverflow/sanitized_users.csv</a></p> <pre><code># from the shell: # &gt; createdb test CREATE TABLE posts ( id integer PRIMARY KEY NOT NULL, created_by_id integer NOT NULL, created_at integer NOT NULL ); CREATE INDEX index_posts ON posts (created_by_id, created_at); CREATE INDEX index_posts_2 ON posts (created_at); CREATE TABLE users ( id integer PRIMARY KEY NOT NULL, login varchar(50) NOT NULL ); CREATE INDEX index_users ON users (login); COPY posts FROM '/path/to/sanitized_posts.csv' DELIMITERS ',' CSV; COPY users FROM '/path/to/sanitized_users.csv' DELIMITERS ',' CSV; -- queries: -- query 1, fast: EXPLAIN ANALYZE SELECT * FROM posts WHERE created_by_id = 123 LIMIT 100; -- query 2, fast: EXPLAIN ANALYZE SELECT * FROM posts WHERE created_by_id IN (SELECT id FROM users WHERE login = 'nobodyhasthislogin') LIMIT 100; -- query 3, slow: EXPLAIN ANALYZE SELECT * FROM posts WHERE created_by_id = 123 OR created_by_id IN (SELECT id FROM users WHERE login = 'nobodyhasthislogin') LIMIT 100; </code></pre>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload