Note that there are some explanatory texts on larger screens.

plurals
  1. POwhy is select much slower with left join+where
    text
    copied!<p>I am attempting to optimize a MySQL query that is taking a long time to process. Imagine we have two tables, a users table and a purchases table. Both tables have ~20,000 rows in them.</p> <pre><code>mysql&gt; SELECT NOW(),u.id FROM users u LEFT JOIN purchases p ON p.user_id = u.id WHERE p.website_id = 1234 ORDER BY u.total_paid DESC LIMIT 10; +---------------------+-------+ | NOW() | id | +---------------------+-------+ *snip* +---------------------+-------+ 10 rows in set (0.06 sec) </code></pre> <p>Not super fast but pretty snappy. If I change nothing other than change <code>u.id</code> to <code>u.*</code> it will slow down dramatically:</p> <pre><code>mysql&gt; SELECT NOW(),u.* FROM users u LEFT JOIN purchases p ON p.user_id = u.id WHERE p.website_id = 1234 ORDER BY u.total_paid DESC LIMIT 10; +---------------------+-------+ *snip* +---------------------+-------+ 10 rows in set (0.37 sec) </code></pre> <p>Before you say "Well, you should never use <code>select *</code>" consider that it slowly creeps up to that length of time the more fields you add, i.e. naming half of the fields to select will cause the query execute in ~0.20 seconds and no field on the users table is larger than a <code>varchar(255)</code>.</p> <p>However, if I take the ids from my relatively snappy query and I simply:</p> <pre><code>mysql&gt; SELECT * FROM users WHERE id IN (*snip*); +---------------------+-------+ *snip* +---------------------+-------+ 10 rows in set (0.01 sec) </code></pre> <p>So my two queries: <code>select u.id</code> plus <code>select u.* where id in</code> is faster than what I assume is a similar query. What the heck?</p> <p><strong>More Info:</strong> The users table has about 30 fields on it. Again, no field is larger than a <code>varchar(255)</code></p> <p><strong>More More Info:</strong> The EXPLAIN for both queries is this:</p> <pre><code>*************************** 1. row *************************** id: 1 select_type: SIMPLE table: p type: ref possible_keys: PRIMARY,user_id_index,website_id_index,website_user_id_index,website_created_index,website_type_created_index,website_type_index,purchase_user_id_type_index,user_id_website_id_index,website_id_user_id_index key: website_id_user_id_index key_len: 9 ref: const rows: 9976 Extra: Using where; Using index; Using temporary; Using filesort *************************** 2. row *************************** id: 1 select_type: SIMPLE table: u type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 8 ref: database.p.user_id rows: 1 Extra: </code></pre> <p><strong>Edit</strong> Could it possibly be that since it's using temporary/filesort it has to select * from users, not knowing which rows will end up in the final result set? So it may seem like a trivial amount of extra data but in reality it's the difference between selecting a large chunk of the table? If this is correct, any suggestions?</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload