Note that there are some explanatory texts on larger screens.

plurals
  1. POMySQL grouping query optimization
    text
    copied!<p>I have three tables: categories, articles, and article_events, with the following structure</p> <pre><code>categories: id, name (100,000 rows) articles: id, category_id (6000 rows) article_events: id, article_id, status_id (20,000 rows) </code></pre> <p>The highest article_events.id for each article row describes the current status of each article.</p> <p>I'm returning a table of categories and how many articles are in them with a most-recent-event status_id of '1'.</p> <p>What I have so far works, but is fairly slow (10 seconds) with the size of my tables. Wondering if there's a way to make this faster. All the tables have proper indexes as far as I know.</p> <pre><code>SELECT c.id, c.name, SUM(CASE WHEN e.status_id = 1 THEN 1 ELSE 0 END) article_count FROM categories c LEFT JOIN articles a ON a.category_id = c.id LEFT JOIN ( SELECT article_id, MAX(id) event_id FROM article_events GROUP BY article_id ) most_recent ON most_recent.article_id = a.id LEFT JOIN article_events e ON most_recent.event_id = e.id GROUP BY c.id </code></pre> <p>Basically I have to join to the events table twice, since asking for the status_id along with the MAX(id) just returns the first status_id it finds, and not the one associated with the MAX(id) row.</p> <p>Any way to make this better? or do I just have to live with 10 seconds? Thanks!</p> <p>Edit:</p> <p>Here's my EXPLAIN for the query:</p> <pre><code>ID | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra --------------------------------------------------------------------------------------------------------------------------- 1 | PRIMARY | c | index | NULL | PRIMARY | 4 | NULL | 124044 | Using index; Using temporary; Using filesort 1 | PRIMARY | a | ref | category_id | category_id | 4 | c.id | 3 | 1 | PRIMARY | &lt;derived2&gt; | ALL | NULL | NULL | NULL | NULL | 6351 | 1 | PRIMARY | e | eq_ref | PRIMARY | PRIMARY | 4 | most_recent.event_id | 1 | 2 | DERIVED | article_events | ALL | NULL | NULL | NULL | NULL | 19743 | Using temporary; Using filesort </code></pre>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload