Note that there are some explanatory texts on larger screens.

plurals
  1. POPoorly performing Mysql subquery -- can I turn it into a Join?
    primarykey
    data
    text
    <p>I have a subquery problem that is causing poor performance... I was thinking that the subquery could be re-written using a join, but I'm having a hard time wrapping my head around it.</p> <p>The gist of the query is this: For a given combination of EmailAddress and Product, I need to get a list of the IDs that are NOT the latest.... these orders are going to be marked as 'obsolete' in the table which would leave only that latest order for a a given combination of EmailAddress and Product... (does that make sense?)</p> <p><strong>Table Definition</strong></p> <pre><code>CREATE TABLE `sandbox`.`OrderHistoryTable` ( `id` INT( 11 ) NOT NULL AUTO_INCREMENT , `EmailAddress` VARCHAR( 100 ) NOT NULL , `Product` VARCHAR( 100 ) NOT NULL , `OrderDate` DATE NOT NULL , `rowlastupdated` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP , PRIMARY KEY ( `id` ) , KEY `EmailAddress` ( `EmailAddress` ) , KEY `Product` ( `Product` ) , KEY `OrderDate` ( `OrderDate` ) ) ENGINE = MYISAM DEFAULT CHARSET = latin1; </code></pre> <p><strong>Query</strong></p> <pre><code>SELECT id FROM OrderHistoryTable AS EMP1 WHERE OrderDate not in ( Select max(OrderDate) FROM OrderHistoryTable AS EMP2 WHERE EMP1.EmailAddress = EMP2.EmailAddress AND EMP1.Product IN ('ProductA','ProductB','ProductC','ProductD') AND EMP2.Product IN ('ProductA','ProductB','ProductC','ProductD') ) </code></pre> <p><strong>Explanation of duplicate 'IN' statements</strong></p> <pre><code>13 bob@aol.com ProductA 2010-10-01 15 bob@aol.com ProductB 2010-20-02 46 bob@aol.com ProductD 2010-20-03 57 bob@aol.com ProductC 2010-20-04 158 bob@aol.com ProductE 2010-20-05 206 bob@aol.com ProductB 2010-20-06 501 bob@aol.com ProductZ 2010-20-07 </code></pre> <p>The results of my query should be | 13 | | 15 | | 46 | | 57 |</p> <p>This is because, in the orders listed, those 4 have been 'superceded' by a newer order for a product in the same category. This 'category' contains prodcts A, B, C &amp; D.</p> <p>Order ids 158 and 501 show no other orders in their respective categories based on the query.</p> <p><strong>Final Query based off of accepted answer below:</strong> I ended up using the following query with no subquery and got about 3X performance (30 sec down from 90 sec). I also now have a separate 'groups' table where I can enumerate the group members instead of spelling them out in the query itself...</p> <pre><code>SELECT DISTINCT id, EmailAddress FROM ( SELECT a.id, a.EmailAddress, a.OrderDate FROM OrderHistoryTable a INNER JOIN OrderHistoryTable b ON a.EmailAddress = b.EmailAddress INNER JOIN groups g1 ON a.Product = g1.Product INNER JOIN groups g2 ON b.Product = g2.Product WHERE g1.family = 'ProductGroupX' AND g2.family = 'ProductGroupX' GROUP BY a.id, a.OrderDate, b.OrderDate HAVING a.OrderDate &lt; MAX(b.OrderDate) ) dtX </code></pre>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload