Optimizing a MySQL query with a large IN() clause or join on derived table
<p>Let's say I need to query the associates of a corporation. I have a table, "transactions", which contains data on every transaction made.</p>
<pre><code>CREATE TABLE `transactions` (
  `transactionID` int(11) unsigned NOT NULL,
  `orderID` int(11) unsigned NOT NULL,
  `customerID` int(11) unsigned NOT NULL,
  `employeeID` int(11) unsigned NOT NULL,
  `corporationID` int(11) unsigned NOT NULL,
  PRIMARY KEY (`transactionID`),
  KEY `orderID` (`orderID`),
  KEY `customerID` (`customerID`),
  KEY `employeeID` (`employeeID`),
  KEY `corporationID` (`corporationID`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
</code></pre>
<p>It's fairly straightforward to query this table for associates, but there's a twist: a transaction record is registered once per employee, so there may be multiple records for one corporation per order.</p>
<p>For example, if employees A and B from corporation 1 were both involved in selling a vacuum cleaner to corporation 2, there would be two records in the "transactions" table: one for each employee, and both for corporation 1. This must not affect the results, though. A trade from corporation 1, regardless of how many of its employees were involved, must be treated as one.</p>
<p>Easy, I thought. I'll just make a join on a derived table, like so:</p>
<pre><code>SELECT corporationID
FROM transactions
JOIN (SELECT DISTINCT orderID
      FROM transactions
      WHERE corporationID = 1) AS foo USING (orderID)
</code></pre>
<p>The query returns a list of corporations that have been involved in trades with corporation 1. That's exactly what I need, but it's very slow because MySQL can't use the corporationID index to determine the derived table. I understand that this is the case for all subqueries/derived tables in MySQL.</p>
<p>I've also tried querying the collection of orderIDs separately and passing them in a ridiculously large IN() clause (typically 100,000+ IDs), but as it turns out MySQL has issues using indices on ridiculously large IN() clauses as well, so the query time does not improve.</p>
<p>Are there any other options available, or have I exhausted them both?</p>
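To make the data model concrete, here is a minimal sketch that replays the vacuum-cleaner example against an in-memory database. It uses Python's sqlite3 module as a stand-in for MySQL, so it only illustrates which rows the derived-table query returns and says nothing about MySQL's index usage or speed. The sample rows, the ID values, and the helper name `associates` are all assumptions made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL. Only the result set is comparable,
# not the query plan or the performance described in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE transactions (
        transactionID INTEGER PRIMARY KEY,
        orderID       INTEGER NOT NULL,
        customerID    INTEGER NOT NULL,
        employeeID    INTEGER NOT NULL,
        corporationID INTEGER NOT NULL
    )
    """
)

# Hypothetical sample data: order 100 involves employees 11 and 12
# (A and B) of corporation 1 plus one employee of corporation 2.
# Order 200 is an unrelated trade between corporations 3 and 4.
rows = [
    (1, 100, 500, 11, 1),
    (2, 100, 500, 12, 1),
    (3, 100, 500, 21, 2),
    (4, 200, 501, 31, 3),
    (5, 200, 501, 41, 4),
]
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?, ?)", rows)


def associates(connection, corporation_id):
    """Run the question's derived-table join for one corporation (hypothetical helper)."""
    cursor = connection.execute(
        """
        SELECT corporationID
        FROM transactions
        JOIN (SELECT DISTINCT orderID
              FROM transactions
              WHERE corporationID = ?) AS foo USING (orderID)
        """,
        (corporation_id,),
    )
    # Sort because the query has no ORDER BY, so row order is unspecified.
    return sorted(row[0] for row in cursor)


print(associates(conn, 1))  # [1, 1, 2]
```

In this toy data set the DISTINCT inside the derived table deduplicates orders only: the outer table still contributes one row per employee, so corporation 1 shows up twice alongside corporation 2.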