Note that there are some explanatory texts on larger screens.

plurals
  1. POHelp optimize this query with a huge IN statement
    primarykey
    data
    text
    <p>I have an insert that uses a condition checking for a NOT IN. There are about 230k rows in the NOT IN subquery. </p> <pre><code>INSERT INTO Validate.ItemError (ItemId, ErrorId, DateCreated) ( SELECT ItemId, 10, GetUTCDate() FROM Validate.Item INNER JOIN Refresh.Company ON Validate.Item.IMCompanyId = Refresh.Company.IMCompanyId WHERE Refresh.Company.CompanyId = 14 AND ( IMAccountId IS NULL OR NOT IMAccountId IN ( SELECT RA.IMAccountId FROM Refresh.Account RA INNER JOIN Refresh.BalancePool BP ON RA.BalancePoolId = BP.BalancePoolId WHERE BP.CompanyId = 14 ) ) ) </code></pre> <p>When I run this as-is it takes about 30+ minutes (yikes!). The number of values in the Validate.Item table could be anywhere from 150 rows to over 200k, so you can see how this might be a pain.</p> <p>There are indices on all the relevant fields in the tables, and none are excessively fragmented.</p> <p>My first thought was to do this in pieces, and throw it into a WHILE loop:</p> <pre><code>DECLARE @StartId int, @EndId int, @MaxId int SELECT @MaxId = MAX(AccountId) FROM Refresh.Account SET @StartId = 1 SET @EndId = 1000 WHILE (@StartId &lt; @MaxId) BEGIN INSERT INTO Validate.ItemError (ItemId, ErrorId, DateCreated) ( SELECT ItemId, 10, GetUTCDate() FROM Validate.Item INNER JOIN Refresh.Company ON Validate.Item.IMCompanyId = Refresh.Company.IMCompanyId WHERE Refresh.Company.CompanyId = 14 AND ( IMAccountId IS NULL OR NOT IMAccountId IN ( SELECT RA.IMAccountId FROM Refresh.Account RA INNER JOIN Refresh.BalancePool BP ON RA.BalancePoolId = BP.BalancePoolId WHERE BP.CompanyId = 14 AND RA.AccountId BETWEEN @StartId AND @EndId ) ) ) SET @StartId = @StartId + 1000 SET @EndId = @EndId + 1000 END </code></pre> <p>Doing it this way nets me a time of about a minute per loop; multiply that by 230 times and we have an even more ridiculous number.</p> <p>Please tell me you guys have a better idea how to optimize this. Without this one query, the entire process only takes 8 seconds; it's just the sheer size of the Refresh.Account table that throws everything into chaos.</p> <p>TIA!</p> <p>Valkyrie</p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload