Note that there are some explanatory texts on larger screens.

plurals
  1. POWhy does my SQL Server UPSERT code sometimes not block?
    text
    copied!<p>I have a table <code>ImportSourceMetadata</code> which I use to control an import batch process. It contains a PK column <code>SourceId</code> and a data column <code>LastCheckpoint</code>. The import batch process reads the <code>LastCheckpoint</code> for a given <code>SourceId</code>, performs some logic (on other tables), then updates the <code>LastCheckpoint</code> for that <code>SourceId</code> <em>or inserts it if it doesn't exist yet</em>.</p> <p>Multiple instances of the process run at the same time, usually with disjunct <code>SourceIds</code>, and I need high parallelity for those cases. However, it can happen that two processes are started for the same <code>SourceId</code>; in that case, I need the instances to block each other.</p> <p>Therefore, my code looks as follows:</p> <pre><code>BEGIN TRAN SET TRANSACTION ISOLATION LEVEL READ COMMITTED SELECT LastCheckpoint FROM ImportSourceMetadata WITH (UPDLOCK) WHERE SourceId = 'Source' -- Perform some processing -- UPSERT: if the SELECT above yielded no value, then INSERT INTO ImportSourceMetadata(SourceId, LastCheckpoint) VALUES ('Source', '2013-12-21') -- otherwise, we'd do this: UPDATE ImportSourceMetadata SET LastCheckpoint = '2013-12-21' WHERE SourceId = 'Source' COMMIT TRAN </code></pre> <p>I'm using the transaction to achieve atomicity, but I can only use READ COMMITTED isolation level (because of the parallelity requirements in the "Perform some processing" block). Therefore (and to avoid deadlocks), I'm including an UPDLOCK hint with the SELECT statement to achieve a "critical section" parameterized on the <code>SourceId</code>value.</p> <p>Now, this works quite well most of the time, but I've managed to trigger primary key violation errors with the INSERT statement when starting a lot of parallel processes for the same <code>SourceId</code>with an empty database. I cannot reliably reproduce this, however, and I don't understand <em>why</em> it doesn't work.</p> <p>I've found hints on the internet (e.g., <a href="https://stackoverflow.com/questions/7843733/confused-about-updlock-holdlock">here</a> and <a href="http://samsaffron.com/blog/archive/2007/04/04/14.aspx" rel="nofollow noreferrer">here, in a comment</a>) that I need to specify <code>WITH (UPDLOCK,HOLDLOCK)</code> (resp. <code>WITH (UPDLOCK,SERIALIZABLE)</code>) rather than just taking an UPDLOCK on the SELECT, but I don't really understand why that is. MSDN docs <a href="http://technet.microsoft.com/en-us/library/ms187373.aspx" rel="nofollow noreferrer">say</a>,</p> <blockquote> <p>UPDLOCK<br> Specifies that update locks are to be taken and held until the transaction completes.</p> </blockquote> <p>An update lock that is taken and held until the transaction completes should be enough to block a subsequent INSERT, and in fact, when I try it out in SQL Server Management Studio, it does indeed block my insert. However, in some rare cases, it seems to suddenly not work any more.</p> <p>So, why exactly is it that UPDLOCK is not enough, and why is it enough in 99% of my test runs (and when simulating it in SQL Server Management Studio)?</p> <p><strong>Update:</strong> I've now found I can reproduce the non-blocking behavior reliably by executing the code above in two different windows of SQL Server Management Studio simultaneously up to just before the INSERT, but <em>only</em> the first time after creating the database. After that (even though I deleted the contents of the <code>ImportSourceMetadata</code> table), the <code>SELECT WITH (UPDLOCK)</code> will indeed block and the code no longer fails. Indeed, in <code>sys.dm_tran_locks</code>, I can see a U-lock taken even though the row does not exist on subsequent test runs, but not on the first run after creating the table.</p> <p>This is a complete sample to show the difference in locks between a "newly created table" and an "old table":</p> <pre><code>DROP TABLE ImportSourceMetadata CREATE TABLE ImportSourceMetadata(SourceId nvarchar(50) PRIMARY KEY, LastCheckpoint datetime) BEGIN TRAN SET TRANSACTION ISOLATION LEVEL READ COMMITTED SELECT LastCheckpoint FROM ImportSourceMetadata WITH (UPDLOCK) WHERE SourceId='Source' SELECT * FROM sys.dm_tran_locks l JOIN sys.partitions p ON l.resource_associated_entity_id = p.hobt_id JOIN sys.objects o ON p.object_id = o.object_id INSERT INTO ImportSourceMetadata VALUES('Source', '2013-12-21') ROLLBACK TRAN BEGIN TRAN SET TRANSACTION ISOLATION LEVEL READ COMMITTED SELECT LastCheckpoint FROM ImportSourceMetadata WITH (UPDLOCK) WHERE SourceId='Source' SELECT * FROM sys.dm_tran_locks l JOIN sys.partitions p ON l.resource_associated_entity_id = p.hobt_id JOIN sys.objects o ON p.object_id = o.object_id ROLLBACK TRAN </code></pre> <p>On my system (with SQL Server 2012), the first query shows no locks on <code>ImportSourceMetadata</code>, but the second query shows a <code>KEY</code> lock on <code>ImportSourceMetadata</code>.</p> <p>In other words, <code>HOLDLOCK</code> is indeed required, but only if the table was freshly created. Why's that?</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload