Note that there are some explanatory texts on larger screens.

plurals
  1. POCan INSERT <table> (x) VALUES (@x) WHERE NOT EXISTS ( SELECT * FROM <table> WHERE x = @x) cause duplicates?
    primarykey
    data
    text
    <p>While browsing SO I found the <a href="https://stackoverflow.com/questions/2282513/which-is-faster-exists-before-or-after-the-insert/">following</a> Question/Discussion about the "best" approach for inserting records that don't exist yet. One of the statements that struck me was one of [Remus Rusanu] stating:</p> <blockquote> <p>Both variants are incorrect. You will insert pairs of duplicate @value1, @value2, guaranteed. </p> </blockquote> <p>Although I do agree about this for the syntax where the check is 'separated' from the INSERT (and no explicit locking/transaction mgmt is present); I'm having a hard time understanding why and when this would be true for the other proposed syntax that looks like this</p> <pre><code>INSERT INTO mytable (x) SELECT @x WHERE NOT EXISTS (SELECT * FROM mytable WHERE x = @x); </code></pre> <p>I do NOT want to start (another) what's best/fastest discussion, nor do I think the syntax can 'replace' a unique index/constraint (or PK) but I really need to know in what situations this construction could cause doubles as I've been using this syntax in the past and wonder if it is unsafe to continue doing so in the future.</p> <p>What I think that happens is that the INSERT &amp; SELECT are both in the same (implicit) transaction. The query will take an IX lock on the related record (key) and not release it until the entire query has finished, thus only AFTER the record has been inserted. This lock blocks all other connections from making the same INSERT as they can't get a lock themselves until after our insert has finished; only then they get the lock and will start verifying for themselves if the record already exists or not.</p> <p>As IMHO the best way to find out is by testing, I've been running the following code for a while on my laptop:</p> <p>Create table</p> <pre><code>CREATE TABLE t_test (x int NOT NULL PRIMARY KEY (x)) </code></pre> <p>Run below on many, many connections in parallel)</p> <pre><code>SET NOCOUNT ON WHILE 1 = 1 BEGIN INSERT t_test (x) SELECT x = DatePart(ms, CURRENT_TIMESTAMP) WHERE NOT EXISTS ( SELECT * FROM t_test old WHERE old.x = DatePart(ms, CURRENT_TIMESTAMP) ) END </code></pre> <p>So far the only things to note are:</p> <ul> <li>No errors encountered (yet)</li> <li>CPU is running quite hot =)</li> <li>table held 300 records quickly (due to 3ms 'precision' of datetime) after that no actual inserts are happening any more, as expected.</li> </ul> <p>UPDATE:</p> <p>Turns out my example above is not doing what I intended it to do. Instead of multiple connections trying to insert the same record simultaneously I simply had it not-inserting already existing records after the first second. As it probably took about a second to copy-paste &amp; execute the query on the next connection there was never a danger of duplicates. I'll be wearing my donkey-ears for the remainder of the day...</p> <p>Anyway, I've adapted the test to be more in line of the matter at hand (using the same table)</p> <pre><code>SET NOCOUNT ON DECLARE @midnight datetime SELECT @midnight = Convert(datetime, Convert(varchar, CURRENT_TIMESTAMP, 106), 106) WHILE 1 = 1 BEGIN INSERT t_test (x) SELECT x = DateDiff(ms, @midnight, CURRENT_TIMESTAMP) WHERE NOT EXISTS ( SELECT * FROM t_test old WHERE old.x = DateDiff(ms, @midnight, CURRENT_TIMESTAMP)) END </code></pre> <p>And lo &amp; behold, the output window now holds plenty of errors along the lines of</p> <blockquote> <p>Msg 2627, Level 14, State 1, Line 8 Violation of PRIMARY KEY constraint 'PK__t_test__3BD019E521C3B7EE'. Cannot insert >duplicate key in object 'dbo.t_test'. The duplicate key value is (57581873).</p> </blockquote> <p>FYI: As pointed out by Andomar, adding a HOLDLOCK and/or SERIALIZABLE hint indeed 'solves' the problem but then turns out to be causing lots of deadlocks... which isn't great but not unexpected either when I think it through.</p> <p>Guess I have quite a bit of code review to do... </p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload