<p><code>checksum_agg</code> appears to simply add the results of <code>binary_checksum</code> together for all rows. Although each row has changed, the sum of the two checksums has not (i.e. 17+32 = 16+33). Checksums are not really the norm for detecting updates, but the recommendations I can come up with are as follows:</p>
<ol>
<li>Instead of using <code>checksum_agg</code>, concatenate the per-row checksums into a delimited string and compare the strings, along the lines of <code>SELECT CAST(binary_checksum(*) AS VARCHAR) + ',' FROM MyTable FOR XML PATH('')</code>. The cast is needed because adding <code>','</code> directly to the integer checksum fails with a conversion error. This is a much longer value to store and compare, but a false match is much less likely.</li>
<li>Instead of using the built-in checksum routine, use HASHBYTES to calculate MD5 hashes in 8000-byte blocks, and xor the results together. This gives a much more resilient checksum, although still not bullet-proof (false matches are still possible, but very much less likely). The HASHBYTES demo code that I wrote is pasted below.</li>
<li>The last option, and absolute last resort, is to store the whole table in XML format and compare that. This is really the only way to be absolutely certain of no false matches, but it is not scalable and involves storing and comparing large amounts of data.</li>
</ol>
<p>Every approach, including the one you started with, has pros and cons, trading data size and processing requirements against accuracy. Choose the option that matches the level of accuracy you require. The only way to get 100% accuracy is to store all of the table data.</p>
<p>Alternatively, you can add a date_modified field to each table, which is set to GetDate() by after insert and after update triggers. You can then run <code>SELECT COUNT(*) FROM #test WHERE date_modified &gt; @date_last_checked</code>. This is a more common way of checking for updates. The downside is that deletions cannot be tracked.
</p>
<p>Another approach is to create a modified table, with table_name (VARCHAR) and is_modified (BIT) fields, containing one row for each table you wish to track. Insert, update and delete triggers set the flag against the relevant table to True. When your schedule runs, you read and reset the is_modified flag in a single statement, along the lines of <code>UPDATE tblModified SET @is_modified = is_modified, is_modified = 0 WHERE table_name = 'MyTable'</code>. This has to be an <code>UPDATE</code>, because a <code>SELECT</code> cannot both read and reset the flag.</p>
<p>The following script generates three result sets, each corresponding to an item in the numbered list earlier in this response. I have commented which output corresponds to which option, just before each SELECT statement. To see how the output was derived, you can work backwards through the code.</p>
<pre><code>-- Create the test table and populate it
CREATE TABLE #Test
(
    f1 INT,
    f2 INT
)
INSERT INTO #Test VALUES(1, 1)
INSERT INTO #Test VALUES(2, 0)
INSERT INTO #Test VALUES(2, 1)

/******************* OPTION 1 *******************/
SELECT CAST(binary_checksum(*) AS VARCHAR) + ',' FROM #Test FOR XML PATH('')

-- Declaration: input and output MD5 checksums (@in and @out),
-- input string (@input), and counter (@i)
DECLARE @in VARBINARY(16), @out VARBINARY(16), @input VARCHAR(MAX), @i INT

-- Initialise @input as the XML dump of the table
-- Use this as your comparison string if you choose not to use the MD5 checksum
SET @input = (SELECT * FROM #Test FOR XML RAW)

/******************* OPTION 3 *******************/
SELECT @input

-- Initialise counter and output MD5
SET @i = 1
SET @out = 0x00000000000000000000000000000000

WHILE @i &lt;= LEN(@input)
BEGIN
    -- Calculate MD5 for this block. SUBSTRING returns fewer than 8000
    -- characters when fewer remain, so the last block needs no special case
    -- (the original length calculation dropped the final character).
    SET @in = HASHBYTES('MD5', SUBSTRING(@input, @i, 8000))

    -- xor the result into the output, 4 bytes at a time
    SET @out = CAST(CAST(SUBSTRING(@in, 1, 4) AS INT) ^ CAST(SUBSTRING(@out, 1, 4) AS INT) AS VARBINARY(4))
             + CAST(CAST(SUBSTRING(@in, 5, 4) AS INT) ^ CAST(SUBSTRING(@out, 5, 4) AS INT) AS VARBINARY(4))
             + CAST(CAST(SUBSTRING(@in, 9, 4) AS INT) ^ CAST(SUBSTRING(@out, 9, 4) AS INT) AS VARBINARY(4))
             + CAST(CAST(SUBSTRING(@in, 13, 4) AS INT) ^ CAST(SUBSTRING(@out, 13, 4) AS INT) AS VARBINARY(4))

    SET @i = @i + 8000
END

/******************* OPTION 2 *******************/
SELECT @out
</code></pre>
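<p>The flag-table approach described above could be sketched roughly as follows. This is only a minimal illustration under assumptions: <code>MyTable</code> stands in for a permanent table you want to track (triggers cannot be created on temporary tables such as <code>#Test</code>), and the trigger name <code>trg_MyTable_Modified</code> is made up for the example.</p>
<pre><code>-- Tracking table: one row per tracked table (column sizes are assumptions)
CREATE TABLE tblModified
(
    table_name  VARCHAR(128) NOT NULL PRIMARY KEY,
    is_modified BIT NOT NULL DEFAULT 0
)
INSERT INTO tblModified (table_name) VALUES ('MyTable')
GO

-- Hypothetical trigger: flag MyTable as modified on any insert, update or delete
CREATE TRIGGER trg_MyTable_Modified ON MyTable
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON

    -- Triggers also fire for statements that touch zero rows; ignore those
    IF NOT EXISTS (SELECT 1 FROM inserted) AND NOT EXISTS (SELECT 1 FROM deleted)
        RETURN

    UPDATE tblModified SET is_modified = 1 WHERE table_name = 'MyTable'
END
GO

-- Scheduled check: capture the pre-update flag and reset it in one statement,
-- so a change made between a separate read and reset cannot be lost
DECLARE @is_modified BIT

UPDATE tblModified
SET @is_modified = is_modified, is_modified = 0
WHERE table_name = 'MyTable'

SELECT @is_modified AS was_modified
</code></pre>
<p>Because the read and the reset happen in a single <code>UPDATE</code>, no explicit transaction is needed around the check. Unlike the date_modified approach, the delete trigger means deletions are picked up as well.</p>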