Note that there are some explanatory texts on larger screens.

plurals
  1. POparallelizable hash function for data integrity check
    primarykey
    data
    text
    <p>I need to check the data integrity in the following situation: the data is writen to storage in chunks of different size (for each chunk we know its offset in the final file). However, the chunks come in arbitrary order and in multiple threads. They are read back from the storage in a completely different order (and the chunks have different size). </p> <p>What I currently have in mind is the following:</p> <pre><code> #define MODEST_PRIME 1021 unsigned char checkbuf[MODEST_PRIME]; void check_function(unsigned char *chunk, size_t offset, size_t length, unsigned char *result) { size_t i; for(i=0; i&lt;length; i++) result[(i+offset)%MODEST_PRIME]^=chunk[i]; } </code></pre> <p>This seems to offer protection from changing of any single byte and (to some extent) against swapping of the chunks (it is unlikely that the distance between the swapping blocks will be divisible by the large prime). The results of this function for different chunks can be just xored together, so it is completely parallelizable.</p> <p>However, this function looks highly unsofisticated in comparison to md5 sum, or any other modern hash function. But as far as I understand the computation of md5 sum or sha-1 sum can not be done in arbitrary order. </p> <p>Well, the question is, do we have any better solution that </p> <ol> <li>Can be computed in an arbitrary order if we know the size and offset of the chunks (like the simple algorithm I've outlined above).</li> <li>Can offer a data integrity check at least comparable to that of md5 sum.</li> </ol>
    singulars
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload