Note that there are some explanatory texts on larger screens.

plurals
  1. POComparing all individual records from two VERY similar tables with lots of data
    primarykey
    data
    text
    <p>I am developing a tool for enhancing part of the regression testing process.</p> <p><strong>What I need to achieve:</strong></p> <p>I have two tables both with about 7 thousand records. One is a snapshot from production, the other one represents the exactly same records that will be implemented after the update. </p> <p>Each record contains 50-200 fields of varying length. Two (field_1 and field_2) of these fields serve as identifiers to distinguish the records. Field 1 is used ti pair a record from production to a record from update, field 2 is used to identify the form of the message.</p> <p>For record from production there is always one and only one record with equal field_1 and field_2 values.</p> <p>I would like to write either a query, or code that will ( in few seconds at most ) return an array in the following or similar form:</p> <pre><code>Array ( [0] =&gt; Array ( [Production] =&gt; Array ( [id] =&gt; 83 [field1] =&gt; value1 [field2] =&gt; value2 [field3] =&gt; differing_val [field4] =&gt; value3 ..... ) [Update] =&gt; Array ( [id] =&gt; 1 [field1] =&gt; value1 [field2] =&gt; value2 [field3] =&gt; some_other_different_val [field4] =&gt; value3 ..... ) ) ) </code></pre> <p><strong>The problem:</strong></p> <ol> <li>I tried to take all records and build this dynamically ( execution didn't even finish after 10 minutes ), with some limit and pagination, it was better</li> <li>So I wrote this sql statement, that would do part of the job for me, but even with limit for some values it takes 5-7 minutes to finish (and it only does part of the job for me)</li> </ol> <p>Here is my sql query:</p> <pre><code>SELECT production_records.* FROM production_records WHERE production_records.token_2 = ( SELECT update_records.token_2 FROM update_records WHERE update_records.token_1 = production_records.token_1 AND update_records.token_2 = production_records.token_2 AND update_records.token_130 &lt;&gt; production_records.token_130 ) ORDER BY production_records.token_1 DESC LIMIT 6 </code></pre> <p>Here I know there are 6 differences, therefore the limit and still it takes 406 seconds.</p> <p><strong>Question:</strong></p> <p>Do you see anything, that could be done with the data, or sql to make it run faster? I'm not that good with sql yet, so I guess there lies the problem.</p> <p>I have full control over the data, so I can change anything, didn't come up yet with a better than O(n^2) solution.</p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload