Random distribution of data
<p>How do I distribute a small amount of data in a random order throughout a much larger volume of data?</p>
<p>For example, I have several thousand lines of 'real' data, and I want to insert a dozen or two lines of control data in a random order throughout the 'real' data.</p>
<p>I am not asking how to use random number generators; I know how to generate random numbers. This is a statistical question: how do I ensure that the control data is inserted in a random order while still being fairly evenly scattered through the file?</p>
<p>If I rely on random numbers alone, there is a possibility (albeit a very small one) that all my control data, or at least clumps of it, will be inserted within a fairly narrow range of the 'real' data. What is the best way to stop this from happening?</p>
<p>To phrase it another way, I want to insert control data throughout my real data without there being a way for a third party to calculate which rows are control and which are real. <hr /> Update: I have made this a 'community wiki', so if anyone wants to edit my question so that it makes more sense, go right ahead. <hr /> Update: Let me try an example (I do not want to make this language or platform dependent, as it is a statistical question rather than a coding question).</p>
<ul><li>I have 3000 rows of 'real' data (this amount will change from run to run, depending on how much data the user has).</li>
<li>I have 20 rows of 'control' data (again, this will change depending on how many control rows the user wants to use, anything from zero upwards).</li></ul>
<p>I now want to insert these 20 'control' rows <em>roughly</em> once every 150 rows of 'real' data (3000/20 = 150). However, I do not want the spacing to be that exact, because I do not want the control rows to be identifiable simply by their position in the output data.</p>
<p>Therefore I do not mind <em>some</em> of the 'control' rows being clumped together, or <em>some</em> sections having very few or no 'control' rows at all, but in general I want the 'control' rows fairly evenly distributed throughout the data.</p>
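<p>One way to make the example above concrete is stratified placement: split the output into as many equal windows as there are control rows and pick one uniformly random slot inside each window. The sketch below is only an illustration in Python under that assumption, not a definitive answer; the function names <code>control_positions</code> and <code>interleave</code> are hypothetical. Note that it places exactly one control row per window, so anyone who knows the scheme knows that much, and clumps are limited to two adjacent control rows at a window boundary.</p>

```python
import random


def control_positions(n_real, n_control, rng=None):
    """Return one output index per control row, one per equal-sized window."""
    if rng is None:
        rng = random.Random()
    total = n_real + n_control
    positions = []
    for i in range(n_control):
        # Window i covers output indices [start, end); windows never overlap.
        start = i * total // n_control
        end = (i + 1) * total // n_control
        positions.append(rng.randrange(start, end))
    return positions


def interleave(real_rows, control_rows, rng=None):
    """Merge control rows into real rows, keeping the real rows in order."""
    if rng is None:
        rng = random.Random()
    slots = set(control_positions(len(real_rows), len(control_rows), rng))
    controls = list(control_rows)
    rng.shuffle(controls)  # which control row lands in which slot is random too
    real_iter = iter(real_rows)
    control_iter = iter(controls)
    out = []
    for idx in range(len(real_rows) + len(control_rows)):
        out.append(next(control_iter) if idx in slots else next(real_iter))
    return out
```

<p>With 3000 real rows and 20 control rows the output has 3020 rows, so each window is 151 rows wide and the gap between consecutive control rows ranges from 1 to 301 rows. Allowing more irregularity (for example, occasionally empty windows) would mean loosening the one-per-window rule.</p>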