<p>If you know the number of lines in your file and you're randomising complete rows, you can randomise by line number and then read the selected rows. Pick random line numbers with the <a href="http://download.oracle.com/javase/6/docs/api/java/util/Random.html" rel="nofollow noreferrer">Random</a> class and store them in a <code>Set</code>, so you don't pick one twice.</p>
<pre><code>import java.io.*;
import java.util.HashSet;
import java.util.Set;

BufferedReader reader = new BufferedReader(new FileReader(new File("file.cvs")));
BufferedWriter chosen = new BufferedWriter(new FileWriter(new File("chosen.cvs")));
BufferedWriter notChosen = new BufferedWriter(new FileWriter(new File("notChosen.cvs")));

int numChosenRows = 10000;
long numLines = 1000000000;

// Load factor 1 and capacity numChosenRows+1 avoid rehashing while filling the set
Set&lt;Long&gt; chosenRows = new HashSet&lt;Long&gt;(numChosenRows+1, 1);
for(int i = 0; i &lt; numChosenRows; i++) {
    // nextLong(n) is the helper from the answer linked below, not a JDK method
    while(!chosenRows.add(nextLong(numLines))) {
        // add returns false if the value already exists in the Set
    }
}

String line;
for(long lineNo = 0; (line = reader.readLine()) != null; lineNo++){
    if(chosenRows.contains(lineNo)){
        // Do nothing for the moment
    } else {
        notChosen.write(line);
        notChosen.newLine(); // readLine strips the line terminator
    }
}

// Randomise the set of chosen rows
// Use RandomAccessFile to write the rows in that order

reader.close();
chosen.close();
notChosen.close();
</code></pre>
<p>See <a href="https://stackoverflow.com/questions/2546078/java-random-long-number-in-0-x-n-range/2546186#2546186">this answer</a> for the nextLong method, which produces a random long scaled to a particular size.</p>
<p><strong>Edit:</strong> Like most people, I overlooked the requirement to write the randomly selected lines in a random order. I presume that <a href="http://download.oracle.com/javase/6/docs/api/java/io/RandomAccessFile.html" rel="nofollow noreferrer">RandomAccessFile</a> would help with that: randomise a List of the chosen rows and access them in that order. As for the unchosen rows, I edited the code above so that the loop simply skips the chosen ones.</p>
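<p>For illustration, the whole approach could be sketched end to end as below. This is a minimal sketch under stated assumptions, not a definitive implementation: the names <code>RandomRowSplitter</code>, <code>chooseRows</code> and <code>split</code> are hypothetical, <code>nextLong</code> is a rejection-sampling helper written in the style of <code>Random.nextInt(int)</code>, and instead of <code>RandomAccessFile</code> the sketch keeps the chosen rows in memory and shuffles them, which assumes the sample is small enough to fit.</p>

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical helper class; none of these names come from the answer above.
final class RandomRowSplitter {

    /** Returns a uniformly distributed long in [0, n), for n > 0. */
    static long nextLong(Random rng, long n) {
        long bits, val;
        do {
            bits = (rng.nextLong() << 1) >>> 1;  // clear the sign bit
            val = bits % n;
        } while (bits - val + (n - 1) < 0L);     // reject the biased tail
        return val;
    }

    /** Picks numChosen distinct line numbers in [0, numLines). */
    static Set<Long> chooseRows(Random rng, long numLines, int numChosen) {
        if (numChosen > numLines) {
            throw new IllegalArgumentException("cannot choose more rows than lines");
        }
        Set<Long> chosen = new HashSet<>(numChosen + 1, 1f);
        while (chosen.size() < numChosen) {
            chosen.add(nextLong(rng, numLines)); // duplicates are ignored by the Set
        }
        return chosen;
    }

    /**
     * Streams the input once. Unchosen rows are written immediately in file
     * order; chosen rows are buffered in memory (assumption: the sample is
     * small), shuffled, and written at the end.
     */
    static void split(BufferedReader in, Set<Long> chosenRows, Random rng,
                      BufferedWriter chosenOut, BufferedWriter notChosenOut)
            throws IOException {
        List<String> picked = new ArrayList<>(chosenRows.size());
        String line;
        for (long lineNo = 0; (line = in.readLine()) != null; lineNo++) {
            if (chosenRows.contains(lineNo)) {
                picked.add(line);
            } else {
                notChosenOut.write(line);
                notChosenOut.newLine();
            }
        }
        Collections.shuffle(picked, rng);        // random output order
        for (String row : picked) {
            chosenOut.write(row);
            chosenOut.newLine();
        }
        chosenOut.flush();
        notChosenOut.flush();
    }
}
```

<p>Buffering only the chosen rows keeps memory use proportional to the sample size rather than the file size. If the sample is too large for memory, recording byte offsets and seeking with <code>RandomAccessFile</code>, as suggested above, remains the alternative.</p>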
 
