Creating lists from a huge object takes more time than creating them from a huge file
    <p>I was working with a huge file today when I found this, and my mind is blown. Until now I always created a single in-memory object from the file and did all my work on that object.</p>
    <p>Consider this use case, where I create a huge file (for clarity's sake) and then read it into two different lists.</p>
    <pre><code>import csv
import time

TEMP_FILE_NAME = '/tmp/foo.csv'


def write_huge_file():
    with open(TEMP_FILE_NAME, 'wb') as f:
        writer = csv.writer(f)
        writer.writerows((((i, i + 100) for i in xrange(29999999))))


def get_file_iterator():
    with open(TEMP_FILE_NAME, 'rb') as f:
        reader = csv.reader(f, delimiter=',')
        for row in reader:
            yield row


def make_2_list_from_object():
    file_iterator = get_file_iterator()
    main_list = [(i, j) for i, j in file_iterator]
    list1 = [i[0] for i in main_list]
    list2 = [i[1] for i in main_list]


def make_2_list_from_file():
    list1 = list(i[0] for i in get_file_iterator())
    list2 = list(i[1] for i in get_file_iterator())


if __name__ == '__main__':
    #write_huge_file()  # Uncomment this to write the file once
    print 'wrote_file'
    a = time.time()
    make_2_list_from_file()
    b = time.time()
    print b-a
    make_2_list_from_object()
    c = time.time()
    print 'Time taken using file: ', str(b-a)
    print 'Time taken using object: ', str(c-b)
</code></pre>
    <p>Now when I run it I get this output:</p>
    <pre><code>Time taken using file: 49.212211132 s
Time taken using object: 1018.5052530766 s
</code></pre>
    <p>Can anyone please explain this to me? I suspect it is because Python starts using swap memory when it runs out of RAM.</p>
    <p>Also, please note that I had 4 GB of RAM while running this program. If you have more RAM, you can increase the number of lines in the written file to reproduce this.</p>
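One rough way to sanity-check the swap hypothesis is to estimate how much memory `main_list` would need. The sketch below is not part of the original program: `estimate_row_bytes` and its `base` offset are hypothetical helpers, and it assumes CPython, where `sys.getsizeof` reports the shallow size of each object. It builds a small sample of `(str, str)` tuples shaped like the rows in `main_list` and extrapolates to the full row count.

```python
import sys


def estimate_row_bytes(sample_rows=1000, base=20000000):
    """Approximate bytes held per row by a list of (str, str) tuples.

    `base` is an assumed offset so the sample strings have the
    8-9 digit length that most rows in the real file would have.
    """
    rows = [(str(i), str(i + 100)) for i in range(base, base + sample_rows)]
    total = sys.getsizeof(rows)  # the list object and its pointer array
    for row in rows:
        total += sys.getsizeof(row)  # the tuple itself
        total += sum(sys.getsizeof(field) for field in row)  # its two strings
    return total / float(sample_rows)


per_row = estimate_row_bytes()
rows_in_file = 29999999  # same count as in write_huge_file()
estimated_gib = per_row * rows_in_file / 2.0 ** 30
print("approx. bytes per row: %.0f" % per_row)
print("approx. GiB for main_list: %.2f" % estimated_gib)
```

If the estimate comes out near or above the machine's physical RAM, swapping during the construction of `main_list` is at least plausible. The figure is only approximate: it ignores allocator overhead and the extra `list1` and `list2` built afterwards.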