  1. Cannot grok python multiprocessing
    <p>I need to run a function for each of the elements in my database.</p>
    <p>When I try the following:</p>
    <pre><code>from multiprocessing import Pool
from pymongo import Connection

def foo():
    ...

connection1 = Connection('127.0.0.1', 27017)
db1 = connection1.data

my_pool = Pool(6)
my_pool.map(foo, db1.index.find())
</code></pre>
    <p>I'm getting the following error:</p>
    <blockquote> <p>Job 1, 'python myscript.py ' terminated by signal SIGKILL (Forced quit)</p> </blockquote>
    <p>I think this is caused by <code>db1.index.find()</code> eating all the available RAM while trying to return millions of database elements.</p>
    <p>How should I modify my code for it to work?</p>
    <p>Some logs are here:</p>
    <pre><code>dmesg | tail -500 | grep memory
[177886.768927] Out of memory: Kill process 3063 (python) score 683 or sacrifice child
[177891.001379] [&lt;ffffffff8110e51a&gt;] out_of_memory+0xfa/0x250
[177891.021362] Out of memory: Kill process 3063 (python) score 684 or sacrifice child
[177891.025399] [&lt;ffffffff8110e51a&gt;] out_of_memory+0xfa/0x250
</code></pre>
    <p>The actual function is below:</p>
    <pre><code>def create_barrel(item):
    connection = Connection('127.0.0.1', 27017)
    db = connection.data
    print db.index.count()
    barrel = []
    fls = []
    if 'name' in item.keys():
        barrel.append(WhitespaceTokenizer().tokenize(item['name']))
        name = item['name']
    elif 'name.utf-8' in item.keys():
        barrel.append(WhitespaceTokenizer().tokenize(item['name.utf-8']))
        name = item['name.utf-8']
    else:
        print item.keys()
    if 'files' in item.keys():
        for file in item['files']:
            if 'path' in file.keys():
                barrel.append(WhitespaceTokenizer().tokenize(" ".join(file['path'])))
                fls.append(("\\".join(file['path']),file['length']))
            elif 'path.utf-8' in file.keys():
                barrel.append(WhitespaceTokenizer().tokenize(" ".join(file['path.utf-8'])))
                fls.append(("\\".join(file['path.utf-8']),file['length']))
            else:
                print file
                barrel.append(WhitespaceTokenizer().tokenize(file))
    if len(fls) &lt; 1:
        fls.append((name,item['length']))
    barrel = sum(barrel,[])
    for s in barrel:
        vs = re.findall("\d[\d|\.]*\d", s) #versions i.e. numbers such as 4.2.7500
    b0 = []
    for s in barrel:
        b0.append(re.split("[" + string.punctuation + "]", s))
    b1 = filter(lambda x: x not in string.punctuation, sum(b0,[]))
    flag = True
    while flag:
        bb = []
        flag = False
        for bt in b1:
            if bt[0] in string.punctuation:
                bb.append(bt[1:])
                flag = True
            elif bt[-1] in string.punctuation:
                bb.append(bt[:-1])
                flag = True
            else:
                bb.append(bt)
        b1 = bb
    b2 = b1 + barrel + vs
    b3 = list(set(b2))
    b4 = map(lambda x: x.lower(), b3)
    b_final = {}
    b_final['_id'] = item['_id']
    b_final['tags'] = b4
    b_final['name'] = name
    b_final['files'] = fls
    print db.barrels.insert(b_final)
</code></pre>
    <p>I've noticed an interesting thing. When I press Ctrl+C to stop the process, I get the following:</p>
    <pre><code>python index2barrel.py
Traceback (most recent call last):
  File "index2barrel.py", line 83, in &lt;module&gt;
    my_pool.map(create_barrel, db1.index.find, 6)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 227, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 280, in map_async
    iterable = list(iterable)
TypeError: 'instancemethod' object is not iterable
</code></pre>
    <p>Why is multiprocessing trying to convert something to a list? Isn't that the source of the problem?</p>
    <p>From the strace output:</p>
    <pre><code>brk(0x231ccf000) = 0x231ccf000
futex(0x1abb150, FUTEX_WAKE_PRIVATE, 1) = 1
sendto(3, "+\0\0\0\260\263\355\356\0\0\0\0\325\7\0\0\0\0\0\0data.index\0\0"..., 43, 0, NULL, 0) = 43
recvfrom(3, "Some text from my database."..., 491663, 0, NULL, NULL) = 491663
... [manymany times]
brk(0x2320d5000) = 0x2320d5000
.... manymany times
</code></pre>
    <p>The above pattern repeats over and over in the strace output, and for some reason <code>strace -o logfile python myscript.py</code> does not halt. It just eats all the available RAM and keeps writing to the log file.</p>
    <p>UPDATE. Using <code>imap</code> instead of <code>map</code> solved my problem.</p>
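    <p>For reference, a minimal sketch of what the change looks like. As the traceback shows, <code>map</code> calls <code>list(iterable)</code> before dispatching anything, whereas <code>imap</code> takes the iterable as it is. Everything here is an assumption made for illustration: <code>fake_cursor</code> is a hypothetical generator standing in for <code>db1.index.find()</code>, <code>process</code> is a placeholder for <code>create_barrel</code>, and <code>ThreadPool</code> (which exposes the same <code>map</code>/<code>imap</code> methods as <code>Pool</code>) is used only so the snippet runs without a database or a <code>__main__</code> guard.</p>

```python
from multiprocessing.pool import ThreadPool  # same map/imap API as multiprocessing.Pool


def fake_cursor(n):
    """Hypothetical stand-in for db1.index.find(): yields one document at a time."""
    for i in range(n):
        yield {"_id": i, "name": "item %d" % i}


def process(item):
    """Placeholder for create_barrel(): do the work, return something small."""
    return len(item["name"].split())


def count_tokens(n, workers=4, chunksize=50):
    """Feed the cursor to imap and consume the results one by one."""
    total = 0
    pool = ThreadPool(workers)
    try:
        # imap takes the iterable as-is; map would call list() on it first.
        for tokens in pool.imap(process, fake_cursor(n), chunksize):
            total += tokens
    finally:
        pool.close()
        pool.join()
    return total
```

    <p>In the real script the equivalent call would presumably be <code>my_pool.imap(create_barrel, db1.index.find(), chunksize)</code>, with the results iterated over rather than collected into a list.</p>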