
Random issues with Python Flask application on Apache
<p>I have an Apache web server on which I have set up a website using Flask, served through mod_wsgi. I am having a couple of issues which may or may not be related.</p> <ol> <li><p>With every call to a certain page (which runs a function performing heavy computation that takes over 2 seconds), memory use increases by about 20 megabytes. My server starts out with about 350 megabytes consumed by everything on the machine, out of a total of 3,620 megabytes shown in htop. After I reload this page many times, the total memory used by the server eventually tops out around 2,400 megabytes and stops increasing as much. After it reaches this level, I haven't been able to get it to consume enough memory to go into swap, even after hundreds of page reloads. Is this by design in Flask, Apache, or Python? If there were some kind of caching mechanism, I wouldn't expect memory to keep accumulating when the same URL is called every time. If I restart Apache, the memory is released.</p></li> <li><p>Sometimes calls to this page result in the called functions erroring out, even though they are all read-only calls (not writing any data to disk) and the query string is the same for every request.</p></li> <li><p>I have another page (calling another function which does much less computation) that, when called concurrently with other pages on the web server, randomly errors out, or the result (an image) comes back wrong.</p></li> </ol> <p>Could issues 2 and 3 be related to issue 1? Could issues 2 and 3 be due to bad programming somehow, or bad memory in the machine? 
I am able to reproduce the randomness by loading the same URL in about 40 Firefox tabs and then choosing the "reload all tabs" option.</p> <p>What more information should be provided to get a better answer?</p> <p>I have tried placing</p> <pre><code>import gc
gc.collect()
</code></pre> <p>into my code.</p> <p>I do have</p> <pre><code>WSGIDaemonProcess website user=www-data group=www-data processes=2 threads=2 home=/web/website
WSGIScriptAlias / /web/website/website.wsgi
&lt;Directory /web/website&gt;
    WSGIProcessGroup website
    WSGIScriptReloading On
    WSGIApplicationGroup %{GLOBAL}
    Order deny,allow
    Allow from all
&lt;/Directory&gt;
</code></pre> <p>in my /etc/apache2/sites-available/default file. It doesn't seem like the memory should grow that much if there are only a total of 4 threads being created, should it?</p> <p>UPDATE</p> <p>If I set processes=1 threads=4, then the seemingly random issues occur all the time when two requests arrive at once. Once I set processes=4 threads=1, the seemingly random issues don't happen. The rise in memory still occurs, though, and now it will actually rise all the way to the system's maximum RAM and start swapping.</p> <p>UPDATE</p> <p>Although I haven't resolved this runaway RAM consumption issue, I didn't have problems for several months with my previous application. Apparently it wasn't too popular, and after several days or so, Apache may have been clearing out the RAM automatically or something.</p> <p>Now I've made another application, which is fairly unrelated to the previous one. The previous application was generating about 1-megapixel images using matplotlib. My new application generates 20-megapixel images as well as 1-megapixel images using matplotlib. The problem is monumentally larger now when 20-megapixel images are generated with the new application. 
After the entire swap space fills up, something seems to get killed, and things work at a decent speed for a while, while there is some RAM and swap space available, but everything runs much more slowly once the RAM is consumed. Here are the processes running; I don't think there are any extra zombie processes.</p> <pre><code>$ ps -ef|grep apache
root      3753     1  0 03:45 ?        00:00:02 /usr/sbin/apache2 -k start
www-data  3756  3753  0 03:45 ?        00:00:00 /usr/sbin/apache2 -k start
www-data  3759  3753  0 03:45 ?        00:02:06 /usr/sbin/apache2 -k start
www-data  3762  3753  0 03:45 ?        00:00:01 /usr/sbin/apache2 -k start
www-data  3763  3753  0 03:45 ?        00:00:01 /usr/sbin/apache2 -k start
test      4644  4591  0 12:27 pts/1    00:00:00 tail -f /var/log/apache2/access.log
www-data  4894  3753  0 21:34 ?        00:00:37 /usr/sbin/apache2 -k start
www-data  4917  3753  2 22:33 ?        00:00:36 /usr/sbin/apache2 -k start
www-data  4980  3753  1 22:46 ?        00:00:12 /usr/sbin/apache2 -k start
</code></pre> <p>I am a little confused, though, when I look at htop, because it shows a lot more processes than top or ps do.</p> <p>UPDATE</p> <p>I have figured out that the memory leak is due to matplotlib (or the way I am using it), and not Flask or Apache, so problems 2 and 3 from my original post are indeed a separate issue from problem 1. Below is a basic function that I made to eliminate/reproduce the problem interactively in ipython.</p> <pre><code>def BigComputation():
    import cStringIO
    import matplotlib
    matplotlib.use('Agg')
    import matplotlib.pyplot as plt
    # A larger figure size causes more RAM to be used when savefig is run.
    # This function also holds on to some RAM that is never released
    # automatically if plt.close('all') is never run, but it is a small
    # amount, so it is hard to tell unless BigComputation is run
    # thousands of times.
    TheFigure = plt.figure(figsize=(250, 8))
    file_output = cStringIO.StringIO()
    # Causes lots of RAM to be used, and it is never released automatically.
    TheFigure.savefig(file_output)
    # Releases all the RAM that is never released automatically.
    plt.close('all')
    return None
</code></pre> <p>The trick to getting rid of the RAM leak is to run</p> <pre><code>plt.close('all')
</code></pre> <p>within BigComputation(); otherwise, BigComputation() will just keep accumulating RAM every time the function is called. I don't know whether I am using matplotlib inappropriately or simply have bad coding technique, but I really would think that once BigComputation() returns, it should release all its memory except for any global objects or the objects it returned. It seems to me that matplotlib must be creating some global variables in an inappropriate way, because I have no idea what they are named.</p> <p>I guess where my question stands now is: why do I need plt.close('all')? I also need to try Graham Dumpleton's suggestions to further diagnose my Apache configuration and see why I need to set threads=1 in Apache to make the random errors go away.</p>
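<p>One likely explanation for needing plt.close('all'): pyplot keeps every figure created through plt.figure() in a module-level registry so that stateful commands like plt.gcf() can find it, which means such figures are never garbage-collected until they are explicitly closed. A minimal sketch of a workaround, assuming matplotlib's object-oriented API (Figure plus the Agg canvas); the function name render_figure and its arguments are mine, not from the original post:</p>

```python
import io

import matplotlib
matplotlib.use('Agg')  # non-interactive backend, as in the question
from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure


def render_figure(width_inches=8, height_inches=8):
    # Build the Figure directly instead of calling plt.figure(), so it is
    # never registered in pyplot's global figure registry and becomes
    # eligible for garbage collection as soon as this function returns.
    fig = Figure(figsize=(width_inches, height_inches))
    FigureCanvasAgg(fig)  # attach an Agg canvas so savefig can render
    ax = fig.add_subplot(1, 1, 1)
    ax.plot([0, 1, 2], [0, 1, 4])
    buf = io.BytesIO()
    fig.savefig(buf, format='png')
    return buf.getvalue()
```

<p>With this approach no plt.close('all') call should be needed, because pyplot never holds a reference to the figure; each call simply returns the PNG bytes and drops everything else.</p>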