
Random issues with Python Flask application on Apache
<p>I have an Apache web server on which I have set up a website using Flask, served through mod_wsgi. I am having a couple of issues which may or may not be related.</p> <ol> <li><p>With every call to a certain page (which runs a function performing heavy computation that takes over 2 seconds), memory use increases by about 20 megabytes. My server starts out with about 350 megabytes consumed by everything on the machine, out of a total of 3,620 megabytes shown in htop. After I reload this page many times, the total memory used by the server eventually tops out around 2,400 megabytes and stops increasing as much. After it reaches this level, I haven't been able to get it to consume enough memory to go into swap, even after hundreds of page reloads. Is this by design in Flask, Apache, or Python? If there were some kind of caching mechanism, I wouldn't expect memory to keep accumulating when the same URL is called every time. If I restart Apache, the memory is released.</p></li> <li><p>Sometimes calls to this page result in the called functions erroring out, even though they are all read-only calls (not writing any data to disk) and the query string is the same for every request.</p></li> <li><p>I have another page (calling another function which does much less computation) that, when called concurrently with other pages on the web server, randomly errors out, or the result (an image) comes back wrong.</p></li> </ol> <p>Could issues 2 and 3 be related to issue 1? Could issues 2 and 3 be due to bad programming somehow, or bad memory in the machine? 
I am able to reproduce the randomness by loading the same URL in about 40 Firefox tabs and then choosing the "reload all tabs" option.</p> <p>What more information should be provided to get a better answer?</p> <p>I have tried placing</p> <pre><code>import gc
gc.collect()
</code></pre> <p>into my code.</p> <p>I do have</p> <pre><code>WSGIDaemonProcess website user=www-data group=www-data processes=2 threads=2 home=/web/website
WSGIScriptAlias / /web/website/website.wsgi
&lt;Directory /web/website&gt;
    WSGIProcessGroup website
    WSGIScriptReloading On
    WSGIApplicationGroup %{GLOBAL}
    Order deny,allow
    Allow from all
&lt;/Directory&gt;
</code></pre> <p>in my /etc/apache2/sites-available/default file. It doesn't seem like the memory should grow that much if there are only a total of 4 threads being created, should it?</p> <p>UPDATE</p> <p>If I set processes=1 threads=4, then the seemingly random issues occur all the time when two requests arrive at once. Once I set processes=4 threads=1, the seemingly random issues don't happen. The rise in memory still occurs, though, and now it will actually rise all the way to the system's maximum RAM and start swapping.</p> <p>UPDATE</p> <p>Although I haven't resolved this runaway RAM consumption issue, I didn't have problems for several months with my previous application. Apparently it wasn't too popular, and after several days or so, Apache may have been clearing out the RAM automatically or something.</p> <p>Now I've made another application, which is fairly unrelated to the previous one. The previous application was generating about 1-megapixel images using matplotlib. My new application generates 20-megapixel images as well as 1-megapixel images using matplotlib. The problem is monumentally larger now when 20-megapixel images are generated with the new application. 
After the entire swap space fills up, something seems to get killed, and things work at a decent speed for a while, while there is some RAM and swap space available, but everything runs much more slowly once the RAM is consumed. Here are the processes running; I don't think there are any extra zombie processes.</p> <pre><code>$ ps -ef|grep apache
root      3753     1  0 03:45 ?        00:00:02 /usr/sbin/apache2 -k start
www-data  3756  3753  0 03:45 ?        00:00:00 /usr/sbin/apache2 -k start
www-data  3759  3753  0 03:45 ?        00:02:06 /usr/sbin/apache2 -k start
www-data  3762  3753  0 03:45 ?        00:00:01 /usr/sbin/apache2 -k start
www-data  3763  3753  0 03:45 ?        00:00:01 /usr/sbin/apache2 -k start
test      4644  4591  0 12:27 pts/1    00:00:00 tail -f /var/log/apache2/access.log
www-data  4894  3753  0 21:34 ?        00:00:37 /usr/sbin/apache2 -k start
www-data  4917  3753  2 22:33 ?        00:00:36 /usr/sbin/apache2 -k start
www-data  4980  3753  1 22:46 ?        00:00:12 /usr/sbin/apache2 -k start
</code></pre> <p>I am a little confused, though, when I look at htop, because it shows a lot more processes than top or ps do.</p> <p>UPDATE</p> <p>I have figured out that the memory leak is due to matplotlib (or the way I am using it), and not Flask or Apache, so problems 2 and 3 from my original post are indeed a separate issue from problem 1. Below is a basic function that I made to eliminate/reproduce the problem interactively in ipython.</p> <pre><code>def BigComputation():
    import cStringIO
    import matplotlib
    matplotlib.use('Agg')
    import matplotlib.pyplot as plt
    # A larger figure size causes more RAM to be used when savefig is run.
    # This function also holds on to some RAM that is never released
    # automatically if plt.close('all') is never run, but it is a small
    # amount, so it is hard to tell unless BigComputation is run
    # thousands of times.
    TheFigure = plt.figure(figsize=(250, 8))
    file_output = cStringIO.StringIO()
    # Causes lots of RAM to be used, and it is never released automatically.
    TheFigure.savefig(file_output)
    # Releases all the RAM that is never released automatically.
    plt.close('all')
    return None
</code></pre> <p>The trick to getting rid of the RAM leak is to run</p> <pre><code>plt.close('all')
</code></pre> <p>within BigComputation(); otherwise, BigComputation() will just keep accumulating RAM every time the function is called. I don't know whether I am using matplotlib inappropriately or simply have bad coding technique, but I really would think that once BigComputation() returns, it should release all its memory except for any global objects or the objects it returned. It seems to me that matplotlib must be creating some global variables in an inappropriate way, because I have no idea what they are named.</p> <p>I guess where my question stands now is: why do I need plt.close('all')? I also need to try Graham Dumpleton's suggestions to further diagnose my Apache configuration and see why I need to set threads=1 in Apache to make the random errors go away.</p>
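<p>One likely explanation for needing plt.close('all'): pyplot keeps every figure created through plt.figure() in a module-level registry so that stateful commands like plt.gcf() can find it, which means such figures are never garbage-collected until they are explicitly closed. A minimal sketch of a workaround, assuming matplotlib's object-oriented API (Figure plus the Agg canvas); the function name render_figure and its arguments are mine, not from the original post:</p>

```python
import io

import matplotlib
matplotlib.use('Agg')  # non-interactive backend, as in the question
from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure


def render_figure(width_inches=8, height_inches=8):
    # Build the Figure directly instead of calling plt.figure(), so it is
    # never registered in pyplot's global figure registry and becomes
    # eligible for garbage collection as soon as this function returns.
    fig = Figure(figsize=(width_inches, height_inches))
    FigureCanvasAgg(fig)  # attach an Agg canvas so savefig can render
    ax = fig.add_subplot(1, 1, 1)
    ax.plot([0, 1, 2], [0, 1, 4])
    buf = io.BytesIO()
    fig.savefig(buf, format='png')
    return buf.getvalue()
```

<p>With this approach no plt.close('all') call should be needed, because pyplot never holds a reference to the figure; each call simply returns the PNG bytes and drops everything else.</p>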