StackOverflow2013

Note that there are some explanatory texts on larger screens.

plurals

POMaximizing apache server instances with large mod_wsgi application
primarykey
Id
10017645
data
AcceptedAnswerId
10020054
AnswerCount
1
ClosedDate
CommentCount
0
CommunityOwnedDate
CreationDate
2012-04-04T19:06:44.973
FavoriteCount
0
LastActivityDate
2012-04-04T22:17:08.380
LastEditDate
LastEditorUserId
0
OwnerUserId
1309534
ParentId
0
PostTypeId
1
Score
2
ViewCount
159
LastEditorDisplayName
text
Body
I'm writing a Oracle of Bacon type website that involves a breadth first search on a very large directed graph (>5 million nodes with an average of perhaps 30 outbound edges each). This is also essentially all the site will do, aside from display a few mostly text pages (how it works, contact info, etc.). I currently have a test implementation running in Python, but even using Python arrays to efficiently represent the data, it takes >1.5gb of RAM to hold the whole thing. Clearly Python is the wrong language for a low-level algorithmic problem like this, so I plan to rewrite most of it in C using the Python/C bindings. I estimate that this'll take about 300 mb of RAM. Based on my current configuration, this will run through mod_wsgi in apache 2.2.14, which is set to use mpm_worker_module. Each child apache server will then load up the whole python setup (which loads the C extension) thus using 300 mb, and I only have 4gb of RAM. This'll take time to load and it seems like it'd potentially keep the number of server instances lower than it could otherwise be. If I understand correctly, data-heavy (and not client-interaction-heavy) tasks like this would typically get divorced from the server by setting up an SQL database or something of the sort that all the server processes could then query. But I don't know of a database framework that'd fit my needs. So, how to proceed? Is it worth trying to set up a database divorced from the webserver, or in some other way move the application a step farther out than mod_wsgi, in order to maybe get a few more server instances running? If so, how could this be done? My first impression is that the database, and not the server, is always going to be the limiting factor. It looks like the typical Apache mpm_worker_module configuration has ServerLimit 16 anyways, so I'd probably only get a few more servers. And if I did divorce the database from the server I'd have to have some way to run multiple instances of the database as well (I already know that just one probably won't cut it for the traffic levels I want to support) and make them play nice with the server. So I've perhaps mostly answered my own question, but this is a kind of odd situation so I figured it'd be worth seeing if anyone's got a firmer handle on it. Anything I'm missing? Does this implementation make sense? Thanks in advance! Technical details: it's a Django website that I'm going to serve using Apache 2.2.14 on Ubuntu 10.4. 
Tags
<python><database><django><apache><mod-wsgi>
Title
Maximizing apache server instances with large mod_wsgi application
singulars
PostAcceptedAnswerId
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
PostParentId
1. This table or related slice is empty.
PostTypePostTypeId
1. PTQuestion
UserLastEditorUserId
1. This table or related slice is empty.
UserOwnerUserId
1. USTomNysetvold
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 POMaximizing apache server instances with large mod_wsgi application
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
2. VO
 singulars
 PostPostId
 POMaximizing apache server instances with large mod_wsgi application
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
CommentsPostId
1. This table or related slice is empty.

Querying!

Guidance

A row detail

Detail views are divided into sections. All the information in the data section comes from columns in the selected row. The other sections display data from other, related rows.

Related data can be related in a to-one or a to-many fashion. Captions of data related in a to-many fashion link to a list view showing a filtered view of the table.

Try moving around until you find a non-empty to-many entry and click on the label to get to one. You can move back to the root by clicking on the database name in the header.