Note that there are some explanatory texts on larger screens.

plurals
  1. POWould using Redis with Rails provide any performance benefit for this specific kind of queries
    primarykey
    data
    text
    <p>I don't know if this is the right place to ask question like this, but here it goes:</p> <p>I have an intranet-like Rails 3 application managing about 20k users which are in nested-set (preordered tree - <a href="http://en.wikipedia.org/wiki/Nested_set_model" rel="nofollow">http://en.wikipedia.org/wiki/Nested_set_model</a>). Those users enter stats (data, just plain numeric values). Entered stats are assigned to category (we call it Pointer) and a week number.</p> <p>Those data are further processed and computed to Results. Some are computed from users activity + result from some other category... etc. What user enters isn't always the same what he sees in reports.</p> <p>Those computations can be very tricky, some categories have very specific formulae.</p> <p>But the rest is just "give me sum of all entered values for this category for this user for this week/month/year".</p> <p>Problem is that those stats needs also to be summed for a subset of users under selected user (so it will basically return sum of all values for all users under the user, including self).</p> <p>This app is in production for 2 years and it is doing its job pretty well... but with more and more users it's also pretty slow when it comes to server-expensive reports, like "give me list of all users under myself and their statistics. One line for summed by their sub-group and one line for their personal stats"). Of course, users wants (and needs) their reports to be as actual as possible, 5 mins to reflect newly entered data is too much for them. And this specific report is their favorite :/ To stay realtime, we cannot do the high-intensive sqls directly... That would kill the server. So I'm computing them only once via background process and frontend just reads the results. Those sqls are hard to optimize and I'm glad I've moved from this approach... (caching is not an option. See below.)</p> <p>Current app goes like this:</p> <ul> <li><p>frontend: when user enters new data, it is saved to simple mysql table, like <code>[user_id, pointer_id, date, value]</code> and there is also insert to the queue.</p></li> <li><p>backend: then there is calc_daemon process, which every 5 seconds checks the queue for new "recompute requests". We pop the requests, determine what else needs to be recomputed along with it (pointers have dependencies... simplest case is: when you change week stats, we must recompute month and year stats...). It does this recomputation the easy way.. we select the data by customized per-pointer-different sqls generated by their classes.</p></li> <li>those computed results are then written back to mysql, but to partitioned tables (one table per year). One line in this table is like <code>[user_id, pointer_id, month_value, w1_value, w2_value, w3_value, w4_value]</code>. This way, the tables have ~500k records (I've basically reduced 5x # of records).</li> <li>when frontend needs those results it does simple sums on those partitioned data, with 2 joins (because of the nested set conds).</li> </ul> <p>The problem is that those simple sqls with sums, group by and join-on-the-subtree can take like 200ms each... just for a few records.. and we need to run a lot of these sqls... I think they are optimized the best they can, according to <code>explain</code>... but they are just too hard for it.</p> <p>So... The QUESTION:</p> <p>Can I rewrite this to use Redis (or other fast key-value store) and see any benefit from it when I'm using Ruby and Rails? As I see it, if I'll rewrite it to use redis, I'll have to run much more queries against it than I have to with mysql, and then perform the sum in ruby manually... so the performance can be hurt considerably... I'm not really sure if I could write all the possible queries I have now with redis... Loading the users in rails and then doing something like "redis, give me sum for users 1,2,3,4,5..." doesn't seem like right idea... But maybe there is some feature in redis that could make this simpler?)... Also the tree structure needs to be like nested set, i.e. it cannot have one entry in redis with list of all child-ids for some user (something like <code>children_for_user_10: [1,2,3]</code>) because the tree structure changes frequently... That's also the reason why I can't have those sums in those partitioned tables, because when the tree changes, I would have to recompute everything.. That's why I perform those sums realtime.)</p> <p>Or would you suggest me to rewrite this app to different language (java?) and to compute the results in memory instead? :) (I've tried to do it SOA-way but it failed on that I end up one way or another with XXX megabytes of data in ruby... especially when generating the reports... and gc just kills it...) (and a side effect is that one generating report blocks the whole rails app :/ )</p> <p>Suggestions are welcome.</p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. COfrom another question http://stackoverflow.com/questions/4846243/redis-sum-of-scores-in-sorted-set it seems that summing in redis is a no go.
      singulars
    2. COThanks for the link. It seems like using key-value store with ruby and this kind of app demands it is just a no go... Well I could imagine building a service that handles and caches those data in memory and uses them for all the users (it would hold just the most recent data).. using redis like this would be simpler, but it would have to be fast (i.e. written in php (bleh) or java (uhoh)) :( .. Or I would have to choose e.g. MongoDB... that's another option. Should I update the question to include it? I would like some real-life experience before I do the rewrite... It would take week or so :(
      singulars
    3. COWe have used Redis in my workplace, and we have found that if you can simplify your data relations down to something Redis can interpret, you can have tremendous performance gains. That said, most of our queries that involve joins tended to indicate DB logic that was too much to unload onto Redis (We did try, but the added overhead of having to calculate much of what Postgres would normally do in our Rails app proved to make matters worse, not better :P )
      singulars
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload