    <p>If you need to store a lot of data, then you need to store a lot of data. If you are like most, you probably won't run into that problem before you have the cash to solve it. In other words, you are probably assuming more traffic and data than you'll actually get, at least in the short term. So I doubt this is an issue, even though it is a good sign that you are thinking about it now rather than later.</p>
    <p>As I mentioned in my comment below, the easiest solution is to have a tie table with a row for each side of the friend relationship (a <code>has_many :friends, through: :facebook_friend_relationships, class_name: 'FacebookFriend'</code> on FacebookFriend, per the design mentioned below). But your question seemed to be about how to reduce the number of records, so that is what the remainder of the answer will address.</p>
    <p>If you have to store the data in the DB, and you are sure that every FB user on the planet will hit your site because it is so awesome, but not all at once, then with limited storage you may want to use an LRU algorithm (remove the least recently used records), possibly with timed expiration as well. You could just have a cron job that queries the DB and then deletes old or unused records. It wouldn't be perfect, but it would be a simple solution.</p>
    <p>You could also archive older data rather than throw it away. Frequently used data could stay in the table of active users, and you could offload older data to another table or even another database (see the apartment and second_base gems for that). However, once you get to that size, you're probably looking at a number of other architectural solutions that have much less to do with ActiveRecord models/associations or schema design. Though it pays to plan ahead, I wouldn't worry about that excessively until you are sure that the application will get enough users to justify the time.</p>
    <p>Even though ActiveRecord has some caching, you could just avoid the DB and cache friends in memory yourself in the beginning for speed, especially if you don't yet have many users, which you probably don't. If you think you'll run out of memory because of the high number of users, LRU might be a good option here also, and <a href="https://github.com/SamSaffron/lru_redux" rel="nofollow">lru_redux</a> looks interesting. Again, you might want to put a time limit on the cache so that entries expire and the friends are fetched again afterwards. Even just storing the results in the user session may be adequate. As a first shot while you're getting started, most would simply do <code>@friends ||= Something.find_friends(fb_user_id)</code> in the controller action method; note that this only memoizes the lookup for the current request.</p>
    <p>If you use ActiveRecord, in your query in the controller (or on the association in the model) consider using <code>includes</code> (the <code>include:</code> option in older Rails versions) to avoid n+1 queries. That will speed things up.</p>
    <p>For the schema design, maybe:</p>
    <ul> <li>User - users table with email and authN info. Look at the Devise gem.</li> <li>FacebookUser - info about the Facebook user.</li> <li>FacebookFriendRelationship - a tie model with (id and) two columns, one for one FacebookUser id and one for the other.</li> </ul>
    <p>By separating the authN info (User) from the FB data (FacebookUser and FacebookFriendRelationship), you make it easier to support other social media accounts later, each with its account-specific information in other tables.</p>
    <p>The complexity comes in FacebookUser's relationship with friends if the goal is to minimize rows in the relationship table. To halve the number of rows, you'd have a single row per relationship, where the id of a given FacebookUser could be in either foreign key column. Either the user has a friend or is a friend, so you could have two <code>has_many :through</code> associations on FacebookFriend that each use a different foreign key in FacebookFriendRelationship. Or you could do HABTM without the model and use the foreign_key and association_foreign_key options in each association. Either way, you could add a method that concatenates the results of both associations (they behave like arrays). Alternatively, you could use custom SQL in a single has_many if you didn't mind losing the ability to remove associations the normal ActiveRecord way. However, per your comments, I think you want to avoid this complexity, and I agree with you, unless you really must limit the number of relationship rows. In any case, it isn't the number of tie table rows that will eat the storage; it is going to be all of the user info you keep in the FacebookFriends table.</p>
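    <p>To make the LRU-plus-expiration idea from earlier in this answer more concrete, here is a rough plain-Ruby sketch. The class name <code>FriendCache</code> and its <code>max_size</code>, <code>ttl</code>, and <code>clock</code> parameters are made up for illustration; this is not the lru_redux API, and a real app would likely want a thread-safe or shared cache instead.</p>

```ruby
# Hypothetical in-memory cache combining least-recently-used eviction
# with a per-entry time-to-live. Not thread-safe; illustration only.
class FriendCache
  Entry = Struct.new(:value, :expires_at)

  # clock is injectable so expiry can be exercised without sleeping.
  def initialize(max_size:, ttl:, clock: -> { Time.now.to_f })
    @max_size = max_size
    @ttl = ttl
    @clock = clock
    @store = {} # Ruby hashes keep insertion order: first key = least recently used
  end

  # Returns the cached value for key, or runs the block to load and cache it.
  def fetch(key)
    entry = @store.delete(key)
    if entry && entry.expires_at > @clock.call
      @store[key] = entry # re-insert to mark as most recently used (TTL is not renewed)
      return entry.value
    end

    value = yield
    @store[key] = Entry.new(value, @clock.call + @ttl)
    @store.shift if @store.size > @max_size # drop the least recently used entry
    value
  end

  def key?(key)
    @store.key?(key)
  end

  def size
    @store.size
  end
end
```

    <p>Usage would look something like <code>cache.fetch(fb_user_id) { Something.find_friends(fb_user_id) }</code>, so the expensive lookup only runs on a miss or after the entry has expired.</p>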
    1. Thanks for your answer. You're right that I'll never have that many users; I just wanted to learn the best way to do things. If I go with the schema you suggested, each time the user logs in I'd need to update the relationships table to create friendships for any of the user's friends who have signed up to my site since their last visit. This would require looping through their friends list and attempting to find each one as a user in my database, and if one exists, creating a new relationship entry if there isn't one already. Is this the most efficient way of doing things? It seems pretty inefficient.
    2. Your goal seemed to be to have fewer records (per your 10,000,000 rows example in the question), so having a single row for each relationship instead of one record for each direction of the same relationship would halve the number of rows (to 5,000,000). But if you don't mind having possibly duplicate rows for each relationship, that would seem to be the better solution for your current needs.
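    As a rough illustration of the single-row-per-relationship idea, here is a plain-Ruby sketch that stores each friendship once by always putting the smaller id first. `FriendshipStore` and its methods are hypothetical stand-ins for the relationship table, not ActiveRecord code; in a real schema the same effect would need something like a unique index on the ordered pair of foreign keys, which is an assumption on my part.

```ruby
require 'set'

# Hypothetical in-memory stand-in for the relationship table.
# Each friendship is stored once as [lower_id, higher_id].
class FriendshipStore
  def initialize
    @pairs = Set.new
  end

  # Returns true if a new row was stored, false for duplicates or self-pairs.
  def add(id_a, id_b)
    return false if id_a == id_b

    !@pairs.add?(canonical(id_a, id_b)).nil?
  end

  def friends?(id_a, id_b)
    @pairs.include?(canonical(id_a, id_b))
  end

  # The id may sit in either "column", so both positions are checked.
  def friends_of(id)
    @pairs.each_with_object([]) do |(low, high), result|
      result << high if low == id
      result << low if high == id
    end.sort
  end

  def row_count
    @pairs.size
  end

  private

  # Order the ids so (3, 7) and (7, 3) map to the same row.
  def canonical(id_a, id_b)
    [id_a, id_b].minmax
  end
end
```

    The cost of halving the rows shows up in `friends_of`: every lookup has to consider both positions, which is the extra complexity discussed in the answer.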
    3. Also, you are not (or at least shouldn't be) doing the looping in the controller to get each friend. Your query does that, and the database is (hopefully) good at it. To avoid n+1 queries like the ones you are alluding to, use `includes` (the `:include` option in older Rails versions) so that AR issues only one query per model type for a single lookup. You can optimize this even more, but in the beginning you probably won't need to.
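    To show what "let the query do the looping" could mean for the login case, here is a small plain-Ruby sketch that works on sets of ids instead of looking up friends one by one. The method name `missing_friendships` and its arguments are hypothetical; the set intersection stands in for a single `WHERE id IN (...)` style query (what `where(id: friend_ids)` produces in ActiveRecord), and `existing_pairs` stands in for the rows already in the relationship table.

```ruby
require 'set'

# Returns the [low_id, high_id] pairs that still need a relationship row.
#   user_id        - id of the user who just logged in
#   fb_friend_ids  - friend ids reported by Facebook for that user
#   registered_ids - Set of ids that exist in our users table
#   existing_pairs - Set of [low_id, high_id] pairs already stored
def missing_friendships(user_id, fb_friend_ids, registered_ids, existing_pairs)
  # One set-based lookup instead of one query per friend.
  registered_friends = fb_friend_ids.to_set & registered_ids

  registered_friends
    .reject { |friend_id| friend_id == user_id }
    .map { |friend_id| [user_id, friend_id].minmax }
    .reject { |pair| existing_pairs.include?(pair) }
    .sort
end
```

    The returned pairs could then be written in one batch, so the per-login work is a couple of set-based operations rather than a find-or-create per friend.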