StackOverflow2013


plurals

PODistributed and replicated data storage for small amounts of data under Windows
primarykey
Id
6022444
data
AcceptedAnswerId
6027590
AnswerCount
2
ClosedDate
CommentCount
0
CommunityOwnedDate
CreationDate
2011-05-16T19:39:23.113
FavoriteCount
1
LastActivityDate
2011-06-30T14:28:17.427
LastEditDate
LastEditorUserId
0
OwnerUserId
119549
ParentId
0
PostTypeId
1
Score
0
ViewCount
501
LastEditorDisplayName
text
Body
<p>We're looking for a good solution to a caching problem. We'd like to distribute a relatively small amount of data (perhaps tens of GBs) among a cluster of web servers such that:</p> <ol> <li>The data is replicated to all nodes</li> <li>The data is persistent</li> <li>The data can be accessed locally</li> </ol> <p>Our motivation for a caching solution is that we currently have a single point of failure: a SQL Server database. Unfortunately, we're unable to set up a fail-over cluster for this database. We already use Memcached extensively, but we want to avoid the problem where, if a Memcached node goes down, we'd suddenly have a large number of cache misses and therefore send a massive number of requests to one endpoint.</p> <p>We'd prefer instead to have local persistent caches on each web server node so that the resulting load would be distributed. A retrieval would pass through the following steps:</p> <ol> <li>Check for data in Memcached. If it's not there...</li> <li>Check for data in local persistent storage. If it's not there...</li> <li>Retrieve data from the database.</li> </ol> <p>When data changes, the cache key is invalidated at both caching layers.</p> <p>We've been looking at a bunch of potential solutions, but none of them seem to match exactly what we need:</p> <h2>CouchDB</h2> <p>This is pretty close; the data model we'd like to cache is very document-oriented. However, its replication model isn't exactly what we're looking for. It seems to me as though replication is an <em>action</em> you have to perform rather than a permanent <em>relationship among nodes</em>. You can set up continuous replication, but this doesn't persist between restarts.</p> <h2>Cassandra</h2> <p>This solution seems to be mostly geared toward those with large storage requirements. We have a large number of users, but small amounts of data. 
Cassandra appears to support <em>n</em> <em>fail-over nodes</em>, but 100% replication among nodes doesn't seem to be what it's intended for; instead, it seems more geared toward distribution only.</p> <h2>SAN</h2> <p>One attractive idea is to store a bunch of files on a SAN or similar type of appliance. I haven't worked with these before, but it seems like this would still be a single point of failure; if the SAN goes down, we'd suddenly be going to the database for all cache misses.</p> <h2>DFS Replication</h2> <p>A simple Google search revealed this. It seems to do what we want; it synchronizes files across all nodes in a replication cluster. But the marketing text makes it look like it's more of a system for ensuring documents are copied to different office locations. Also, it has limits, like a file count maximum, that wouldn't work well for us.</p> <p>Have any of you had similar requirements to ours and found a good solution that meets your needs?</p>
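<p>To make the retrieval order and the two-layer invalidation described above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the class name <code>TieredCache</code>, the dict-like stand-ins for Memcached and the local persistent store, and the <code>load_from_db</code> callback are all hypothetical, and back-filling the faster layers after a miss is an assumption the question does not spell out.</p>

```python
from typing import Callable, MutableMapping, Optional


class TieredCache:
    """Hypothetical sketch of the lookup path: shared cache, local store, database."""

    def __init__(
        self,
        shared: MutableMapping[str, str],  # stand-in for Memcached (assumption)
        local: MutableMapping[str, str],   # stand-in for local persistent storage (assumption)
        load_from_db: Callable[[str], Optional[str]],  # stand-in for the SQL Server query
    ) -> None:
        self.shared = shared
        self.local = local
        self.load_from_db = load_from_db

    def get(self, key: str) -> Optional[str]:
        # 1. Check the shared cache first.
        value = self.shared.get(key)
        if value is not None:
            return value
        # 2. Fall back to the local persistent store.
        value = self.local.get(key)
        if value is not None:
            self.shared[key] = value  # back-fill (assumption, not stated in the question)
            return value
        # 3. Finally, retrieve from the database.
        value = self.load_from_db(key)
        if value is not None:
            self.local[key] = value
            self.shared[key] = value
        return value

    def invalidate(self, key: str) -> None:
        # When data changes, drop the key at both caching layers.
        # Note: this only clears the local store of the current node;
        # reaching the other nodes is the replication problem asked about.
        self.shared.pop(key, None)
        self.local.pop(key, None)
```

<p>With plain dictionaries passed in for both layers, a first <code>get</code> reaches the database callback, later calls are served from the caches, and <code>invalidate</code> forces the next read back to the database.</p>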
Tags
<caching><architecture><replication><distributed-caching>
Title
Distributed and replicated data storage for small amounts of data under Windows
singulars
PostAcceptedAnswerId
1. PO
  singulars
  PostTypePostTypeId
  PTAnswer
PostParentId
1. This table or related slice is empty.
PostTypePostTypeId
1. PTQuestion
UserLastEditorUserId
1. This table or related slice is empty.
UserOwnerUserId
1. USJacob
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. PO
  singulars
  PostTypePostTypeId
  PTAnswer
2. PO
  singulars
  PostTypePostTypeId
  PTAnswer
VotesPostIdCreationDate
1. VO
  singulars
  PostPostId
  PODistributed and replicated data storage for small amounts of data under Windows
  UserUserId
  USАндрей Татаринов
  VoteTypeVoteTypeId
  VTFavorite
CommentsPostId
1. This table or related slice is empty.
