Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p>From your description above, I'm assuming that your 5,000 queries per second are entirely read operations. This is essentially what we'd call a data warehouse use case. What are your availability requirements? Does it have to be hosted on AWS and friends, or can you buy your own hardware to run in-house? What does your data look like? What does the logic which consumes this data look like? </p> <p>You might get the sense that there really isn't enough information here to answer the question definitively, but I can at least offer some advice.</p> <p>First, if your data is relatively small and your queries are simple, save yourself some hassle and make sure you're querying from RAM instead of disk. Any modern RDBMS with support for in-memory caching/tablespaces will do the trick. Postgres and MySQL both have features for this. In the case of Postgres make sure you've tuned the memory parameters appropriately as the out-of-the-box configuration is designed to run on pretty meager hardware. If you <em>must</em> use a NoSQL option, depending on the structure of your data Redis is probably a good choice (it's also primarily in-memory). However in order to say which flavor of NoSQL might be the best fit we'd need to know more about the structure of the data that you're querying, and what queries you're running.</p> <p>If the queries boil down to <code>SELECT * FROM table WHERE primary_key = {CONSTANT}</code> - don't bother messing with NoSQL - just use an RDBMS and learn how to tune the dang thing. This is doubly true if you can run it on your own hardware. If the connection count is high, use read slaves to balance the load.</p> <p><strong>Long-after-the-fact Edit (5/7/2013)</strong>: Something I should've mentioned before: EC2 is a really really crappy place to measure performance of self-managed database nodes. Unless you're paying out the nose, your I/O perf will be <em>terrible</em>. Your choices are to either pay big money for provisioned IOPS, RAID together a bunch of EBS volumes, or rely on ephemeral storage whilst syncing a WAL off to S3 or similar. All of these options are expensive and difficult to maintain. All of these options have varying degrees of performance.</p> <p>I discovered this for a recent project, so I switched to Rackspace. The performance increased tremendously there, but I noticed that I was paying a lot for CPU and RAM resources when really I just need fast I/O. Now I host with Digital Ocean. All of DO's storage is SSD. Their CPU performance is kind of crappy in comparison to other offerings, but I'm incredibly I/O bound so I Just Don't Care. After dropping Postgres' <code>random_page_cost</code> to 2, I'm humming along quite nicely.</p> <p>Moral of the story: profile, tune, repeat. Ask yourself what-if questions and constantly validate your assumptions.</p> <p><strong>Another long-after-the-fact-edit (11/23/2013)</strong>: As an example of what I'm describing here, check out the following article for an example of using MySQL 5.7 with the InnoDB memcached plugin to achieve 1M QPS: <a href="http://dimitrik.free.fr/blog/archives/11-01-2013_11-30-2013.html#2013-11-22" rel="noreferrer">http://dimitrik.free.fr/blog/archives/11-01-2013_11-30-2013.html#2013-11-22</a></p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload