Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p>I think you have a few terms mixed up here.</p> <p>All your data goes into one database (aka schema). In a database you can have tables.</p> <p>e.g.</p> <pre><code>table employee id integer name varchar address varchar country varchar table office id integer employee_id integer address varchar </code></pre> <p>Inside tables you have fields <code>(id, name, address)</code> aka columns. And tables have one or more rows.<br> An example for table employee:</p> <pre><code>id name address country ---------------------------------------------------- 1 John 1 Regent Street UK 2 James 24 Jump Street China 3 Darth Vader 1 Death Star Bestine, Tatooine </code></pre> <p>So much for the basics. </p> <p><strong>Why partitioning</strong><br> Now suppose that we have lots and lots of people (rows) in our database.<br> Remember this a galactic database, so we have 100 billion records.<br> If we want to search trough this fast it's nice if we can do this in parallel.<br> So we partition the table (say by country) and then we can have x servers looking in 1 country each.<br> Partitioning across servers is called <code>sharding</code>.</p> <p>Or we can partition e.g. historical data by year, so we don't have to go through <strong>all</strong> the data just to get the <strong>recent</strong> news. We only have to go through the partition for this year. This is called <code>partitioning</code>.</p> <p><strong>What's the big difference between <code>sharding</code> can just <code>partitioning</code>?</strong> </p> <p><strong>Sharding</strong><br> In <code>sharding</code> you anticipate that <strong>all</strong> your data is relevant, and equally likely to be queried. (e.g. google can expect all their data to be queried; archiving part of their data is useless for them).<br> In this case you want lots of machines to look though your data in parallel, where each machine does part of the work.<br> So you give each machine a different partition (shard) of the data and give all the machines the same query. When the results come out you <code>UNION</code> them all together and output the result. </p> <p><strong>Basic partitioning</strong><br> In basic <code>partitioning</code> part of your data is <code>hot</code> and part is <code>not</code>. A typical case is historical data, the new data is <code>hot</code>, the old data hardly gets touched.<br> For this use case it is pointless to put the old data in separate servers. Those machines will just wait and wait and do nothing because nobody cares about the old data except some auditors who look at it once a year.<br> So you partition that data by year and the server will automatically archive the old partitions so your queries will only look at one (maybe 2) years of data and be much faster.</p> <p><strong>Do I need partitioning?</strong><br> You only do partitioning when you have lots and lots of data, because it complicates your setup.<br> Unless you have more than a million records you don't have to consider partitioning.<sup>*)</sup><br> If you have more than a 100 million records, you should definitely consider it.<sup>*)</sup> </p> <p>For more info see: <a href="http://dev.mysql.com/doc/refman/5.1/en/partitioning.html" rel="noreferrer">http://dev.mysql.com/doc/refman/5.1/en/partitioning.html</a><br> and: <a href="http://blog.mayflower.de/archives/353-Is-MySQL-partitioning-useful-for-very-big-real-life-problems.html" rel="noreferrer">http://blog.mayflower.de/archives/353-Is-MySQL-partitioning-useful-for-very-big-real-life-problems.html</a><br> See also wiki: <a href="http://en.wikipedia.org/wiki/Partition_%28database%29" rel="noreferrer">http://en.wikipedia.org/wiki/Partition_%28database%29</a></p> <hr> <p><em><sup>*)</sup> These are just my personal heuristics YMMV.</em></p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload