I once worked with a very large (terabyte+) MySQL database. The largest table we had was literally over a billion rows.

It worked. MySQL processed the data correctly most of the time. It was extremely unwieldy, though.

Just backing up and storing the data was a challenge. It would take days to restore the table if we needed to.

We had numerous tables in the 10-100 million row range. Any significant joins against those tables were too time-consuming. So we wrote stored procedures to 'walk' the tables and process the joins against ranges of ids. In this way we'd process the data 10-100,000 rows at a time (join against ids 1-100,000, then 100,001-200,000, etc.). This was significantly faster than joining against the entire table.
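The chunked 'walk' might look roughly like the sketch below. This is illustrative only: the table names (orders, order_items, order_summary), the chunk size, and the aggregation itself are assumptions, not the procedures we actually ran.

```sql
-- Minimal sketch of walking a join over id ranges.
-- All table and column names here are hypothetical.
DELIMITER //

CREATE PROCEDURE walk_join_in_chunks()
BEGIN
    DECLARE chunk_size INT DEFAULT 100000;
    DECLARE start_id   BIGINT DEFAULT 1;
    DECLARE max_id     BIGINT;

    SELECT MAX(id) INTO max_id FROM orders;

    WHILE start_id <= max_id DO
        -- Join only the current id range instead of the whole table.
        INSERT INTO order_summary (order_id, item_count)
        SELECT o.id, COUNT(i.id)
        FROM orders o
        JOIN order_items i ON i.order_id = o.id
        WHERE o.id BETWEEN start_id AND start_id + chunk_size - 1
        GROUP BY o.id;

        SET start_id = start_id + chunk_size;
    END WHILE;
END //

DELIMITER ;
```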
Using indexes on very large tables that aren't based on the primary key is also much more difficult. MySQL stores indexes in two pieces: it stores secondary indexes (anything other than the primary index) as references to primary key values. So indexed lookups are done in two parts: first MySQL goes to the secondary index and pulls from it the primary key values it needs to find, then it does a second lookup on the primary key index to find where those rows are.

The net of this is that for very large tables (1-200 million plus rows) indexing is more restrictive. You need fewer, simpler indexes. And even simple select statements that are not directly on an index may never come back. Where clauses *must* hit indexes or forget about it.
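To make that two-step lookup concrete, here is a hedged illustration (the table and column names are invented; the layout described applies to InnoDB's clustered primary key):

```sql
-- Hypothetical table illustrating the two-piece index layout.
CREATE TABLE events (
    id      BIGINT PRIMARY KEY,   -- clustered index: rows stored in id order
    user_id BIGINT,
    payload VARCHAR(255),
    INDEX idx_user (user_id)      -- secondary index: stores (user_id, id) pairs
);

-- Hits idx_user: MySQL scans the secondary index for user_id = 42, collects
-- the matching primary key ids, then looks those ids up in the clustered index.
SELECT * FROM events WHERE user_id = 42;

-- Hits no index: a leading-wildcard LIKE forces a full table scan, which on a
-- table with hundreds of millions of rows may simply never come back.
SELECT * FROM events WHERE payload LIKE '%timeout%';
```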
But all that being said, things did actually work. We were able to use MySQL with these very large tables, do calculations, and get answers that were correct.