Note that there are some explanatory texts on larger screens.

plurals
  1. POMySQL: Indexing Table With 100+ Million Rows
    text
    copied!<p>I've found myself in a bit of a predicament. I have a table used for page hit tracking with nearly 105 million rows.(!) It looks like this:</p> <pre><code>CREATE TABLE `media_hits` ( `id` int(10) unsigned NOT NULL auto_increment, `media_code` char(7) NOT NULL, `day` date NOT NULL, `hits` int(10) unsigned NOT NULL default '0', PRIMARY KEY (`id`), UNIQUE KEY `media_code` (`media_code`,`day`) ) ENGINE=InnoDB; </code></pre> <p>As you can imagine running any kind of query on this table takes a long time. A typical query would be the following:</p> <pre><code>SELECT DISTINCT(`media_code`), COUNT(*) AS c FROM `media_hits` WHERE `day` &gt;= DATE_SUB(NOW(), INTERVAL 1 DAY) GROUP BY(`media_code`) ORDER BY c DESC LIMIT 200; </code></pre> <p>This query takes forever. And EXPLAIN on the query gives me this:</p> <pre><code> id: 1 select_type: SIMPLE table: media_hits type: index possible_keys: NULL key: media_code key_len: 10 ref: NULL rows: 104773158 Extra: Using where; Using index; Using temporary; Using filesort </code></pre> <p>That's just plain awful. So my question is: What can I do about this? Trying to add proper indexes now is impossible. The ALTER TABLE query would probably take over a week to run. I tried deleting rows older than 6 months, but 24 hours later that query was still running.</p> <p>I need to fix this some how. The only thing that crosses my mind is creating a new table with proper indexes, and start recording hits in that table. In the background I could have a script slowly inserting records from the old media_hits table. Can anyone offer suggestions on how to index this table, and possibly some hints on which columns I should index?</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload