Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    text
    copied!<p><strong>Summary</strong></p> <p>The problem is that <code>field</code> is not a good candidate for indexing, due to the nature of <a href="http://en.wikipedia.org/wiki/Btree" rel="noreferrer">b-trees</a>.</p> <p><strong>Explanation</strong></p> <p>Let's suppose you have a table that has the results of 500,000 coin tosses, where the toss is either <code>1</code> (heads) or <code>0</code> (tails):</p> <pre><code>CREATE TABLE toss ( id int NOT NULL AUTO_INCREMENT, result int NOT NULL DEFAULT '0', PRIMARY KEY ( id ) ) select result, count(*) from toss group by result order by result; +--------+----------+ | result | count(*) | +--------+----------+ | 0 | 250290 | | 1 | 249710 | +--------+----------+ 2 rows in set (0.40 sec) </code></pre> <p>If you want to select one toss (at random) where the toss was tails, then you need to search through your table, picking a random starting place.</p> <pre><code>select * from toss where result != 1 limit 123456, 1; +--------+--------+ | id | result | +--------+--------+ | 246700 | 0 | +--------+--------+ 1 row in set (0.06 sec) explain select * from toss where result != 1 limit 123456, 1; +----+-------------+-------+------+---------------+------+---------+------+--------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+---------------+------+---------+------+--------+-------------+ | 1 | SIMPLE | toss | ALL | NULL | NULL | NULL | NULL | 500000 | Using where | +----+-------------+-------+------+---------------+------+---------+------+--------+-------------+ </code></pre> <p>You see that you're basically searching sequentially through all of the rows to find a match.</p> <p>If you create an index on the <code>toss</code> field, then your index will contain two values, each with roughly 250,000 entries.</p> <pre><code>create index foo on toss ( result ); Query OK, 500000 rows affected (2.48 sec) Records: 500000 Duplicates: 0 Warnings: 0 select * from toss where result != 1 limit 123456, 1; +--------+--------+ | id | result | +--------+--------+ | 246700 | 0 | +--------+--------+ 1 row in set (0.25 sec) explain select * from toss where result != 1 limit 123456, 1; +----+-------------+-------+-------+---------------+------+---------+------+--------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+------+---------+------+--------+-------------+ | 1 | SIMPLE | toss | range | foo | foo | 4 | NULL | 154565 | Using where | +----+-------------+-------+-------+---------------+------+---------+------+--------+-------------+ </code></pre> <p>Now you're searching fewer records, but the time to search increased from 0.06 to 0.25 seconds. Why? Because sequentially scanning an index is actually less efficient than sequentially scanning a table, for indexes with a large number of rows for a given key.</p> <p>Let's look at the indexes on this table:</p> <pre><code>show index from toss; +-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | +-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ | toss | 0 | PRIMARY | 1 | id | A | 500000 | NULL | NULL | | BTREE | | | toss | 1 | foo | 1 | result | A | 2 | NULL | NULL | | BTREE | | +-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ </code></pre> <p>The PRIMARY index is a good index: there are 500,000 rows, and there are 500,000 values. Arranged in a BTREE, you can quickly identify a single row based on the id.</p> <p>The foo index is a bad index: there are 500,000 rows, but only 2 possible values. This is pretty much the worst possible case for a BTREE -- all of the overhead of searching the index, and still having to search through the results.</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload