Note that there are some explanatory texts on larger screens.

plurals
  1. POSpeeding up row counting in MySQL
    text
    copied!<p>Suppose, for illustrative purposes, you are running a library using a simple MySQL "books" table with three columns: </p> <p>(id, title, status)</p> <ul> <li><strong>id</strong> is the primary key</li> <li><strong>title</strong> is the title of the book</li> <li><strong>status</strong> could be an enum describing the book's current state (e.g. AVAILABLE, CHECKEDOUT, PROCESSING, MISSING)</li> </ul> <p>A simple query to report how many books fall into each state is:</p> <pre><code>SELECT status, COUNT(*) FROM books GROUP BY status </code></pre> <p>or to specifically find how many books are available:</p> <pre><code>SELECT COUNT(*) FROM books WHERE status = "AVAILABLE" </code></pre> <p>However, once the table grows to millions of rows, these queries take several seconds to complete. Adding an index to the "status" column doesn't appear to make a difference in my experience.</p> <p>Aside from periodically caching the results or explicitly updating summary info in a separate table each time a book changes state (via triggers or some other mechanism), are there any techniques for speeding up these kinds of queries? It seems that the COUNT queries end up looking at every row, and (without knowing more details) I'm a bit surprised that this information can't somehow be determined from the index.</p> <p><strong>UPDATE</strong></p> <p>Using the sample table (with an indexed "status" column) with 2 million rows, I benchmarked the GROUP BY query. Using the InnoDB storage engine, the query takes 3.0 - 3.2 seconds on my machine. Using MyISAM, the query takes 0.9 - 1.1 seconds. There was no significant difference between count(*), count(status), or count(1) in either case.</p> <p>MyISAM is admittedly a bit faster, but I was curious to see if there was a way to make an equivalent query run <em>much</em> faster (e.g. 10-50 ms -- fast enough to be called on every webpage request for a low-traffic site) without the mental overhead of caching and triggers. It sounds like the answer is "there's no way to run the direct query quickly" which is what I expected - I just wanted to make sure I wasn't missing an easy alternative.</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload