Optimizing MySQL query using GROUP BY on time functions
<p>I have the following query:</p>
<pre><code>SELECT location, step, COUNT(*), AVG(foo), YEAR(start), MONTH(start), DAY(start)
FROM table
WHERE jobid = 'xxx' AND start BETWEEN '2010-01-01' AND '2010-01-08'
GROUP BY location, step, YEAR(start), MONTH(start), DAY(start)
</code></pre>
<p>Originally I had indexes on individual columns, such as <em>jobid</em> and <em>start</em>, but quickly realized that MySQL only really honors one index per table in a select. As such, it would use the <em>jobid</em> index and then do a pretty large scan to filter by the <em>start</em> range.</p>
<p>Adding an index on (<em>jobid</em>, <em>start</em>) helped quite a bit, but the GROUP BY is still causing performance issues. I've read the <a href="http://dev.mysql.com/doc/refman/5.0/en/group-by-optimization.html" rel="nofollow">docs on GROUP BY optimizations</a> and understand that in order to benefit from these optimizations I need an index that contains (<em>location</em>, <em>step</em>, <em>start</em>), but I still have two open questions:</p>
<ol>
<li><p>Will the GROUP BY optimizations even work with the time functions (YEAR, MONTH, DAY, etc.)? Or am I going to have to store these values as separate columns? The reason I like using the functions is that I can control the time zone on a per-connection basis and get back results tailored to the end user's time zone. If I have to pre-store the year, month, and day, I'll do it in UTC and then all my users will just get reports in UTC.</p></li>
<li><p>Even if I can solve issue #1, can I even do this? The index (<em>jobid</em>, <em>start</em>) helped with the WHERE clause, but the GROUP BY needs a different index to be optimized: (<em>location</em>, <em>step</em>, <em>start</em>) or, depending on the answer to #1, (<em>location</em>, <em>step</em>, <em>year</em>, <em>month</em>, <em>day</em>). The problem is that those two indexes don't share a common left-hand set of columns, so I don't believe my WHERE and GROUP BY can be made compatible such that the same index gets used. So my question is: am I just hosed here?</p></li>
</ol>
<p>Any other thoughts on how to achieve this would be helpful. And, just to preempt a few questions/comments that might come up:</p>
<ol>
<li>Yes, this is a time-series data set.</li>
<li>Yes, it would benefit from something like <a href="http://www.mrtg.org/rrdtool/" rel="nofollow">RRDtool</a>, but doing so would cause me to lose timezone-specific results.</li>
<li>Yes, pre-calculating rollups would probably be a good idea, but I don't need <em>awesome</em> performance, so I'm OK with <em>good</em> performance if it lets me customize the results for each user's timezone.</li>
</ol>
<p>With the above said, if anyone has any design suggestions on how to do something like rollups or round-robin databases and still get timezone-specific results, I'm all ears!</p>
<hr>
<p><em>Update</em>: as requested, here is some more info:</p>
<p>show indexes from output:</p>
<pre>
step 0 PRIMARY  1 step_id  A 16 NULL NULL     BTREE
step 1 start    1 start    A 16 NULL NULL     BTREE
step 1 step     1 step     A 2  NULL NULL     BTREE
step 1 foo      1 foo      A 16 NULL NULL YES BTREE
step 1 location 1 location A 2  NULL NULL YES BTREE
step 1 jobid    1 jobid    A 2  NULL NULL YES BTREE
</pre>
<p>show create table output:</p>
<pre>
CREATE TABLE `step` (
  `start` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
  `step` smallint(2) unsigned NOT NULL,
  `step_id` int(8) unsigned NOT NULL AUTO_INCREMENT,
  `location` varchar(12) DEFAULT NULL,
  `jobid` varchar(37) DEFAULT NULL,
  PRIMARY KEY (`step_id`),
  KEY `start_time` (`start`),
  KEY `step` (`step`),
  KEY `location` (`location`),
  KEY `job_id` (`jobid`)
) ENGINE=InnoDB AUTO_INCREMENT=240 DEFAULT CHARSET=utf8
</pre>
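<p>For reference, the setup described above can be sketched roughly as follows. This is only an illustration under assumptions: the index name <code>jobid_start</code> and the <code>'-08:00'</code> offset are placeholders, the table is taken to be <code>step</code>, and the <code>foo</code> column is assumed to exist as the index listing suggests even though it is missing from the CREATE TABLE output. The EXPLAIN at the end is simply how one could check which index is chosen and whether the GROUP BY falls back to a temporary table or filesort.</p>

```sql
-- Sketch only: `jobid_start` and the '-08:00' offset are illustrative
-- assumptions, not names or values from the original schema.

-- Composite index: equality on jobid first, then the range on start,
-- which is the index that helped the WHERE clause.
ALTER TABLE step ADD INDEX jobid_start (jobid, start);

-- TIMESTAMP values are stored in UTC and converted to the session time
-- zone on retrieval, so YEAR()/MONTH()/DAY() follow this per-connection setting.
SET time_zone = '-08:00';

-- Inspect the plan: the `key` column shows the chosen index, and `Extra`
-- shows whether the grouping needs "Using temporary; Using filesort".
EXPLAIN
SELECT location, step, COUNT(*), AVG(foo),
       YEAR(start), MONTH(start), DAY(start)
FROM step
WHERE jobid = 'xxx'
  AND start BETWEEN '2010-01-01' AND '2010-01-08'
GROUP BY location, step, YEAR(start), MONTH(start), DAY(start);
```

<p>Changing the <code>SET time_zone</code> value between connections is what moves rows near midnight from one day bucket to another, which is why pre-stored UTC year, month, and day columns would not give the same per-user results.</p>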