Note that there are some explanatory texts on larger screens.

plurals
  1. POMySQL count(*) , Group BY and INNER JOIN
    text
    copied!<p>I have a really bad time with a query on MySQL 5.1. I simplified the 2 tables I make a JOIN on : </p> <pre><code>CREATE TABLE `jobs` ( `id` INT NOT NULL AUTO_INCREMENT PRIMARY KEY , `title` VARCHAR( 255 ) NOT NULL ) ENGINE = MYISAM ; </code></pre> <p>AND </p> <pre><code>CREATE TABLE `jobsCategories` ( `jobID` int(11) NOT NULL, `industryID` int(11) NOT NULL, KEY `jobID` (`jobID`), KEY `industryID` (`industryID`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 </code></pre> <p>The query is straight forward : </p> <pre><code>SELECT count(*) as nb,industryID FROM jobs J INNER JOIN jobsCategories C ON C.jobID=J.id GROUP BY industryID ORDER BY nb DESC; </code></pre> <p>I got around 150000 records into the jobs table, and 350000 records into the jobsCategories table, and I have 30 industries;</p> <p>The query takes approximatively 50 seconds to execute !!!</p> <p>Do you have any idea why it takes so long? How could I optimize the structure of this database? Profilling the query show me that 99% of the execution time is spend on copying on tmp tables.</p> <pre><code>EXPLAIN &lt;query&gt; gives me : *************************** 1. row *************************** id: 1 select_type: SIMPLE table: J type: index possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: NULL rows: 178950 Extra: Using index; Using temporary; Using filesort *************************** 2. row *************************** id: 1 select_type: SIMPLE table: C type: ref possible_keys: jobID key: jobID key_len: 8 ref: J.id rows: 1 Extra: Using where 2 rows in set (0.00 sec) </code></pre> <p>About the memory : </p> <pre><code>free -m : total used free shared buffers cached Mem: 2011 1516 494 0 8 1075 -/+ buffers/cache: 433 1578 Swap: 5898 126 5772 </code></pre> <p>With the FORCE INDEX suggested below</p> <pre><code>select count(*) as nb, industryID from jobs J inner join jobsCategories C force index (industryID) on (C.jobID = J.id ) group by industryID order by nb DESC; SHOW PROFILE; </code></pre> <p>gives me : </p> <pre><code>+----------------------+----------+ | Status | Duration | +----------------------+----------+ | starting | 0.000095 | | Opening tables | 0.000014 | | System lock | 0.000008 | | Table lock | 0.000007 | | init | 0.000032 | | optimizing | 0.000011 | | statistics | 0.000032 | | preparing | 0.000016 | | Creating tmp table | 0.000031 | | executing | 0.000003 | | Copying to tmp table | 3.301305 | | Sorting result | 0.000028 | | Sending data | 0.000024 | | end | 0.000003 | | removing tmp table | 0.000009 | | end | 0.000004 | | query end | 0.000003 | | freeing items | 0.000029 | | logging slow query | 0.000003 | | cleaning up | 0.000003 | +----------------------+----------+ </code></pre> <p>I guess my RAM (2Gb) is not large enough. How can I be certain this is the case?</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload