Note that there are some explanatory texts on larger screens.

plurals
  1. POPlease help me with table/nested JOIN optimization: Way too slow!
    text
    copied! <p>I have about 250 tables in my database, each with exactly 439340 rows. </p> <pre><code>mysql&gt; SHOW CREATE TABLE data.b50d1 ; +-------+-------------------------------------------------------------------------------------------- CREATE TABLE `b50d1` ( `pTime` int(10) unsigned NOT NULL, `Slope` double NOT NULL, `STD` double NOT NULL, PRIMARY KEY (`pTime`), KEY `Slope` (`Slope`) USING BTREE ) ENGINE=MyISAM DEFAULT CHARSET=latin1 MIN_ROWS=43940 MAX_ROWS=43940 PACK_KEYS=1 ROW_FORMAT=FIXED | +-------+-------------------------------------------------------------------------------------------- </code></pre> <p>As you can see, there are three columns in each table:</p> <ul> <li><strong>pTime</strong>: POSIX timestamp. This column (and all of its values) is exactly the same in each table. This is my <code>PRIMARY KEY</code></li> <li><strong>Slope</strong></li> <li><strong>STD</strong></li> </ul> <p>The columns Slope and STD have 'signed double' values that differ from row-to-row and from table-to-table.</p> <p>Here is a small sample from one of the tables:</p> <pre><code>mysql&gt; SELECT * FROM data.b50d1 limit 10; +------------+------------+-------------+ | pTime | Slope | STD | +------------+------------+-------------+ | 1104537600 | 6.38733032 | -1.13387667 | | 1104537900 | 5.58733032 | -0.93810617 | | 1104538200 | 5.30135747 | -0.51912757 | | 1104538500 | 5.4678733 | -0.54460575 | | 1104538800 | 5.58190045 | -0.46369055 | | 1104539100 | 5.50226244 | -0.46712018 | | 1104714000 | 5.31221719 | -0.25210485 | | 1104714300 | 4.72941176 | 0.00321249 | | 1104714600 | 5.19638009 | 0.64116376 | | 1104714900 | 5.12941176 | 0.39599099 | +------------+------------+-------------+ </code></pre> <p>Using these tables I run a stored procedure. This procedure consists of the following steps:</p> <p>STEP 1) <code>CREATE TEMPORARY TABLE</code> MainList... </p> <p>STEP 2) <code>INSERT</code> the results of a <code>SELECT</code> statement into the table. The resulting dataset is a filtered composite of the original tables. </p> <p>STEP 3) <code>SELECT</code> statement with nested <code>JOIN</code>s iterates through each MainList.STD value of the <code>TEMPORARY</code> table (MainList), and returns the first row <em>from one of the original tables</em> that matches certain specified conditions (example below). </p> <p>STEP 4) <code>JOIN</code> the results to MainList and output them to the user.</p> <p>The following is the procedure itself:</p> <pre><code>DELIMITER $$ CREATE DEFINER=`root`@`localhost` PROCEDURE `GetTimeList`(t1 varchar(7),t2 varchar(7),t3 varchar(7),inp1 float,inp2 float,inp3 float,inp4 float,inp5 float,inp6 float,inp7 float,inp8 float,inp9 float,inp10 float) READS SQL DATA BEGIN DROP TABLE IF EXISTS MainList; CREATE TEMPORARY TABLE MainList( `pTime` int unsigned NOT NULL, `STD` double NOT NULL, PRIMARY KEY (`pTime`), KEY (`STD`) USING BTREE ) ENGINE = MEMORY; SET @s = CONCAT('INSERT INTO MainList(pTime,STD) SELECT DISTINCT t1.pTime, t1.STD FROM ',t1,' AS t1 JOIN (',t2,' as t2 ,',t3,' as t3 )', ' ON (( t1.Slope &gt;= ', inp1, ' AND t1.Slope &lt;= ', inp2, ' AND t1.STD &gt;= ', inp3, ' AND t1.STD &lt;= ', inp4, ' AND t2.Slope &gt;= ', inp5, ' AND t2.Slope &lt;= ', inp6, ' AND t3.Slope &gt;= ', inp7, ' AND t3.Slope &lt;= ', inp8, ' ) OR ( t1.Slope &lt;= ', 0-inp1, ' AND t1.Slope &gt;= ', 0-inp2, ' AND t1.STD &lt;= ', 0-inp3, ' AND t1.STD &gt;= ', 0-inp4, ' AND t2.Slope &lt;= ', 0-inp5, ' AND t2.Slope &gt;= ', 0-inp6, ' AND t3.Slope &lt;= ', 0-inp7, ' AND t3.Slope &gt;= ', 0-inp8, ' ) ) AND ((t1.Slope &lt; 0 XOR t1.STD &lt; 0) AND t1.pTime = t2.pTime AND t2.pTime = t3.pTime AND t1.pTime &gt;= ', inp9, ' AND t1.pTime &lt;= ', inp10,' ) ORDER BY t1.pTime' ); PREPARE stmt FROM @s; EXECUTE stmt; SET @q= CONCAT('SELECT m.pTime as OpenTime, CASE WHEN m.STD &lt; 0 THEN 1 ELSE -1 END As Type, mu.pTime As CloseTime from MainList m LEFT JOIN ',t1,' mu ON mu.pTime = ( SELECT DISTINCT md.pTime FROM ',t1,' md WHERE md.pTime&gt;m.pTime',' AND md.pTime &lt;= ', inp10, ' AND SIGN (md.STD)!= SIGN (m.STD) AND ABS(md.STD) &gt;= ABS(m.STD) ORDER BY md.pTime LIMIT 1 )'); PREPARE stmt FROM @q; EXECUTE stmt; DEALLOCATE PREPARE stmt; DROP TABLE MainList; END </code></pre> <p>For ease of testing I have broken up the above procedure into two individual queries. Here are those queries accompanied by '<strong>EXPLAIN EXTENDED</strong>' statements (the temp table was generated beforehand) :</p> <h2><strong>FIRST QUERY</strong></h2> <hr> <pre><code>INSERT INTO MainList(pTime,STD) SELECT t1.pTime, t1.STD FROM b50d1 AS t1 JOIN(b75d1 AS t2, b100d1 AS t3)ON( ( t1.Slope &gt;= 2.3169 AND t1.Slope &lt;= 7.0031 AND t1.STD &gt;= - 2.068 AND t1.STD &lt;= - 0.972 AND t2.Slope &gt;= 0.3179 AND t2.Slope &lt;= 5.7221 AND t3.Slope &gt;= 2.6466 AND t3.Slope &lt;= 35.7534 ) OR( t1.Slope &lt;= - 2.3169 AND t1.Slope &gt;= - 7.0031 AND t1.STD &lt;= 2.068 AND t1.STD &gt;= 0.972 AND t2.Slope &lt;= - 0.3179 AND t2.Slope &gt;= - 5.7221 AND t3.Slope &lt;= - 2.6466 AND t3.Slope &gt;= - 35.7534 ) ) AND( (t1.Slope &lt; 0 XOR t1.STD &lt; 0) AND t1.pTime = t2.pTime AND t2.pTime = t3.pTime AND t1.pTime &gt;= 1104710000 AND t1.pTime &lt;= 1367700000 ) ORDER BY t1.pTime; </code></pre> <p><strong>EXPLAIN EXTENDED:</strong> </p> <pre><code>+----+-------------+-------+--------+---------------+---------+---------+---------------+--------+----------+-----------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+--------+---------------+---------+---------+---------------+--------+----------+-----------------------------+ | 1 | SIMPLE | t1 | ALL | PRIMARY,Slope | NULL | NULL | NULL | 439340 | 25.79 | Using where; Using filesort | | 1 | SIMPLE | t2 | eq_ref | PRIMARY,Slope | PRIMARY | 4 | data.t1.pTime | 1 | 100.00 | Using where | | 1 | SIMPLE | t3 | eq_ref | PRIMARY | PRIMARY | 4 | data.t1.pTime | 1 | 100.00 | Using where | +----+-------------+-------+--------+---------------+---------+---------+---------------+--------+----------+-----------------------------+ </code></pre> <h2><strong>SECOND QUERY</strong></h2> <hr> <pre><code>SELECT m.pTime AS OpenTime, CASE WHEN m.STD &lt; 0 THEN 1 ELSE - 1 END AS Type, mu.pTime AS CloseTime; FROM MainList m LEFT JOIN b50d1 mu ON mu.pTime =( SELECT DISTINCT md.pTime FROM b50d1 md WHERE md.pTime &gt; m.pTime AND md.pTime &lt;= 1367700000 AND SIGN(md.STD)!= SIGN(m.STD) AND ABS(md.STD)&gt;= ABS(m.STD) ORDER BY md.pTime LIMIT 1 ); </code></pre> <p><strong>EXPLAIN EXTENDED:</strong> </p> <pre><code>+----+--------------------+-------+--------+---------------+---------+---------+------+--------+----------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+--------+---------------+---------+---------+------+--------+----------+-------------+ | 1 | PRIMARY | m | ALL | NULL | NULL | NULL | NULL | 16 | 100.00 | | | 1 | PRIMARY | mu | eq_ref | PRIMARY | PRIMARY | 4 | func | 1 | 100.00 | Using index | | 2 | DEPENDENT SUBQUERY | md | range | PRIMARY | PRIMARY | 4 | NULL | 439338 | 100.00 | Using where | +----+--------------------+-------+--------+---------------+---------+---------+------+--------+----------+-------------+ </code></pre> <p>The query works and returns the right results, but it is <em>orders of magnitude</em> slower than what I need. I recognize that the <code>type: ALL</code> statement present in both <code>EXPLAIN</code> statements indicates that my indices are likely suboptimal. </p> <p>I have only used MYSQL for the past week, and I am starting to feel like I am in over my head. I would really appreciate some help.</p> <p>I made a SQL file with <code>CREATE TABLE</code> and <code>INSERT</code> statements so that anyone kind enough to try to help me could create smaller versions of my tables in the"Test" database: <a href="http://www.furlender.com/slowtables.SQL" rel="nofollow">slowtables.SQL</a></p> <p>For completeness here is the my.ini settings file - perhaps it is a bottleneck?</p> <pre><code>[client] pipe socket=mysql [mysql] default-character-set=latin1 [mysqld] skip-networking enable-named-pipe socket=mysql basedir="C:/Program Files/MySQL/MySQL Server 5.5/" datadir="C:/ProgramData/MySQL/MySQL Server 5.5/Data/" character-set-server=latin1 default-storage-engine=MYISAM sql-mode="STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION" max_connections=100 query_cache_size=189M table_cache=256 tmp_table_size=192M key_buffer_size=594M read_buffer_size=64K read_rnd_buffer_size=256K sort_buffer_size=256K </code></pre>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload