Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    text
    copied!<p><em>I've included my test schema and a script to produce test data at the end of this post. I have used the <a href="http://dev.mysql.com/doc/refman/5.1/en/query-cache-in-select.html" rel="noreferrer"><code>SQL_NO_CACHE</code></a> option to prevent MySQL from caching query results - this is just for testing and should ultimately be removed.</em></p> <p>This is a similar idea to that proposed by Donnie, but I have tidied it up a little. If I have understood the joins correctly, there is no need to repeat all the joins in each select, as each is effectively independent from the others. The original <code>WHERE</code> clause stipulates that <code>gallery.published</code> must be true and then follows with a series of 4 conditions joined by <code>OR</code>. Each query can therefore be executed separately. Here are the four joins:</p> <pre><code>gallery &lt;--&gt; gallery_to_name &lt;--&gt; name gallery &lt;--&gt; gallery_to_tag &lt;--&gt; tag gallery &lt;--&gt; site gallery &lt;--&gt; site &lt;--&gt; site_to_tag &lt;--&gt; tag </code></pre> <p>Because <code>gallery</code> contains <code>site_id</code>, in this case, there's no need for the intermediate join via the <code>site</code> table. The last join can therefore be reduced to this:</p> <pre><code>gallery &lt;--&gt; site_to_tag &lt;--&gt; tag </code></pre> <p>Running each <code>SELECT</code> separately, and using <code>UNION</code> to combine the results, is very fast. The results here assume the table structures <em>and indexes</em> shown at the end of this post:</p> <pre><code>SELECT SQL_NO_CACHE COUNT(id) AS matches FROM ( (SELECT g.id FROM gallery AS g INNER JOIN site AS s ON s.id = g.site_id WHERE g.published = TRUE AND s.name LIKE '3GRD%') UNION (SELECT g.id FROM gallery AS g INNER JOIN gallery_to_name AS g2n ON g2n.gallery_id = g.id INNER JOIN name AS n ON n.id = g2n.name_id WHERE g.published = TRUE AND n.value LIKE '3GRD%') UNION (SELECT g.id FROM gallery AS g INNER JOIN gallery_to_tag AS g2t ON g2t.gallery_id = g.id INNER JOIN tag AS gt ON gt.id = g2t.tag_id WHERE g.published = TRUE AND gt.term = '3GRD') UNION (SELECT g.id FROM gallery AS g INNER JOIN site_to_tag AS s2t ON s2t.site_id = g.site_id INNER JOIN tag AS st ON st.id = s2t.tag_id WHERE g.published = TRUE AND st.term = '3GRD') ) AS totals; +---------+ | matches | +---------+ | 99 | +---------+ 1 row in set (0.00 sec) </code></pre> <p>The speed does vary depending on the search criteria. In the following example, a different search value is used for each table, and the LIKE operator has to do a little more work, as there are now more potential matches for each:</p> <pre><code>SELECT SQL_NO_CACHE COUNT(id) AS matches FROM ( (SELECT g.id FROM gallery AS g INNER JOIN site AS s ON s.id = g.site_id WHERE g.published = TRUE AND s.name LIKE '3H%') UNION (SELECT g.id FROM gallery AS g INNER JOIN gallery_to_name AS g2n ON g2n.gallery_id = g.id INNER JOIN name AS n ON n.id = g2n.name_id WHERE g.published = TRUE AND n.value LIKE '3G%') UNION (SELECT g.id FROM gallery AS g INNER JOIN gallery_to_tag AS g2t ON g2t.gallery_id = g.id INNER JOIN tag AS gt ON gt.id = g2t.tag_id WHERE g.published = TRUE AND gt.term = '3IDP') UNION (SELECT g.id FROM gallery AS g INNER JOIN site_to_tag AS s2t ON s2t.site_id = g.site_id INNER JOIN tag AS st ON st.id = s2t.tag_id WHERE g.published = TRUE AND st.term = '3OJX') ) AS totals; +---------+ | matches | +---------+ | 12505 | +---------+ 1 row in set (0.24 sec) </code></pre> <p>These results compare favourably with the a query which uses multiple joins:</p> <pre><code>SELECT SQL_NO_CACHE COUNT(DISTINCT g.id) AS matches FROM gallery AS g INNER JOIN gallery_to_name AS g2n ON g2n.gallery_id = g.id INNER JOIN name AS n ON n.id = g2n.name_id INNER JOIN gallery_to_tag AS g2t ON g2t.gallery_id = g.id INNER JOIN tag AS gt ON gt.id = g2t.tag_id INNER JOIN site AS s ON s.id = g.site_id INNER JOIN site_to_tag AS s2t ON s2t.site_id = s.id INNER JOIN tag AS st ON st.id = s2t.tag_id WHERE g.published = TRUE AND ( gt.term = '3GRD' OR st.term = '3GRD' OR n.value LIKE '3GRD%' OR s.name LIKE '3GRD%'); +---------+ | matches | +---------+ | 99 | +---------+ 1 row in set (2.62 sec) SELECT SQL_NO_CACHE COUNT(DISTINCT g.id) AS matches FROM gallery AS g INNER JOIN gallery_to_name AS g2n ON g2n.gallery_id = g.id INNER JOIN name AS n ON n.id = g2n.name_id INNER JOIN gallery_to_tag AS g2t ON g2t.gallery_id = g.id INNER JOIN tag AS gt ON gt.id = g2t.tag_id INNER JOIN site AS s ON s.id = g.site_id INNER JOIN site_to_tag AS s2t ON s2t.site_id = s.id INNER JOIN tag AS st ON st.id = s2t.tag_id WHERE g.published = TRUE AND ( gt.term = '3IDP' OR st.term = '3OJX' OR n.value LIKE '3G%' OR s.name LIKE '3H%'); +---------+ | matches | +---------+ | 12505 | +---------+ 1 row in set (3.17 sec) </code></pre> <p><strong>SCHEMA</strong><br> The indexes on id columns plus <code>site.name</code>, <code>name.value</code> and <code>tag.term</code> are important:</p> <pre><code>DROP SCHEMA IF EXISTS `egervari`; CREATE SCHEMA IF NOT EXISTS `egervari`; USE `egervari`; -- ----------------------------------------------------- -- Table `site` -- ----------------------------------------------------- DROP TABLE IF EXISTS `site` ; CREATE TABLE IF NOT EXISTS `site` ( `id` INT UNSIGNED NOT NULL AUTO_INCREMENT , `name` VARCHAR(255) NOT NULL , INDEX `name` (`name` ASC) , PRIMARY KEY (`id`) ) ENGINE = InnoDB; -- ----------------------------------------------------- -- Table `gallery` -- ----------------------------------------------------- DROP TABLE IF EXISTS `gallery` ; CREATE TABLE IF NOT EXISTS `gallery` ( `id` INT UNSIGNED NOT NULL AUTO_INCREMENT , `site_id` INT UNSIGNED NOT NULL , `published` TINYINT(1) NOT NULL DEFAULT 0 , PRIMARY KEY (`id`) , INDEX `fk_gallery_site` (`site_id` ASC) , CONSTRAINT `fk_gallery_site` FOREIGN KEY (`site_id` ) REFERENCES `site` (`id` ) ON DELETE CASCADE ON UPDATE CASCADE) ENGINE = InnoDB; -- ----------------------------------------------------- -- Table `name` -- ----------------------------------------------------- DROP TABLE IF EXISTS `name` ; CREATE TABLE IF NOT EXISTS `name` ( `id` INT UNSIGNED NOT NULL AUTO_INCREMENT , `value` VARCHAR(255) NOT NULL , INDEX `value` (`value` ASC) , PRIMARY KEY (`id`) ) ENGINE = InnoDB; -- ----------------------------------------------------- -- Table `tag` -- ----------------------------------------------------- DROP TABLE IF EXISTS `tag` ; CREATE TABLE IF NOT EXISTS `tag` ( `id` INT UNSIGNED NOT NULL AUTO_INCREMENT , `term` VARCHAR(255) NOT NULL , INDEX `term` (`term` ASC) , PRIMARY KEY (`id`) ) ENGINE = InnoDB; -- ----------------------------------------------------- -- Table `gallery_to_name` -- ----------------------------------------------------- DROP TABLE IF EXISTS `gallery_to_name` ; CREATE TABLE IF NOT EXISTS `gallery_to_name` ( `gallery_id` INT UNSIGNED NOT NULL , `name_id` INT UNSIGNED NOT NULL , PRIMARY KEY (`gallery_id`, `name_id`) , INDEX `fk_gallery_to_name_gallery` (`gallery_id` ASC) , INDEX `fk_gallery_to_name_name` (`name_id` ASC) , CONSTRAINT `fk_gallery_to_name_gallery` FOREIGN KEY (`gallery_id` ) REFERENCES `gallery` (`id` ) ON DELETE CASCADE ON UPDATE CASCADE, CONSTRAINT `fk_gallery_to_name_name` FOREIGN KEY (`name_id` ) REFERENCES `name` (`id` ) ON DELETE CASCADE ON UPDATE CASCADE) ENGINE = InnoDB; -- ----------------------------------------------------- -- Table `gallery_to_tag` -- ----------------------------------------------------- DROP TABLE IF EXISTS `gallery_to_tag` ; CREATE TABLE IF NOT EXISTS `gallery_to_tag` ( `gallery_id` INT UNSIGNED NOT NULL , `tag_id` INT UNSIGNED NOT NULL , PRIMARY KEY (`gallery_id`, `tag_id`) , INDEX `fk_gallery_to_tag_gallery` (`gallery_id` ASC) , INDEX `fk_gallery_to_tag_tag` (`tag_id` ASC) , CONSTRAINT `fk_gallery_to_tag_gallery` FOREIGN KEY (`gallery_id` ) REFERENCES `gallery` (`id` ) ON DELETE CASCADE ON UPDATE CASCADE, CONSTRAINT `fk_gallery_to_tag_tag` FOREIGN KEY (`tag_id` ) REFERENCES `tag` (`id` ) ON DELETE CASCADE ON UPDATE CASCADE) ENGINE = InnoDB; -- ----------------------------------------------------- -- Table `site_to_tag` -- ----------------------------------------------------- DROP TABLE IF EXISTS `site_to_tag` ; CREATE TABLE IF NOT EXISTS `site_to_tag` ( `site_id` INT UNSIGNED NOT NULL , `tag_id` INT UNSIGNED NOT NULL , PRIMARY KEY (`site_id`, `tag_id`) , INDEX `fk_site_to_tag_site` (`site_id` ASC) , INDEX `fk_site_to_tag_tag` (`tag_id` ASC) , CONSTRAINT `fk_site_to_tag_site` FOREIGN KEY (`site_id` ) REFERENCES `site` (`id` ) ON DELETE CASCADE ON UPDATE CASCADE, CONSTRAINT `fk_site_to_tag_tag` FOREIGN KEY (`tag_id` ) REFERENCES `tag` (`id` ) ON DELETE CASCADE ON UPDATE CASCADE) ENGINE = InnoDB; </code></pre> <p><strong>TEST DATA</strong><br> This populates <code>site</code> with 900 rows, <code>tag</code> with 3560 rows, <code>name</code> with 800 rows and <code>gallery</code> with 40,000 rows, and inserts entries into the link tables:</p> <pre><code>DELIMITER // DROP PROCEDURE IF EXISTS populate// CREATE PROCEDURE populate() BEGIN DECLARE i INT DEFAULT 0; WHILE i &lt; 900 DO INSERT INTO site (name) VALUES (CONV(i + 1 * 10000, 20, 36)); SET i = i + 1; END WHILE; SET i = 0; WHILE i &lt; 3560 DO INSERT INTO tag (term) VALUES (CONV(i + 1 * 10000, 20, 36)); INSERT INTO site_to_tag (site_id, tag_id) VALUES ( (i MOD 900) + 1, i + 1 ); SET i = i + 1; END WHILE; SET i = 0; WHILE i &lt; 800 DO INSERT INTO name (value) VALUES (CONV(i + 1 * 10000, 20, 36)); SET i = i + 1; END WHILE; SET i = 0; WHILE i &lt; 40000 DO INSERT INTO gallery (site_id, published) VALUES ( (i MOD 900) + 1, i MOD 2 ); INSERT INTO gallery_to_name (gallery_id, name_id) VALUES ( i + 1, (i MOD 800) + 1 ); INSERT INTO gallery_to_tag (gallery_id, tag_id) VALUES ( i + 1, (i MOD 3560) + 1 ); SET i = i + 1; END WHILE; END; // DELIMITER ; CALL populate(); </code></pre>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload