Note that there are some explanatory texts on larger screens.

plurals
  1. POMySQL Partitioning Join Table (Dual PK)
    text
    copied!<p>I have two tables that are joined together via a third table. The first table (images) contains over a million records. The second table (tags) contains around 100K records. The third table is a dual primary key table consisting of image_id &amp; tag_id, and contains around 4 million records.</p> <p>I've been trying to determine a way to speed up searching for images by tag. After reading a dozen records or so, it sounds like my best bet is to partition my join table. Unfortunately, I've never done that before, and am terrified of potentially junking the largest table in the system.</p> <p>Anyway, since the table is a dual primary key it's indexed, but it's still taking quite a bit of time to query. I'm not for sure if it's strictly the speed of that table, or if it's the speed of the images table I'm joining to. Either way, I'm hitting 2-3s query times as of right now and am worried that I'm going to bog down even further with more images, more tags, and more users.</p> <p>All lookups are done using INT(11) primary keys. I'm never text searching, nor looking at varchar fields. One query does look at a <code>status</code> CHAR(1) column. Granted, status isn't an index, but should it be? I originally thought about ENUM with A,I,X,D as options, but I've expanded up to 6 status types since then.</p> <p>Hardware wise, I'm on a core i-7 extreme with 24GB DDR3, and MySQL exists on a 120GB SSD (only 3gbps).</p> <p>What can I do to speed things up? Is the bottleneck disk io, query efficiency, poor memory utalization?</p> <p>As a follow up note, I've been watching the server activity via MySQL Workbench and I've found 2 interesting things:</p> <ol> <li>Key efficiency is at 99.99%, while query cache hit rate is at 0.28%.</li> <li>I often see the cron job that re-calculates counts have a state of "Copying to tmp table". </li> </ol> <p>The query is:</p> <pre class="lang-sql prettyprint-override"><code>SELECT COUNT(t.image_id) FROM Image_Tags t INNER JOIN (Images i) ON (i.image_id = t.image_id) WHERE t.tag_id = :tid AND i.status = 'A' </code></pre> <p>I'm starting to think I need to throw an index on that status column.</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload