<p>Starting with the more pressing issues, before discussing options.</p>

<hr>

<p>The first immediate problem is:</p>

<pre><code>SELECT DISTINCT …
</code></pre>

<p>A <code>select distinct</code> is slow. Very, very slow: it basically compares each field of each row returned by your set. There naturally is room for optimization when there's an <code>ID</code> in there that is guaranteed to be unique per row, but your own query doesn't look like it offers any such possibility: at best a tuple from <code>matches_of_comments</code> and <code>new_iddaa</code>.</p>

<p>To work around this, break the query in two or more parts, and only fetch what is <em>actually</em> needed for what you're doing. This seems to be ordering <code>matches_of_comments</code> by their latest comment date, and then fetching extra cosmetic data from <code>users</code> and <code>new_iddaa</code>.</p>

<p>The next one is the biggest problem imho:</p>

<pre><code>INNER JOIN (
    SELECT match_static_id,
           MAX( TIMESTAMP ) maxtimestamp,
           COUNT( match_static_id ) AS comments_no
    FROM comments
    GROUP BY match_static_id
) AS f2 ON f1.match_static_id = f2.match_static_id
       AND f1.timestamp = f2.maxtimestamp
</code></pre>

<p>You're joining an aggregate with a table on a <code>(match_static_id, timestamp)</code> tuple that has no index on it, and fetching a huge set at that. You've a guaranteed merge join down the road — not what you want.</p>

<p>The last eye-popping issue is:</p>

<pre><code>ORDER BY f2.maxtimestamp DESC
</code></pre>

<p>First off, you've no limit there. This means you're going to build, sort and return an enormous set. Surely you're paginating this data, so do so in the query by adding a limit clause.</p>

<p>Once you do add a limit, you need to consider what is adding extra rows, and how they should be ordered. Based on your schema, I imagine <code>new_iddaa</code> does.
Are you paginating things in such a way that the latter information needs to be part of that query and the number of rows it returns? I imagine not, since you're not evidently interested in how these rows are sorted.</p>

<p>After scanning your schema, this additional one pops out:</p>

<pre><code>`match_id` varchar(255)
</code></pre>

<p>The rows that reference this are integers, right? So it should be an integer as well, to avoid the overhead of casting varchars to ints or vice-versa, and to allow the use of indexes in either case.</p>

<p>While not relevant to this particular query, the following fields also need attention and proper types:</p>

<pre><code>`tournament_id` varchar(255)
`match_time` varchar(255)
`match_date` varchar(255)
`static_id` varchar(255)
`fix_id` varchar(255)
`localteam_id` varchar(255)
`visitorteam_id` varchar(255)
</code></pre>

<hr>

<p>Onto improving the query…</p>

<p>As I read it, you're ordering <code>matches_of_comments</code> by latest comment. You also want the number of comments, so we start by doing that. Assuming you're paginating the first 10 of many, the query goes like this:</p>

<pre><code>SELECT match_static_id,
       MAX( TIMESTAMP ) maxtimestamp,
       COUNT( match_static_id ) AS comments_no
FROM comments
GROUP BY match_static_id
ORDER BY maxtimestamp DESC
LIMIT 10 OFFSET 0
</code></pre>

<p>That's all.</p>

<p>It gives you 10 IDs — more if you increase the limit. Loop through them in your app and build an <code>in (…)</code> clause that will allow you to fetch each individual bit of data from the other tables as needed; you can do this with one or several queries, it matters little.
The point is to avoid joining on that aggregate, so that indexes are available for the follow-up queries.</p>

<hr>

<p>You could improve things more dramatically yet, by removing the above-mentioned query entirely.</p>

<p>To do so, add three fields to <code>matches_of_comments</code>, namely <code>last_comment_timestamp</code>, <code>last_comment_user_id</code>, and <code>num_comments</code>. Maintain them using triggers, and add an index on <code>(flag, last_comment_timestamp)</code>. This will allow you to run the following efficient query instead:</p>

<pre><code>SELECT matches_of_comments.static_id,
       matches_of_comments.num_comments,
       matches_of_comments.last_comment_timestamp,
       matches_of_comments.last_comment_user_id,
       matches_of_comments.localteam_name,
       matches_of_comments.visitorteam_name,
       matches_of_comments.localteam_goals,
       matches_of_comments.visitorteam_goals,
       matches_of_comments.match_status
FROM matches_of_comments
WHERE matches_of_comments.flag = 1
ORDER BY matches_of_comments.last_comment_timestamp DESC
LIMIT 10 OFFSET 0
</code></pre>

<p>You then need to only select the needed data from <code>users</code> and <code>new_iddaa</code> — using separate queries with an <code>in (…)</code> clause as already discussed.</p>
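<p>To make the two-step approach concrete, here is a minimal sketch in Python using the standard-library <code>sqlite3</code> module. The tables are stripped-down stand-ins for the schema discussed above (only a few columns, invented sample data), and the composite index on <code>(match_static_id, timestamp)</code> is the one the original join was missing:</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE comments (
    id INTEGER PRIMARY KEY,
    match_static_id INTEGER NOT NULL,
    timestamp INTEGER NOT NULL
);
-- The composite index the original query lacked: it backs both the
-- GROUP BY / MAX() aggregate and per-match lookups.
CREATE INDEX idx_comments_match_ts ON comments (match_static_id, timestamp);

CREATE TABLE matches_of_comments (
    static_id INTEGER PRIMARY KEY,
    localteam_name TEXT,
    visitorteam_name TEXT
);
INSERT INTO comments (match_static_id, timestamp) VALUES
    (1, 100), (1, 250), (2, 300), (3, 50);
INSERT INTO matches_of_comments VALUES
    (1, 'Alpha', 'Beta'), (2, 'Gamma', 'Delta'), (3, 'Epsilon', 'Zeta');
""")

# Step 1: the cheap aggregate -- ordered, limited, and join-free.
rows = conn.execute("""
    SELECT match_static_id,
           MAX(timestamp)         AS maxtimestamp,
           COUNT(match_static_id) AS comments_no
    FROM comments
    GROUP BY match_static_id
    ORDER BY maxtimestamp DESC
    LIMIT 10 OFFSET 0
""").fetchall()

ids = [r[0] for r in rows]

# Step 2: fetch the cosmetic data with an IN (...) clause built from
# placeholders -- never by concatenating the ids into the SQL string.
placeholders = ", ".join("?" for _ in ids)
details = conn.execute(
    f"SELECT static_id, localteam_name, visitorteam_name "
    f"FROM matches_of_comments WHERE static_id IN ({placeholders})",
    ids,
).fetchall()
print(rows)
print(details)
```

<p>Building the <code>in (…)</code> list from bound placeholders keeps the follow-up queries both index-friendly and safe from injection.</p>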
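<p>The trigger-maintained columns can be sketched like this, again with <code>sqlite3</code> so the example is self-contained (MySQL trigger syntax differs slightly — <code>DELIMITER</code> blocks around the body — but the <code>NEW.column</code> references and the idea are the same). A production version would also handle deletes and out-of-order timestamps; this sketch covers inserts only:</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE matches_of_comments (
    static_id INTEGER PRIMARY KEY,
    flag INTEGER NOT NULL DEFAULT 1,
    num_comments INTEGER NOT NULL DEFAULT 0,
    last_comment_timestamp INTEGER,
    last_comment_user_id INTEGER
);
CREATE TABLE comments (
    id INTEGER PRIMARY KEY,
    match_static_id INTEGER NOT NULL,
    user_id INTEGER NOT NULL,
    timestamp INTEGER NOT NULL
);

-- Keep the denormalized columns in sync on every new comment.
CREATE TRIGGER comments_after_insert AFTER INSERT ON comments
BEGIN
    UPDATE matches_of_comments
    SET num_comments           = num_comments + 1,
        last_comment_timestamp = NEW.timestamp,
        last_comment_user_id   = NEW.user_id
    WHERE static_id = NEW.match_static_id;
END;

-- Index backing WHERE flag = ... ORDER BY last_comment_timestamp DESC.
CREATE INDEX idx_flag_lct ON matches_of_comments (flag, last_comment_timestamp);
""")

conn.execute("INSERT INTO matches_of_comments (static_id) VALUES (1), (2)")
conn.executemany(
    "INSERT INTO comments (match_static_id, user_id, timestamp) VALUES (?, ?, ?)",
    [(1, 10, 100), (1, 11, 200), (2, 12, 150)],
)

# The whole front page now comes from one indexed, join-free query.
page = conn.execute("""
    SELECT static_id, num_comments, last_comment_timestamp, last_comment_user_id
    FROM matches_of_comments
    WHERE flag = 1
    ORDER BY last_comment_timestamp DESC
    LIMIT 10 OFFSET 0
""").fetchall()
print(page)
```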