StackOverflow2013

plurals

POWhat is a fast way to preview a MySQL join?
primarykey
Id
15032061
data
AcceptedAnswerId
15033302
AnswerCount
1
ClosedDate
CommentCount
4
CommunityOwnedDate
CreationDate
2013-02-22T19:46:02.727
FavoriteCount
0
LastActivityDate
2013-02-22T21:16:44.530
LastEditDate
2013-02-22T20:18:49.777
LastEditorUserId
67960
OwnerUserId
67960
ParentId
0
PostTypeId
1
Score
1
ViewCount
142
LastEditorDisplayName
text
Body
<p>I'm working on a project involving joins between datasets, and we have a requirement to allow previews of arbitrary joins between arbitrary datasets. That is crazy, but that's why it's fun. This is user-facing, so given a join I want to show about 10 rows of results quickly.</p>
<p>I've been experimenting with different ways to sub-sample the tables so that I get at least a few result rows, while keeping the samples small enough that the join is fast and the sampling itself is not expensive.</p>
<p>Here are the methods I've found that pass the smell test. I would like to know a few things about them:</p>
<ol> <li>What types of joins or datasets would these fail at?</li> <li>How could I identify those datasets?</li> <li>If these methods are all bad at the same thing, how could they be improved?</li> <li>Is there a type of sampling I have not listed here that is better?</li> </ol>
<h3>Subselect with a limit</h3>
<p>Takes an arbitrary sample of one dataset to reduce the overall size (LIMIT without ORDER BY does not guarantee which rows are returned, so this is not a truly random sample).</p>
<pre><code>-- # is a placeholder for the sample size
SELECT table1.col1, sample2.col2
FROM table1
JOIN (SELECT col1, col2 FROM table2 LIMIT #) AS sample2
  ON table1.col1 = sample2.col1
LIMIT 10;
</code></pre>
<p>I like this because it's easy, and there is potential in the future to be smart about which table to sample from. The drawback is that the sampled portion may contain no rows where table1.col1 equals sample2.col1, in which case no results are returned.</p>
<h3>Find equal values of col1 and sample them</h3>
<p>A more complicated, multi-query approach. Here I would do a SELECT DISTINCT on the join column of each table, compare the results to find the common values, and then do a subselect restricted to those common values.</p>
<pre><code>SELECT DISTINCT col1 FROM table1;
SELECT DISTINCT col1 FROM table2;

-- Pseudocode, computed in the application:
-- commonVals = intersection of the two result sets above

SELECT table1.col1, sample2.col2
FROM table1
JOIN (SELECT col1, col2 FROM table2 WHERE col1 IN (commonVals) LIMIT #) AS sample2
  ON table1.col1 = sample2.col1
LIMIT 10;
</code></pre>
<p>This gets us a good sample of table2, but the SELECT DISTINCT queries may be more expensive than the join itself. I believe there may be a way to determine whether this method is faster if you knew something about how long the distinct calls would take, but at this point we don't have that much knowledge of the datasets.</p>
<h3>Slap a LIMIT on the join</h3>
<p>This is the easiest and the one I'm leaning towards.</p>
<pre><code>SELECT table1.col1, table2.col1
FROM table1
JOIN table2 ON table1.col1 = table2.col1
LIMIT #;
</code></pre>
<p>Assuming the join matches at least one row, this will always return data, and for a large set of cases it will do so quickly.</p>
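The multi-query approach described above (distinct join keys from each table, an application-side intersection, then a restricted subselect) could be sketched roughly as follows. This is only an illustration under assumptions: it uses Python's standard-library sqlite3 module as a stand-in for MySQL, and the function name preview_join, its parameters sample_size, preview_rows and max_keys, and the demo rows are all hypothetical rather than taken from the question.

```python
import sqlite3


def preview_join(conn, sample_size=100, preview_rows=10, max_keys=500):
    """Preview table1 JOIN table2 ON col1, sampling only keys present in both."""
    cur = conn.cursor()
    # Steps 1 and 2: distinct join-key values from each table.
    keys1 = {row[0] for row in cur.execute("SELECT DISTINCT col1 FROM table1")}
    keys2 = {row[0] for row in cur.execute("SELECT DISTINCT col1 FROM table2")}
    # Step 3: commonVals = intersection. NULL keys never match in a join,
    # so they are dropped. The list is capped (hypothetical max_keys) to
    # keep the IN (...) list small.
    common_vals = list((keys1 & keys2) - {None})[:max_keys]
    if not common_vals:
        return []  # no shared key values, so the join has nothing to preview
    placeholders = ",".join("?" * len(common_vals))
    query = f"""
        SELECT table1.col1, sample2.col2
        FROM table1
        JOIN (SELECT col1, col2 FROM table2
              WHERE col1 IN ({placeholders}) LIMIT ?) AS sample2
          ON table1.col1 = sample2.col1
        LIMIT ?
    """
    params = (*common_vals, sample_size, preview_rows)
    return cur.execute(query, params).fetchall()


# Tiny in-memory demo with made-up rows; only key 3 exists in both tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 INTEGER, col2 TEXT);
    CREATE TABLE table2 (col1 INTEGER, col2 TEXT);
    INSERT INTO table1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO table2 VALUES (3, 'x'), (4, 'y'), (5, 'z');
""")
rows = preview_join(conn)
print(rows)
```

Because the sample is restricted to shared key values, the preview cannot come back empty when the join has matches, at the cost of the two DISTINCT scans the question already flags as potentially expensive.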
Tags
<mysql><sql><optimization><join>
Title
What is a fast way to preview a MySQL join?
singulars
PostAcceptedAnswerId
1. PO
  singulars
  PostTypePostTypeId
  PTAnswer
PostParentId
1. This table or related slice is empty.
PostTypePostTypeId
1. PTQuestion
UserLastEditorUserId
1. USPatrick Auld
UserOwnerUserId
1. USPatrick Auld
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. PO
  singulars
  PostTypePostTypeId
  PTAnswer
VotesPostIdCreationDate
1. VO
  singulars
  PostPostId
  POWhat is a fast way to preview a MySQL join?
  UserUserId
  This table or related slice is empty.
  VoteTypeVoteTypeId
  VTUpMod
CommentsPostId
