StackOverflow2013

Note that there are some explanatory texts on larger screens.

plurals

PO
text
Body
copied!<p>The way I try to solve SQL problems is to take things step by step.</p> <ul> <li>You want the maximum revision for the maximum minor version corresponding to the maximum major version for each product.</li> </ul> <p>The maximum major number for each product is given by:</p> <pre><code>SELECT Name, MAX(major) AS Major FROM CA GROUP BY Name; </code></pre> <p>The maximum minor number corresponding to the maximum major number for each product is therefore given by:</p> <pre><code>SELECT CA.Name, CA.Major, MAX(CA.Minor) AS Minor FROM CA JOIN (SELECT Name, MAX(Major) AS Major FROM CA GROUP BY Name ) AS CB ON CA.Name = CB.Name AND CA.Major = CB.Major GROUP BY CA.Name, CA.Major; </code></pre> <p>And the maximum revision (for the maximum minor version number corresponding to the maximum major number for each product), therefore, is given by:</p> <pre><code>SELECT CA.Name, CA.Major, CA.Minor, MAX(CA.Revision) AS Revision FROM CA JOIN (SELECT CA.Name, CA.Major, MAX(CA.Minor) AS Minor FROM CA JOIN (SELECT Name, MAX(Major) AS Major FROM CA GROUP BY Name ) AS CB ON CA.Name = CB.Name AND CA.Major = CB.Major GROUP BY CA.Name, CA.Major ) AS CC ON CA.Name = CC.Name AND CA.Major = CC.Major AND CA.Minor = CC.Minor GROUP BY CA.Name, CA.Major, CA.Minor; </code></pre> <p>Tested - it works and produces the same answer as <a href="https://stackoverflow.com/users/50552/andomar">Andomar</a>'s <a href="https://stackoverflow.com/questions/8723404/how-to-select-single-row-based-on-the-max-value-in-multiple-rows/8723520#8723520">query</a> does.</p> <hr> <h3>Performance</h3> <p>I created a bigger volume of data (11616 rows of data), and ran a benchmark timing of Andomar's query against mine - target DBMS was IBM Informix Dynamic Server (IDS) version 11.70.FC2 running on MacOS X 10.7.2. I used the first of Andomar's two queries since IDS does not support the comparison notation in the second one. I loaded the data, updated statistics, and ran the queries both with mine followed by Andomar's and with Andomar's followed by mine. I also recorded the basic costs reported by the IDS optimizer. The result data from both queries were the same (so the queries are both accurate - or equally inaccurate).</p> <p>Table unindexed:</p> <pre><code>Andomar's query Jonathan's query Time: 22.074129 Time: 0.085803 Estimated Cost: 2468070 Estimated Cost: 22673 Estimated # of Rows Returned: 5808 Estimated # of Rows Returned: 132 Temporary Files Required For: Order By Temporary Files Required For: Group By </code></pre> <p>Table with unique index on (name, major, minor, revision):</p> <pre><code>Andomar's query Jonathan's query Time: 0.768309 Time: 0.060380 Estimated Cost: 31754 Estimated Cost: 2329 Estimated # of Rows Returned: 5808 Estimated # of Rows Returned: 139 Temporary Files Required For: Group By </code></pre> <p>As you can seen, the index dramatically improves the performance of Andomar's query, but it still seems to be more expensive on this system than my query. The index gives a 25% time saving for my query. I'd be curious to see comparable figures for the two versions of Andomar's query on comparable volumes of data, with and without the index. (My test data can be supplied if you need it; there were 132 products - the 3 listed in the question and 129 new ones; each new product had (the same) 90 version entries.)</p> <p>The reason for the discrepancy is that the sub-query in Andomar's query is a correlated sub-query, which is a relatively expensive process (dramatically so when the index is missing).</p>

Querying!

Guidance

An individual column

Larger individual text columns get their own page to allow for proper reading.

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload