Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    text
    copied!<p>You're returning a cross product with any record with a null class. Is this OK for your results?</p> <p>I created two sample query in 11gR2:</p> <pre><code>WITH a as (select NULL as class, 5 as columna from dual UNION select NULL as class, 7 as columna from dual UNION select NULL as class, 9 as columna from dual UNION select 'X' as class, 3 as columna from dual UNION select 'Y' as class, 2 as columna from dual), b as (select NULL as class, 2 as columnb from dual UNION select NULL as class, 15 as columnb from dual UNION select NULL as class, 5 as columnb from dual UNION select 'X' as class, 7 as columnb from dual UNION select 'Y' as class, 9 as columnb from dual) SELECT * from a JOIN b ON (a.class = b.class OR (a.class is null AND b.class is null)) </code></pre> <p>When I run EXPLAIN PLAN on this query, it indicates the tables (inline views in my case) are joined via NESTED LOOPS. NESTED LOOPS joins operate by scanning the first row of one table, then scanning each row of the other table for matches, then scanning the second row of the first table, looks for matches on the second table, etc. Because you are not directly comparing either table in the OR portion of your JOIN, the optimizer must use NESTED LOOPS.</p> <p>Behind the scenes it may look something like:</p> <ul> <li>Get Table A, row 1. If class is null, include this row from Table A on the result set.</li> <li>While still on Table A Row 1, Search table B for all rows where class is null.</li> <li>Perform a cross product on Table A Row 1 and all rows found in Table B</li> <li>Include these rows in the result set </li> <li>Get Table A, row 2. If class is null, include this row from Table A on the result set.</li> <li>.... etc</li> </ul> <p>When I change the SELECT statement to <code>SELECT * FROM a JOIN b ON NVL(a.class, 'N/A') = NVL(b.class, 'N/A')</code>, EXPLAIN indicates that a HASH JOIN is used. A hash join essentially generates a hash of each join key of the smaller table, and then scans the large table, finding the hash in the smaller table for each row that matches. In this case, since it's a simple Equijoin, the optimizer can hash each row of the driving table without problems.</p> <p>Behind the scenes it may look something like:</p> <ul> <li>Go through table A, converting NULL class values to 'N/A'</li> <li>Hash each row of table A as you go.</li> <li>Hash Table A is now in temp space or memory.</li> <li>Scan table B, converting NULL class values to 'N/A', then computing hash of value. Lookup hash in hash table, if it exists, include the joined row from Table A and B in the result set.</li> <li>Continue scanning B.</li> </ul> <p>If you run an EXPLAIN PLAN on your queries, you probably will find similar results.</p> <p>Even though the end result is the same, since you aren't <em>joining</em> the tables in the first query with "OR", the optimizer can't use a better join methodology. NESTED LOOPS can be very slow if the driving table is large or if you are forcing a full table scan against a large secondary table. </p> <p>You can use the ANSI <code>COALESCE</code> function to emulate the NVL oracle function in other database systems. The real issue here is that you're attempting to join on a NULL value, where you really should have a "NO CLASS" or some other method of identifying a "null" class in the sense of null = nothing instead of null = unknown.</p> <p><strong>Addendum to answer your question in the comments:</strong></p> <p><strong>For the null query the SQL engine will do the following:</strong></p> <ol> <li>Read Row 1 from Table A, class is null, convert to 'N/A'.</li> <li>Table B has 3 Rows which have class is null, convert each null to 'N/A'. </li> <li>Since the first row matches to all 3 rows, 3 rows are added to our result set, one for A1B1, A1B2, A1B3.</li> <li>Read Row 2 From Table A, class is null, convert to 'N/A'/</li> <li>Table B has 3 Rows which have class is null, convert each null to 'N/A'. </li> <li>Since the second row matches to all 3 rows, 3 rows are added to our result set, one for A2B1, A2B2, A2B3.</li> <li>Read Row 3 From Table A, class is null, convert to 'N/A'/</li> <li>Table B has 3 Rows which have class is null, convert each null to 'N/A'. </li> <li>Since the third row matches to all 3 rows, 3 rows are added to our result set, one for A3B1, A3B2, A3B3. 10.. Rows 4 and 5 aren't null so they won't be processed in this portion of the join.</li> </ol> <p><strong>For the 'N/A' query, the SQL engine will do the following:</strong></p> <ol> <li>Read Row 1 from Table A, class is null, convert to 'N/A', hash this value.</li> <li>Read Row 2 from Table A, class is null, convert to 'N/A', hash this value.</li> <li>Read Row 3 from Table A, class is null, convert to 'N/A', hash this value.</li> <li>Read Row 4 from Table A, class not null, hash this value.</li> <li>Read Row 5 from Table A, class not null, hash this value.</li> <li>Hash table C is now in memory.</li> <li>Read Row 1 from Table B, class is null, convert to 'N/A', hash the value.</li> <li>Compare hashed value to hash table in memory, for each match add a row to the result set. 3 rows are found, A1, A2, and A3. Results are added A1B1, A2B1, A3B1.</li> <li>Read Row 2 from Table B, class is null, convert to 'N/A', hash the value.</li> <li>Compare hashed value to hash table in memory, for each match add a row to the result set. 3 rows are found, A1, A2, and A3. Results are added A1B2, A2B2, A3B2.</li> <li>Read Row 3 from Table B, class is null, convert to 'N/A', hash the value.</li> <li>Compare hashed value to hash table in memory, for each match add a row to the result set. 3 rows are found, A1, A2, and A3. Results are added A1B3, A2B3, A3B3.</li> </ol>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload