Note that there are some explanatory texts on larger screens.

plurals
  1. POMySQL: Rewrite MSSQL with correlated subquery in FROM clause?
    primarykey
    data
    text
    <p>We have a table that contains website page views, like:</p> <pre><code>time | page_id ----------|----------------------------- 1256645862| pageA 1256645889| pageB 1256647199| pageA 1256647198| pageA 1256647300| pageB 1257863235| pageA 1257863236| pageC </code></pre> <p>In our production table, there is currently about 40K rows. We want to generate, for each day, the count of <em>unique</em> pages viewed in the last 30 days, 60 days, and 90 days. So, in the result set, we can look-up a day, and see how many <em>unique</em> pages were accessed within the 60-day period preceding that day.</p> <p>We were able to get a query to work in MSSQL:</p> <pre><code>SELECT DISTINCT CONVERT(VARCHAR,P.NDATE,101) AS 'DATE', (SELECT COUNT(DISTINCT SUB.PAGE_ID) FROM (SELECT PAGE_ID FROM perflog WHERE NDATE BETWEEN DATEADD(D,-29,P.NDATE) AND P.NDATE) AS SUB) AS '30D', (SELECT COUNT(DISTINCT SUB.PAGE_ID) FROM (SELECT PAGE_ID FROM perflog WHERE NDATE BETWEEN DATEADD(D,-59,P.NDATE) AND P.NDATE) AS SUB) AS '60D', (SELECT COUNT(DISTINCT SUB.PAGE_ID) FROM (SELECT PAGE_ID FROM perflog WHERE NDATE BETWEEN DATEADD(D,-89,P.NDATE) AND P.NDATE) AS SUB) AS '90D' FROM PERFLOG P ORDER BY 'DATE' </code></pre> <p>NOTE: because MSSQL doesn't have the FROM_UNIXTIME function, we added the NDATE column for testing, which is just the converted <code>time</code>. NDATE does not exist in the production table.</p> <p>Converting this query to MySQL gives us the "Unknown colum P.time" error:</p> <pre><code>SELECT DISTINCT FROM_UNIXTIME(P.time,'%Y-%m-%d') AS 'DATE', (SELECT COUNT(DISTINCT SUB.PAGE_ID) FROM (SELECT PAGE_ID FROM perflog WHERE FROM_UNIXTIME(time,'%Y-%m-%d') BETWEEN DATE_SUB(FROM_UNIXTIME(P.time,'%Y-%m-%d'), INTERVAL 30 DAY) AND FROM_UNIXTIME(P.time,'%Y-%m-%d')) AS SUB) AS '30D', (SELECT COUNT(DISTINCT SUB.PAGE_ID) FROM (SELECT PAGE_ID FROM perflog WHERE FROM_UNIXTIME(time,'%Y-%m-%d') BETWEEN DATE_SUB(FROM_UNIXTIME(P.time,'%Y-%m-%d'), INTERVAL 60 DAY) AND FROM_UNIXTIME(P.time,'%Y-%m-%d')) AS SUB) AS '60D', (SELECT COUNT(DISTINCT SUB.PAGE_ID) FROM (SELECT PAGE_ID FROM perflog WHERE FROM_UNIXTIME(time,'%Y-%m-%d') BETWEEN DATE_SUB(FROM_UNIXTIME(P.time,'%Y-%m-%d'), INTERVAL 90 DAY) AND FROM_UNIXTIME(P.time,'%Y-%m-%d')) AS SUB) AS '90D' FROM PERFLOG P ORDER BY 'DATE' </code></pre> <p>I understand this is because we cannot have a correlated subquery that references a table in the outer FROM clause. But, unfortunately, we are at a loss on how to convert this query to work in MySQL. For now, we simply return all DISTINCT rows from the table and post-process it in PHP. Takes about 2-3 seconds for 40K rows. I'm worried about the performance when we have 100's of 1000's of rows.</p> <p>Is it possible to do in MySQL? If so, can we expect it to perform better than our PHP post-processed solution.</p> <p><strong>UPDATE:</strong> Here's the query for creating the table:</p> <pre><code>CREATE TABLE `perflog` ( `user_id` VARBINARY( 40 ) NOT NULL , `elapsed` float UNSIGNED NOT NULL , `page_id` VARCHAR( 255 ) NOT NULL , `time` INT( 10 ) UNSIGNED NOT NULL , `ip` VARBINARY( 40 ) NOT NULL , `agent` VARCHAR( 255 ) NOT NULL , PRIMARY KEY ( `user_id` , `page_id` , `time` , `ip`, `agent` ) ) ENGINE MyISAM </code></pre> <p>Our production table has ~40K rows thus far!</p>
    singulars
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload