Note that there are some explanatory texts on larger screens.

plurals
  1. POMySQL: Suggesting objects (optimizing a multi-join query)
    primarykey
    data
    text
    <p><b>Goal:</b> Suggest objects based on user's choices</p> <p><b>Data:</b> Table containing info on how users would order a subset of objects from the worst to the best; Example:</p> <pre><code> 1 2 3 4 5 6 John: A B G J S O Mary: A C G L Joan: B C L J K Stan: G J C L </code></pre> <p>There's about 20 times as many users as objects, every user's lineup contains 50-200 objects.</p> <p><b>The table:</b></p> <pre><code>CREATE TABLE IF NOT EXISTS `pref` ( `usr` int(10) unsigned NOT NULL, `obj` int(10) unsigned NOT NULL, `ord` int(10) unsigned NOT NULL, UNIQUE KEY `u_o` (`usr`,`obj`), KEY `u` (`usr`) ) ENGINE=MyISAM DEFAULT CHARSET=utf8; </code></pre> <p><b>Basic idea:</b> Iterate within user's objects starting from the second worst, building pairs (A > B); look for them in other users' lineups and list items better than A according to those users.</p> <p><b>Query:</b></p> <pre><code>SELECT e.obj, COUNT(e.obj) AS rate FROM pref a, pref b, pref c, pref d, pref e WHERE a.usr = '222' # step 1: select a pair of objects A, B, where A is better than B according to user X AND a.obj = '111' AND b.usr = a.usr AND b.ord &lt; a.ord AND c.obj = a.obj # step 2: find users thinking that object A is better than B AND d.obj = b.obj AND d.ord &lt; c.ord AND d.usr = c.usr AND e.ord &gt; c.ord # step 3: find objects better than A according to these users AND e.usr = c.usr GROUP BY e.obj ORDER BY rate DESC; </code></pre> <p>Aliases:<br> <code>a</code> object A ('111'), current user ('222')<br> <code>b</code> object B, worse than A according to current user (has lower value of 'ord' than A)<br> <code>c</code> object A in other user's lineup<br> <code>d</code> object B in other user's lineup<br> <code>e</code> object better than A in other user's lineup<br></p> <p>The execution plan (ouo and uo being the indexes as suggested by <b>Quassnoi</b>):</p> <pre><code>+----+-------------+-------+------+---------------+------+---------+---------------------+------+----------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+---------------+------+---------+---------------------+------+----------------------------------------------+ | 1 | SIMPLE | a | ref | ouo,uo | ouo | 8 | const,const | 1 | Using index; Using temporary; Using filesort | | 1 | SIMPLE | b | ref | ouo,uo | uo | 4 | const | 86 | Using where | | 1 | SIMPLE | d | ref | ouo,uo | ouo | 4 | db.b.obj | 587 | Using index | | 1 | SIMPLE | c | ref | ouo,uo | ouo | 8 | const,db.d.usr | 1 | Using where; Using index | | 1 | SIMPLE | e | ref | uo | uo | 4 | db.d.usr | 80 | Using where | +----+-------------+-------+------+---------------+------+---------+---------------------+------+----------------------------------------------+ </code></pre> <p>The query seems to work fine as long as the dataset is not too big; any ideas on how to streamline it to support larger datasets?</p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload