StackOverflow2013

Note that there are some explanatory texts on larger screens.

plurals

POSelect a random row from result set, for update without order by RAND()
text
Body
copied!<p>This procedure is the most frequently accessed in this target application. Assume concurrent operations, and t.value is always changing.</p> <pre><code>-- Table is MySQL InnoDB -- let's call this MainSelect SELECT t.Id FROM table t WHERE t.A = conditionA AND t.B = conditionB AND t.value > 0 ORDER BY RAND() LIMIT 1 INTO vIndex FOR UPDATE; -- IF vIndex THEN UPDATE table SET value = value - 1 WHERE id = vIndex </code></pre> <p>The goal is to modify this query to use a form of <a href="https://stackoverflow.com/a/4329447/1464716">this method of random row selection</a> for its speed. Here for completeness. This is the main question of this post.</p> <pre><code>SELECT name FROM random AS r1 JOIN (SELECT (RAND() * (SELECT MAX(id) FROM random)) AS id) AS r2 WHERE r1.id >= r2.id ORDER BY r1.id ASC LIMIT 1 </code></pre> <p>Discussion:</p> <p>How would the total number of rows in MainSelect determined?</p> <p>If the answer to this is to make the MainSelect subquery, moving the FOR UPDATE to the outer most query, the randomly selected row's t.value could become 0 before the outer SELECT locks the row with the FOR UPDATE. Something like:</p> <p><code>SELECT * FROM (firstquery) s ...random selection logic.. FOR UPDATE;</code></p> <p>If this consideration is accurate, this places the question on which transaction level should be set at the onset.</p> <p>Thanks</p> <p><strong>Edit</strong> - Notes while working:</p> <ol> <li><p>Maybe <a href="http://en.wikipedia.org/wiki/Reservoir_sampling" rel="nofollow noreferrer">http://en.wikipedia.org/wiki/Reservoir_sampling</a>, because the count is unknown. I'd like to avoid high isolation levels as I would expect lower throughput as a result.</p></li> <li><p>Perhaps a random number could be stored and indexed, rather than computed. Then a random number is selected, and according to <a href="http://dev.mysql.com/doc/refman/5.0/en/limit-optimization.html" rel="nofollow noreferrer">limit documentation</a>, selecting one at random is very fast. The problem with this is the result set will not be uniform.</p></li> </ol> <blockquote> <p>If you use LIMIT row_count with ORDER BY, MySQL ends the sorting as soon as it has found the first row_count rows of the sorted result, rather than sorting the entire result. If ordering is done by using an index, this is very fast.</p> </blockquote>

Querying!

Guidance

An individual column

Larger individual text columns get their own page to allow for proper reading.

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload