PHP forking and processing MySQL database without conflict
<p>I have a MySQL database table that I need to process. It takes about 1 second to process 3 rows (because of the cURL request I need to make for each row). So I need to fork the PHP script to finish in a reasonable time, since I will process up to 10,000 rows in one batch.</p>
<p>I'm going to run 10-30 processes at once, and obviously I need some way to make sure that the processes do not overlap in terms of which rows they retrieve and modify.</p>
<p>From what I've read, there are three ways to accomplish this. I'm trying to decide which method is best for this situation.</p>
<p><strong>Option 1:</strong> Begin a transaction, use <code>SELECT ... FOR UPDATE</code>, and limit the number of rows for each process. Save the data to an array. Update the selected rows with a status flag of "processing". Commit the transaction, process the data, and then update the selected rows to a status of "finished".</p>
<p><strong>Option 2:</strong> Update a certain number of rows with a status flag of "processing" and the process ID. Select all rows for that process ID and flag. Work with the data like normal. Update those rows and set the flag to "finished".</p>
<p><strong>Option 3:</strong> Set a <code>LIMIT ... OFFSET ...</code> clause for each process's <code>SELECT</code> query, so that each process gets unique rows to work with. Then store the row IDs and perform an <code>UPDATE</code> when done.</p>
<p>I'm not sure which option is the safest. Option 3 seems simple enough, but I wonder whether there is any way it could fail. Option 2 also seems very simple, but I'm not sure whether the locking caused by the <code>UPDATE</code> would slow everything down. Option 1 seems like the best bet, but I'm not very familiar with <code>FOR UPDATE</code> and transactions, and could use some help.</p>
<p><strong>UPDATE:</strong> For clarity, I currently have just one file, process.php, which selects all the rows and posts the data to a third party via cURL one by one. I'd like to fork within this file, so that the 10,000 rows can be split among 10-30 child processes.</p>
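<p>For reference, here is a minimal sketch of what Option 2 might look like with <code>pcntl_fork</code> and PDO. The table name <code>jobs</code>, its columns (<code>id</code>, <code>payload</code>, <code>status</code>, <code>worker_id</code>), the DSN, the credentials, and the batch size are all assumptions made up for illustration. It also assumes an InnoDB table, PHP 7.1+ on the CLI, and the pcntl extension. It is not a tested implementation.</p>

```php
<?php
// Sketch of Option 2: each forked worker claims a batch of rows with a single
// UPDATE, then reads back only the rows it claimed.
// Hypothetical schema: jobs(id, payload, status, worker_id), where status is
// 'pending', 'processing', or 'finished'.

const WORKERS    = 10;   // the question mentions 10-30 processes
const BATCH_SIZE = 100;  // rows claimed per UPDATE (illustrative value)

function runWorker(): void
{
    // Connect after forking: a MySQL connection should not be shared
    // between parent and child processes.
    $pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'secret', [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);
    $workerId = getmypid();

    // Rows already claimed by another worker no longer match
    // status = 'pending', so two workers cannot claim the same row.
    $claim = $pdo->prepare(
        "UPDATE jobs SET status = 'processing', worker_id = :w
         WHERE status = 'pending' ORDER BY id LIMIT " . BATCH_SIZE
    );
    $fetch = $pdo->prepare(
        "SELECT id, payload FROM jobs
         WHERE status = 'processing' AND worker_id = :w"
    );
    $done = $pdo->prepare("UPDATE jobs SET status = 'finished' WHERE id = :id");

    while (true) {
        $claim->execute([':w' => $workerId]);
        if ($claim->rowCount() === 0) {
            break; // nothing left to claim
        }
        $fetch->execute([':w' => $workerId]);
        foreach ($fetch->fetchAll(PDO::FETCH_ASSOC) as $row) {
            // ... the per-row cURL request would go here ...
            $done->execute([':id' => $row['id']]);
        }
    }
}

$children = [];
for ($i = 0; $i < WORKERS; $i++) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        fwrite(STDERR, "fork failed\n");
        exit(1);
    }
    if ($pid === 0) {   // child process
        runWorker();
        exit(0);
    }
    $children[] = $pid; // parent keeps track of its children
}

// Parent waits for every child to finish.
foreach ($children as $pid) {
    pcntl_waitpid($pid, $status);
}
```

<p>In this sketch, a worker that crashes leaves its rows stuck in "processing", so a real version would probably need a timeout or cleanup step to return those rows to "pending".</p>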