multithreading in php
ok… you got me… there is no such thing as actual multithreading in php. however, i have come up with a method that worked for me on a project where i had to write a web crawler, and the spider hasn’t failed yet. the basic idea is this: while php cannot multithread, your server can certainly run many processes at once, so rather than try to jump through a bunch of hoops to make php do what you want, just rely on the server to do the work.
the parent file
in order to multithread you must have a parent file that creates the threads (or children as i will refer to them for the remainder of this article). you will most likely have to account for the available resources and have the parent file check those resources to make sure that there is enough “room” for another child to be created, otherwise wait until there is “room.” another thing to pay attention to is the default max_execution_time in php. my method to handle both of these issues is to first set php’s max_execution_time to 0 (effectively turning it off) and then to store a running list of children in a database table to keep track of the count.
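// turn off the execution time limit so the parent can run as long as it needs
ini_set('max_execution_time', 0);
$maxThreadsAllowed = 50;
$activeThreads = 0;
$cyclesToRun = 250;
$count = 0;
$keepRunning = TRUE;
while ($keepRunning) {
    // only start a child if there is room and there are cycles left to launch
    if ($activeThreads < $maxThreadsAllowed && $count < $cyclesToRun) {
        // the trailing & backgrounds the child so exec() returns right away
        exec('php -f childThread.php >> threadlog.txt &');
        $count++;
    }
    // count the children that have registered but not yet finished
    // (mysql_* was removed in php 7; use the mysqli or PDO equivalents there)
    $sql = 'SELECT count(*) FROM threads WHERE completed = 0';
    $result = mysql_query($sql);
    $activeThreads = (int) mysql_result($result, 0);
    // stop once every cycle has been launched and no child is still running
    if ($activeThreads == 0 && $count >= $cyclesToRun) {
        break;
    }
}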
initially there are no threads so i set $activeThreads to 0. for the purposes of this article i am assuming that the system will remain stable as long as there are no more than 50 threads running. the first pass through the loop always runs since $keepRunning is initially TRUE. as long as there aren’t too many active children, a new child is created each pass; otherwise the creation is skipped and the count is checked again. once $activeThreads hits 0 and $count reaches $cyclesToRun, the loop breaks. you might ask yourself how $activeThreads will ever be more than 0 based on this code. the answer is in the child file.
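the two decisions the loop makes on every pass can also be written as small functions, which makes the boundaries easier to see. this is only a sketch: shouldSpawn() and isFinished() are hypothetical names that do not appear in the parent file, and the spawn check assumes that no more than $cyclesToRun children should ever be launched.

```php
<?php
// hypothetical helpers, not part of the original parent file

// true when there is room for another child and there is still work to hand out
function shouldSpawn(int $activeThreads, int $maxThreadsAllowed, int $count, int $cyclesToRun): bool
{
    return $activeThreads < $maxThreadsAllowed && $count < $cyclesToRun;
}

// true once every cycle has been launched and no child is still running
function isFinished(int $activeThreads, int $count, int $cyclesToRun): bool
{
    return $activeThreads === 0 && $count >= $cyclesToRun;
}

var_dump(shouldSpawn(0, 50, 0, 250));    // nothing running yet, so there is room
var_dump(shouldSpawn(50, 50, 10, 250));  // already at the limit, so wait
var_dump(isFinished(0, 250, 250));       // all cycles launched, nothing active
```

using a strict less-than in the spawn check keeps the number of children at or below the limit rather than one above it.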
the child file
the child file is where all the actual work takes place. in this example it will simply say 'Hello World.' the key here is to insert a new row into the threads table right away so that when the parent checks $activeThreads, there will be something there. then do the work. finally, update the row and set completed to 1 so that the thread is no longer included in $activeThreads.
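// register this child right away so the parent counts it as active
$pid = (int) getmypid();
$sql = 'INSERT INTO threads (pid, completed) VALUES (' . $pid . ', 0)';
$result = mysql_query($sql);
// the actual work goes here
echo 'Hello World';
// mark this child as finished so the parent stops counting it
$sql = 'UPDATE threads SET completed = 1 WHERE pid = ' . $pid;
$result = mysql_query($sql);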
that’s it, albeit in a very basic way. the threads table tracks all the currently running children and the parent checks the table to ensure that there aren’t too many threads running. once all the threads are completed and the total number of cycles has been fulfilled, the loop will break and the parent will stop.
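one weak spot in the basic version is a child that dies halfway through its work: its row stays at completed = 0 and the parent waits forever. below is a rough sketch of one way to guard against that with try/finally. it is not the code from my crawler. runChild() is a hypothetical helper, the id column is an assumption added for the sketch, and an in-memory SQLite database (through PDO) stands in for the mysql threads table so the example runs on its own.

```php
<?php
// assumption: in-memory SQLite replaces the mysql threads table for this sketch
// runChild() and the id column are hypothetical, not part of the original child file

function runChild(PDO $db, callable $work): void
{
    $pid = getmypid();

    // register first so the parent's count sees this child
    $insert = $db->prepare('INSERT INTO threads (pid, completed) VALUES (?, 0)');
    $insert->execute([$pid]);
    $rowId = $db->lastInsertId();

    try {
        $work();
    } finally {
        // runs even if $work() throws, so the row is not left at completed = 0
        $update = $db->prepare('UPDATE threads SET completed = 1 WHERE id = ?');
        $update->execute([$rowId]);
    }
}

$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE threads (id INTEGER PRIMARY KEY, pid INTEGER, completed INTEGER)');

runChild($db, function () {
    echo 'Hello World';
});

// the same count the parent would run
$active = (int) $db->query('SELECT COUNT(*) FROM threads WHERE completed = 0')->fetchColumn();
```

updating by row id instead of pid also sidesteps the case where the operating system reuses a pid for a later child. a fatal error or a killed process still skips the finally block, so this narrows the problem rather than removing it.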
assumptions
i am assuming that you already have a connection to a database and that whichever user php is running as has the permissions necessary to run exec() commands, use ini_set() and write to the threadlog.txt file.
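if you are not sure those assumptions hold on your server, a quick check before starting the parent can save some head scratching. this is a minimal sketch and preflightProblems() is a hypothetical helper of my own naming. it only covers exec(), ini_set() and the log file, not the database connection.

```php
<?php
// hypothetical preflight helper; only the threadlog.txt name comes from the article

function preflightProblems(string $logFile): array
{
    $problems = [];

    // exec() can be switched off through the disable_functions ini setting
    $disabled = array_map('trim', explode(',', (string) ini_get('disable_functions')));
    if (!function_exists('exec') || in_array('exec', $disabled, true)) {
        $problems[] = 'exec() is not available';
    }

    // ini_set() returns false when the setting could not be changed
    if (ini_set('max_execution_time', '0') === false) {
        $problems[] = 'max_execution_time could not be changed';
    }

    // the log must be writable, or its directory must be if the file does not exist yet
    $writable = file_exists($logFile) ? is_writable($logFile) : is_writable(dirname($logFile));
    if (!$writable) {
        $problems[] = "cannot write to $logFile";
    }

    return $problems;
}

foreach (preflightProblems('threadlog.txt') as $problem) {
    echo $problem, PHP_EOL;
}
```

an empty list means the three checks passed for the user the script is running as, which may not be the same user your web server runs as.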




