PHP: Generate List of 301 redirects from CSV, and then Check List of 301 redirects for 404 errors
<p>I had an interesting task today and couldn't find much on the subject. <strong>I wanted to share this, and ask for any suggestions on how this could have been done more elegantly. I consider myself a mediocre programmer who really wants to improve, so any feedback is highly appreciated. There is also a strange bug I can't figure out.</strong> So here goes, and hopefully this helps anyone who has to do something similar.</p>
<p>A client was redoing a site, moving content around, and had a couple thousand redirects that needed to be made. Marketing sent me an XLS with old URLs in one column and new URLs in the next. These were the actions I took:</p>
<ul>
<li>Saved the XLS as CSV</li>
</ul>
<p>Wrote a script which:</p>
<ul>
<li>Formatted the list as valid 301 redirects</li>
<li>Exported the list to a text file</li>
</ul>
<p>I then copied and pasted all the new directives into my .htaccess file.</p>
<p>Then, I wrote another script that checked to make sure each of the new links was valid (no 404s). The first script worked exactly as expected. <strong>For some reason, I can get the second script to print out all the 404 errors (there were several), but the script doesn't die when it finishes traversing the loop, and it doesn't write to the file; it just hangs on the command line. No errors get reported. Any idea what's going on?</strong> Here is the code for both scripts:</p>
<p>Formatting 301s:</p>
<pre><code>&lt;?php
$source = "301.csv";
$output = "301.txt";

//grab the contents of the source file as an array, prepare the output file for writing
$sourceArray = file($source);
$handleOutput = fopen($output, "w");

//Set the strings we want to replace in an array.
//The first array holds the strings to find and the second holds their replacements
$originalLines = array(
    'http://hipaasecurityassessment.com',
    ','
);
$replacementStrings = array(
    '',
    ' '
);

//Split each item from the array into two strings, one which occurs before the comma and the other which occurs after
function setContent($sourceArray, $originalLines = array(), $replacementStrings = array()){
    $outputArray = array();
    $text = 'redirect 301 ';
    foreach ($sourceArray as $number =&gt; $item){
        $pattern = '/[,]/';
        $item = preg_split($pattern, $item);
        $item = array(
            $item[0],
            preg_replace('#"#', '', $item[1])
        );
        $item = implode(' ', $item);
        $item = str_replace($originalLines, $replacementStrings, $item);
        array_push($outputArray,$text,$item);
    }
    $outputString = implode('', $outputArray);
    return $outputString;
}

//Invoke the set content function
$outputString = setContent($sourceArray, $originalLines, $replacementStrings);

//Finally, write to the text file!
fwrite($handleOutput, $outputString);
</code></pre>
<p>Checking for 404s:</p>
<pre><code>&lt;?php
$source = "301.txt";
$output = "print404.txt";

//grab the contents of the source file as an array, prepare the output file for writing
$sourceArray = file($source);
$handleOutput = fopen($output, "w");

//Split each item from the array into two strings, one which occurs before the space and the other which occurs after
function getUrls($sourceArray = array()){
    $outputArray = array();
    foreach ($sourceArray as $number =&gt; $item){
        $item = str_replace('redirect 301', '', $item);
        $pattern = '#[ ]+#';
        $item = preg_split($pattern, $item);
        $item = array(
            $item[0],
            $item[1],
            $item[2]
        );
        array_push($outputArray, $item[2]);
    }
    return $outputArray;
}

//Check each URL for a 404 error via a curl request
function check404($url = array(), $handleOutput){
    $handle = curl_init($url);
    curl_setopt($handle, CURLOPT_RETURNTRANSFER, TRUE);
    $content = curl_exec( $handle );
    $response = curl_getinfo( $handle );
    $httpCode = curl_getinfo($handle, CURLINFO_HTTP_CODE);
    if($httpCode == 404) {
        //fwrite($handleOutput, $url);
        print $url;
    }
};

$outputArray = getUrls($sourceArray);
foreach ($outputArray as $url) {
    $errors = check404($url, $handleOutput);
}
</code></pre>
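<p>For comparison, here is a minimal sketch of how the two steps might be tightened up. It is not a diagnosis of the hang: it only addresses things visible in the scripts above, namely that <code>file()</code> keeps the trailing newline on each URL, that the <code>fwrite()</code> call is commented out, and that the curl request has no timeout, so a single unresponsive URL could plausibly stall the loop. The function names (<code>formatRedirect</code>, <code>extractTarget</code>, <code>httpStatus</code>), the 10-second timeout, the use of a HEAD request, and the choice to strip the domain from the old URL only are all assumptions made for illustration.</p>

```php
<?php
// Sketch only: the function names and default timeout are hypothetical,
// not part of the original scripts.

/**
 * Turn one CSV row ("old URL,new URL") into a redirect directive.
 * Assumption: $stripPrefix is removed from the old URL only,
 * so the redirect target stays an absolute URL.
 */
function formatRedirect(string $csvLine, string $stripPrefix): string
{
    // str_getcsv understands quoted fields, so quotes need no manual removal.
    $fields = str_getcsv(trim($csvLine), ',', '"', '\\');
    if (count($fields) < 2) {
        throw new InvalidArgumentException("Expected two columns: $csvLine");
    }
    [$old, $new] = array_map('trim', $fields);
    if ($stripPrefix !== '' && strpos($old, $stripPrefix) === 0) {
        $old = substr($old, strlen($stripPrefix));
    }
    return "Redirect 301 $old $new";
}

/** Return the last whitespace-separated field of a directive line. */
function extractTarget(string $directive): string
{
    // trim() drops the newline that file() leaves on each line.
    $parts = preg_split('/\s+/', trim($directive));
    return end($parts);
}

/** Return the HTTP status for $url, or 0 if the request failed or timed out. */
function httpStatus(string $url, int $timeoutSeconds = 10): int
{
    $handle = curl_init($url);
    curl_setopt_array($handle, [
        CURLOPT_NOBODY         => true, // HEAD request: headers only
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_CONNECTTIMEOUT => $timeoutSeconds,
        CURLOPT_TIMEOUT        => $timeoutSeconds, // bound the whole request
    ]);
    curl_exec($handle);
    return (int) curl_getinfo($handle, CURLINFO_HTTP_CODE);
}

// Example use (performs network requests, so it is left commented out):
// $out = fopen('print404.txt', 'w');
// foreach (file('301.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
//     $url = extractTarget($line);
//     if (httpStatus($url) === 404) {
//         fwrite($out, $url . PHP_EOL);
//     }
// }
// fclose($out);
```

<p>Keeping the parsing in small pure functions means it can be checked without touching the network, and a status of 0 from <code>httpStatus</code> separates "no response" from a genuine 404. Some servers answer HEAD requests differently from GET, so removing <code>CURLOPT_NOBODY</code> may be necessary.</p>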
 
