Excluding bad bots via .htaccess: why do I get HTTP 500 errors?
I'm blocking bad and useless bots using:

```apache
RewriteCond %{HTTP_USER_AGENT} ^$ [OR]
RewriteCond %{HTTP_USER_AGENT} 360Spider [OR]
RewriteCond %{HTTP_USER_AGENT} A(?:ccess|ppid) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} C(?:apture|lient|opy|rawl|url) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} D(?:ata|evSoft|o(?:main|wnload)) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} E(?:ngine|zooms) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} f(?:etch|ilter) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} genieo [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Ja(?:karta|va) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Li(?:brary|nk|bww) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} MJ12bot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} nutch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Pr(?:oxy|ublish) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} robot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} s(?:craper|istrix|pider) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} W(?:get|(?:in(32|Http))) [NC]
RewriteRule .? - [F]
```

Complete .htaccess file:

```apache
AddDefaultCharset UTF-8
RewriteEngine on

# inherit from the root .htaccess and append these rules at the end;
# necessary in the root too
RewriteOptions inherit

# block bad bots
RewriteCond %{HTTP_USER_AGENT} ^$ [OR]
RewriteCond %{HTTP_USER_AGENT} 360Spider [OR]
RewriteCond %{HTTP_USER_AGENT} A(?:ccess|ppid) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} C(?:apture|lient|opy|rawl|url) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} D(?:ata|evSoft|o(?:main|wnload)) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} E(?:ngine|zooms) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} f(?:etch|ilter) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} genieo [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Ja(?:karta|va) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Li(?:brary|nk|bww) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} MJ12bot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} nutch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Pr(?:oxy|ublish) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} robot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} s(?:craper|istrix|pider) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} W(?:get|(?:in(32|Http))) [NC]
RewriteRule .? - [F]

# include caching for images
<IfModule mod_expires.c>
ExpiresActive On
ExpiresByType image/gif "access plus 1 month"
ExpiresByType image/jpeg "access plus 1 month"
ExpiresByType image/png "access plus 1 month"
ExpiresByType image/x-icon "access plus 360 days"
ExpiresByType text/css "access plus 1 day"
ExpiresByType text/html "access plus 1 week"
ExpiresByType text/javascript "access plus 1 week"
ExpiresByType text/x-javascript "access plus 1 week"
ExpiresByType application/javascript "access plus 1 week"
ExpiresByType application/x-javascript "access plus 1 week"
ExpiresByType application/x-shockwave-flash "access plus 1 week"
ExpiresByType font/truetype "access plus 1 month"
ExpiresByType font/opentype "access plus 1 month"
ExpiresByType application/x-font-otf "access plus 1 month"
</IfModule>

RewriteCond %{HTTP_HOST} ^nix.foo.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.nix.foo.com$
RewriteRule ^(.*)$ "http\:\/\/www\.foo\.com\/nix\.php" [R=301,L]

RewriteCond %{HTTP_HOST} ^gallery.foo.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.gallery.foo.com$
RewriteRule ^(.*)$ "http\:\/\/www\.foo\.com\/gallery\.php" [R=301,L]

RewriteCond %{HTTP_HOST} ^blog.foo.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.blog.foo.com$
RewriteRule ^(.*)$ "http\:\/\/www\.foo\.com\/blog" [R=301,L]

RewriteCond %{HTTP_HOST} ^id.foo.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.id.foo.com$
RewriteRule ^/?$ "http\:\/\/foo\.myopenid\.com\/" [R=301,L]

Redirect 301 /map.php http://www.foo.com/maps/map.php
RedirectMatch 301 ^/(map(?!pa_area51\.)[^/.]+\.php)$ http://www.foo.com/maps/$1

Options +FollowSymLinks
RewriteCond %{HTTP_HOST} !^www\.
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]
```

It worked fine (returning HTTP 403) until I switched from a LiteSpeed web server to an Apache one; both are shared hosting services. Now I get:

```
Forbidden
You don't have permission to access /robots.txt on this server.

Additionally, a 500 Internal Server Error error was encountered while trying to use an ErrorDocument to handle the request.
```

Here's a sample from the access log:

```
208.115.111.68 - - [22/Sep/2013:17:56:48 +0200] "GET /robots.txt HTTP/1.1" 500 576 "-" "Mozilla/5.0 (compatible; Ezooms/1.0; ezooms.bot@gmail.com)"
```

Any hints on that HTTP 500 error? Thanks in advance.
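As a starting point for debugging, here is a minimal sketch of two unconfirmed guesses, expressed as .htaccess directives. Both are assumptions, not verified causes of this particular 500: the first reads the "Additionally, a 500 ... ErrorDocument" wording as the 403 error page itself failing to be served; the second reflects that shared Apache hosts often restrict which directives .htaccess may use.

```apache
# Sketch of two hedged guesses, not a confirmed fix.

# 1. If the 500 is the 403 ErrorDocument failing (its subrequest carries the
#    same User-Agent and can be blocked by the same rules), an inline message
#    avoids any further, blockable request:
ErrorDocument 403 "Forbidden"

# 2. Shared Apache hosts often disallow "Options" in .htaccess (AllowOverride
#    configured without "Options"), and that line alone then yields a raw 500.
#    The owner-restricted variant is more commonly permitted:
Options +SymLinksIfOwnerMatch
```

Whatever the cause turns out to be, Apache writes the underlying reason for a 500 into its error log, so that log (usually reachable from the hosting control panel on shared plans) is the first place to look after the switch.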