<p>You could try importing the <a href="http://www.robotstxt.org/db.html" rel="nofollow noreferrer">Robots database from robotstxt.org</a> and using it to filter out requests from those User-Agents. It might not differ much from User-agents.org, but at least the robotstxt.org list is 'owner-submitted' (supposedly).</p>
<p>That site also links to <a href="http://www.botsvsbrowsers.com/" rel="nofollow noreferrer">botsvsbrowsers.com</a>, although I don't immediately see a downloadable version of their data.</p>
<p>Also, you said</p>
<blockquote> <p>I don't want to check every referer against thousands of links.</p> </blockquote>
<p>which is fair enough - but if runtime performance is a concern, just 'log' every request and filter the robots out as a post-process (an overnight batch, or as part of the reporting queries).</p>
<p>This point also confuses me a bit</p>
<blockquote> <p>preferably still work if someone has javascript disabled.</p> </blockquote>
<p>Are you writing your log on the server side as part of every page you serve? JavaScript should not make any difference in that case (although obviously those with JavaScript disabled will not get reported via Google Analytics).</p>
<p>P.S. Having mentioned robotstxt.org, it's worth remembering that well-behaved robots will request <code>/robots.txt</code> from your website root. Perhaps you could use that knowledge to your advantage by logging, or notifying yourself of, possible robot User-Agents that you might want to exclude (although I wouldn't <em>automatically</em> exclude such a UA, in case a regular web user types /robots.txt into their browser, which would cause your code to ignore real people). I don't think that would cause too much maintenance overhead over time...</p>
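<p>To make the "log everything, filter afterwards" idea and the <code>/robots.txt</code> trick a bit more concrete, here is a rough Python sketch. It is only an illustration under assumptions of mine: the log is taken to be a list of (user agent, path) pairs, <code>KNOWN_BOT_SUBSTRINGS</code> is a placeholder for whatever list you import, and the function names are made up. Your real log format and matching rules will almost certainly differ.</p>

```python
from collections import Counter

# Placeholder for an imported robots list (assumption: simple
# case-insensitive substring matching is good enough for your data).
KNOWN_BOT_SUBSTRINGS = ("googlebot", "bingbot", "slurp")


def is_known_bot(user_agent, bot_substrings=KNOWN_BOT_SUBSTRINGS):
    """Return True if the User-Agent contains any known bot substring."""
    ua = user_agent.lower()
    return any(s in ua for s in bot_substrings)


def filter_human_requests(records, bot_substrings=KNOWN_BOT_SUBSTRINGS):
    """Post-process step: keep records whose User-Agent is not a known bot.

    `records` is a hypothetical list of (user_agent, path) tuples.
    """
    return [r for r in records if not is_known_bot(r[0], bot_substrings)]


def robots_txt_candidates(records, bot_substrings=KNOWN_BOT_SUBSTRINGS):
    """Count unlisted User-Agents that requested /robots.txt.

    The result is a list of candidates for manual review only; nothing
    is excluded automatically, since a real person may type /robots.txt.
    """
    return Counter(
        ua
        for ua, path in records
        if path == "/robots.txt" and not is_known_bot(ua, bot_substrings)
    )
```

<p>You would run <code>filter_human_requests</code> in the overnight batch or reporting step rather than on every page view, and look over the output of <code>robots_txt_candidates</code> by hand before adding anything to your exclusion list.</p>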
 
