Antibot?


(Utah Carl Jr.) #1

How can I ensure that Piwik does not count visits from web crawlers like googlebot.com and crawl.yahoo.com?

Of course these are not web searches. They are automated visits.


(vipsoft) #2

Most (if not all) bots don’t load the JavaScript tracking code, so they (generally) aren’t counted.

The exception might be bots that crawl your site to index images. If such a bot actually fetches the tracking image (to cache it and/or create a thumbnail), it may be counted. You can mitigate this by adding a robots.txt file on your Piwik server, e.g.,

User-agent: *
Disallow: /piwik/
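
To see how a rule-abiding crawler would read those two lines, here is a minimal Python sketch using the standard library's robots.txt parser. The example.com URLs and the helper name allowed() are made up for illustration, and this only models bots that choose to honor robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The two rules from the post, as a well-behaved crawler would read them.
ROBOTS_TXT = """\
User-agent: *
Disallow: /piwik/
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if the robots.txt above permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URLs; the tracker path is disallowed, the page is not.
    print(allowed("Googlebot", "http://example.com/piwik/piwik.php"))
    print(allowed("Googlebot", "http://example.com/index.html"))
```

Nothing here is enforced on the server side; the check runs inside the crawler, which is why it only helps with polite bots.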


(TulipVorlax) #3

Will adding that robots.txt prevent them from loading the tracker code?

I recently got a visitor whose provider was:

http://img12.imageshack.us/img12/2765/09072009013526.jpg


(virogo) #4

[quote=TulipVorlax @ Jul 9 2009, 05:47 AM]Will adding that robots.txt prevent them from loading the tracker code?

I recently got a visitor whose provider was:

http://img12.imageshack.us/img12/2765/09072009013526.jpg[/quote]

Robots.txt is merely a traffic light…
I always walk through the red light…

So the answer is no. You even give away your safe spots by listing them there…
I’ve been indexed dozens of times by bots you have never heard of. So I wrote a script (not Piwik related) for my own website that blocks incoming traffic like a firewall, based on IP, RBL, or header info.
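
The script itself is not shown in this thread, so the following is only a rough Python sketch of the idea: refuse a request when its IP is on a blocklist, when an RBL lists it, or when its headers look like a bot. The network range, the user-agent pattern, the zone rbl.example.org, and the function names are all placeholders. The actual DNS lookup is left out and passed in as a flag.

```python
import ipaddress
import re

# Placeholder blocklists; the real ones from the post are unknown.
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]
BLOCKED_AGENTS = re.compile(r"(curl|wget|libwww|python-requests)", re.IGNORECASE)


def rbl_query_name(ip: str, zone: str = "rbl.example.org") -> str:
    """Build the DNS name an RBL lookup would resolve (IPv4 only).

    RBLs are queried by reversing the octets and appending the list's zone.
    """
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_blocked(ip: str, user_agent: str, rbl_listed: bool = False) -> bool:
    """Return True if the request should be refused."""
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in BLOCKED_NETWORKS):
        return True  # IP-based rule
    if rbl_listed:
        return True  # result of an RBL lookup done elsewhere
    return bool(BLOCKED_AGENTS.search(user_agent or ""))  # header-based rule
```

A caller would resolve rbl_query_name(ip) through DNS and pass rbl_listed=True when the name resolves.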

If you log headers and can see which bot it is, you can look it up at http://www.botsvsbrowsers.com/
Yet most of the time the bot is coming from an infected home computer.

I have also made a script that mails me the moment it is triggered.
A "SHOULD NOT SEE ME!!!" message lands in my mailbox once in a while…

This script resides in a directory called "snsm"
The ONLY reference to that directory is in: … robots.txt

So you see, I just told a bot there is something there…
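
For anyone who wants to try the same trap, here is a minimal Python sketch of the idea, not the original script. It assumes the trap directory is served at /snsm/ and uses made-up mail addresses. It only builds the alert message; sending it (for example with smtplib) is left to your own setup.

```python
from email.message import EmailMessage

# Directory name taken from the post; assumed to be listed only in robots.txt.
TRAP_PREFIX = "/snsm/"


def is_trap_hit(path: str) -> bool:
    """A request under the trap directory can only come from reading robots.txt."""
    return path.startswith(TRAP_PREFIX)


def build_alert(ip: str, user_agent: str, path: str) -> EmailMessage:
    """Build (but do not send) the alert mail; the addresses are placeholders."""
    msg = EmailMessage()
    msg["Subject"] = "SHOULD NOT SEE ME!!!"
    msg["From"] = "trap@example.org"
    msg["To"] = "admin@example.org"
    msg.set_content(f"Trap hit on {path}\nIP: {ip}\nUser-Agent: {user_agent}\n")
    return msg
```

A request handler would call is_trap_hit() on each path and, on a hit, mail the result of build_alert() and optionally add the IP to a blocklist.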