Purge fake visits count


(Landscape Routes) #1

Hello,

In the Matomo visits log for my website, a considerable number of visits do not come from real visitors.
Matomo does a good job of filtering out most crawler bots and page speed tools, but not all of them. For example, page speed tests run from Pingdom do not show up in the log, but the same test run from GTmetrix does.

The way I distinguish bot visits is by the time spent on a page: all of these page speed tests and other bots spend a total of 0 (zero) seconds on the page.

Is there a way to filter out (purge) from the logs the visits / visitors that spend zero seconds on the visited pages?

Thanks.


(Lukas Winkler) #2

That’s because Pingdom only fetches the HTML, while GTmetrix uses a real automated browser that behaves like a normal one. If they don’t send a user agent that identifies them as a bot, I can’t think of a way to detect them.

Keep in mind that 0 seconds on the page does not mean that the visitor is a bot. Matomo calculates the time on page by simply checking when the next page was visited and taking the difference. So if someone only visited one page, there is no way to know how long they stayed there.
That is, unless you activate the heartbeat timer, where Matomo sends a request every x seconds to signal that the visitor is still on the page.
https://matomo.org/faq/how-to/faq_21824/
https://developer.matomo.org/guides/tracking-javascript-guide#accurately-measure-the-time-spent-on-each-page
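
To illustrate why a single-page visit shows 0 seconds, here is a minimal Python sketch of the calculation described above. It is not Matomo's actual code; the function name `time_on_pages` and the sample timestamps are made up for the example.

```python
from datetime import datetime


def time_on_pages(page_view_times):
    """Return the seconds spent on each page of one visit.

    page_view_times must be sorted. Each page's time is the gap to the
    next page view; the last page has no later request, so it gets 0.
    """
    durations = []
    for current, following in zip(page_view_times, page_view_times[1:]):
        durations.append(int((following - current).total_seconds()))
    if page_view_times:
        durations.append(0)  # nothing to measure the last page against
    return durations


# Hypothetical visits: one page view only, and three page views.
single_page_visit = [datetime(2019, 1, 1, 12, 0, 0)]
three_page_visit = [
    datetime(2019, 1, 1, 12, 0, 0),
    datetime(2019, 1, 1, 12, 0, 40),
    datetime(2019, 1, 1, 12, 2, 0),
]

print(time_on_pages(single_page_visit))  # [0]
print(time_on_pages(three_page_visit))   # [40, 80, 0]
```

A real person who reads one page for five minutes and then leaves looks exactly like the first case, which is why zero seconds alone is not a reliable bot signal without the heartbeat timer.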

You can either filter visitors via segments or completely delete them from the logs via the GDPR tools.
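
As a rough sketch of the segment option, the snippet below builds a Reporting API URL that asks for the visits log restricted to visits longer than zero seconds. The host `matomo.example.org`, the site ID, and the helper name `visits_log_url` are placeholders, the segment `visitDuration>0` is just one possible choice, and authentication is left out, so treat this as an assumption-laden example rather than a recipe.

```python
from urllib.parse import urlencode


def visits_log_url(base_url, site_id, segment="visitDuration>0"):
    """Build a Matomo Reporting API URL for the visits log with a segment.

    Authentication (for example a token) is deliberately omitted here.
    """
    params = {
        "module": "API",
        "method": "Live.getLastVisitsDetails",
        "idSite": site_id,
        "period": "day",
        "date": "yesterday",
        "format": "JSON",
        "segment": segment,  # urlencode escapes the ">" for us
    }
    return base_url.rstrip("/") + "/index.php?" + urlencode(params)


# Placeholder host and site ID.
url = visits_log_url("https://matomo.example.org", 1)
print(url)
```

Note that such a segment only hides the visits from the report; it also hides real visitors who viewed a single page, for the reason given above.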

But keep in mind that you have to invalidate and regenerate the generated reports afterwards to reflect the changes (and this is only possible if you still have the raw visitor log).
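
For the invalidate-and-regenerate step, a small sketch that assembles the two console commands is shown below. It assumes it is run from the Matomo installation directory, and the site ID and date are placeholders; depending on the Matomo version, `core:archive` may need further options (such as the instance URL), so check `./console help core:archive` before relying on it.

```python
import shlex


def reprocess_commands(site_id, date):
    """Return the console commands to invalidate one day and re-archive."""
    invalidate = [
        "./console",
        "core:invalidate-report-data",
        f"--sites={site_id}",
        f"--dates={date}",
    ]
    archive = ["./console", "core:archive"]
    return [invalidate, archive]


# Placeholder site ID and date; the commands are printed, not executed.
commands = reprocess_commands(1, "2019-01-01")
for cmd in commands:
    print(shlex.join(cmd))
```

The commands are only printed so they can be reviewed first; they could be handed to `subprocess.run` once they look right.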


(Landscape Routes) #3

Thanks a lot, this really helps.
I can delete the visits from the IP addresses related to GTmetrix with the GDPR tools and add those addresses to the excluded IPs list.

And I agree that it’s difficult to tell whether a visit comes from a real person if GTmetrix tries to emulate a real person visiting the site.