Bot visits are getting counted

Problem

It seems like some bots have managed to disguise themselves as regular users and get counted by Matomo, because the number of daily visits has risen 15-fold, mainly from bot-looking visits.

Tasks

  1. How can we make sure bots are not getting counted?
    See comment #2 below
  2. How can we delete these false visits?
    See comment #3 below

Observations

Some observations which seem to confirm that these are bots:

The majority report using an 800x600 screen

Some stats, which seem to confirm the bot theory:

  • 295 visits, 295 unique visitors
  • 0s average visit duration -100%
  • 99% visits have bounced (left the website after one page) +4.2%

From the last 400 visits, these ~160 IPs share the same parent ranges:

154.94.32.109
154.94.32.177
154.94.32.227
154.94.36.107
154.94.36.219
154.94.43.132
154.94.43.151
154.94.47.192
154.94.51.26
154.94.56.133
154.94.63.184
154.94.63.238
172.121.116.204
172.121.117.198
172.121.148.175
172.121.149.143
172.121.172.69
172.121.173.116
172.121.173.134
172.121.173.232
172.121.173.39
172.121.173.64
172.121.184.169
172.121.184.210
172.121.188.151
172.121.188.36
172.121.188.54
172.121.28.139
172.121.28.151
172.121.28.204
172.121.28.204
172.121.28.88
172.121.6.147
172.121.6.96
172.121.72.27
172.121.74.132
172.121.75.183
172.121.75.227
172.121.76.185
172.121.79.189
172.121.81.219
172.121.88.124
172.121.90.130
172.121.90.186
172.121.92.143
172.121.97.106
172.121.97.144
172.252.104.155
172.252.109.190
172.252.109.246
172.252.114.104
172.252.114.211
172.252.114.218
172.252.115.237
172.252.116.57
172.252.122.155
172.252.122.214
172.252.124.5
172.252.124.50
172.252.132.151
172.252.137.229
172.252.153.139
172.252.153.246
172.252.180.153
172.252.182.118
172.252.182.187
172.252.182.207
172.252.182.69
172.252.184.141
172.252.186.125
172.252.186.215
172.252.193.75
172.252.197.198
172.252.197.213
172.252.197.50
172.252.199.212
172.252.199.231
172.252.199.81
172.252.205.82
172.252.206.7
172.252.209.162
172.252.209.85
172.252.216.219
172.252.216.63
172.252.223.17
172.252.223.57
172.252.223.71
172.252.228.81
172.252.237.186
172.252.237.201
172.252.237.37
172.252.238.150
172.252.28.224
172.252.28.99
172.252.40.130
172.252.41.241
172.252.41.69
172.252.42.218
172.252.44.170
172.252.45.105
172.252.45.72
172.252.45.78
172.252.47.122
172.252.54.132
172.252.54.148
172.252.54.2
172.252.55.191
45.206.112.3
45.206.113.118
45.206.114.113
45.206.115.228
45.206.115.63
45.206.116.212
45.206.118.253
45.206.118.92
45.206.120.123
45.206.120.74
45.206.121.167
45.206.122.76
45.206.124.115
45.206.124.154
45.206.124.75
45.206.125.4
45.206.125.59
45.206.126.59
45.206.127.60
45.206.127.95
45.206.80.181
45.206.80.194
45.206.80.82
45.206.81.209
45.206.83.170
45.206.83.202
45.206.84.162
45.206.86.164
45.206.87.25
45.206.87.5
45.206.88.58
45.206.89.206
45.206.90.149
45.206.91.195
45.206.91.39
45.206.92.31
45.206.93.234
45.206.95.210
45.207.166.199
45.207.176.106
45.207.178.102
45.207.179.211
45.207.180.111
45.207.181.225
45.207.183.20
45.207.184.237
45.207.185.232
45.207.187.120
45.207.187.203
45.207.187.252
45.207.189.50
45.207.31.10
45.207.31.184
45.207.31.47
45.207.45.150
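
To see how tightly a list like this clusters, the addresses can be grouped by their enclosing network. This is a minimal sketch using Python's standard `ipaddress` module; the function name `group_by_prefix` and the choice of a /16 prefix are assumptions for illustration, and the sample is just a handful of addresses from the list above.

```python
from collections import Counter
from ipaddress import ip_network


def group_by_prefix(ips, prefix_len=16):
    """Count how many addresses fall into each enclosing network."""
    counts = Counter()
    for ip in ips:
        # strict=False masks the host bits, e.g. 154.94.32.109/16 -> 154.94.0.0/16
        network = ip_network(f"{ip}/{prefix_len}", strict=False)
        counts[str(network)] += 1
    return counts


# A few addresses taken from the list above
sample = [
    "154.94.32.109",
    "154.94.36.107",
    "172.121.116.204",
    "172.252.104.155",
    "45.206.112.3",
    "45.207.166.199",
]

for network, count in group_by_prefix(sample).most_common():
    print(network, count)
```

Running it over the full list should make the handful of parent ranges obvious at a glance.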

I found https://plugins.matomo.org/BotTracker but it seems to rely on the honesty of the bot farm to disclose itself as a bot via the user agent. Sadly, many don’t do this, and try to pass as a regular visitor. So a visit from such a disguised bot will be counted as a regular visit.

I instead added the Tracking Spam Prevention plugin, maybe that can take care of blocking disguised bots from getting counted?

… and enabled these filters:

x Block tracking requests from the cloud
Blocks tracking requests originating from cloud providers like AWS, Azure, Digital Ocean, Google Cloud and Oracle by fetching a list of their IP ranges. It should be safe to turn on if you are only tracking using the JavaScript tracker, as their tracking requests do not originate from clouds, unless they use a VPN that routes data through cloud providers. The setting applies to all your sites.

x Block headless browsers
These are browsers without a user interface, mostly used for automation. It should be safe to turn this on if you only have regular websites or apps. It can block additional bots and spam requests that otherwise would not be detected.

x Block tracking requests from server-side libraries
Use this if only using JavaScript Tracker, as other traffic will be attacks or spam anyway. It blocks tracking requests from cURL, HTTP, Guzzle, and Postman.
Note: Do not use it if you track data using a server-side SDK like the Matomo PHP tracking SDK, Java SDK, Python SDK, Android or iOS SDK, or other server-side programming languages.

About deleting erroneously registered bot visits: I found “How to invalidate the past historical reports so they can be re-processed from the logs” and used the GDPR tool to delete bot visits via a search for Resolution: 800x600. Adding “OS: Windows” or “Browser: Chrome” was not needed; 800x600 seems to be the signal on its own.

I then used the InvalidateReports plugin (Option 1) to invalidate the data, but I could still see the visits … I could have waited for the hourly Cron-job (Note: See https://forum.matomo.org/t/problem-setting-up-cron-job-for-archiving/53403/7 ) to rebuild the data, but triggered it manually, and the bot visits are now deleted.

tl;dr

  1. Delete bot visits with GDPR tool (Resolution: 800x600)
  2. Invalidate data with Invalidate Reports plugin
  3. Rebuild data manually, or wait for Cron
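
Steps 2 and 3 can probably also be done from the command line instead of the plugin. The sketch below is an alternative to what I actually did, not a description of it: it assumes shell access to the Matomo directory and uses the `core:invalidate-report-data` and `core:archive` console commands. The site ID, date range, URL and install path are placeholders, and `build_commands` and `run` are hypothetical helper names.

```python
import subprocess


def build_commands(site_id, date_range, matomo_url):
    """Build the two console invocations: invalidate reports, then re-archive."""
    invalidate = [
        "./console",
        "core:invalidate-report-data",
        f"--dates={date_range}",  # a single date or "YYYY-MM-DD,YYYY-MM-DD"
        f"--sites={site_id}",
    ]
    archive = ["./console", "core:archive", f"--url={matomo_url}"]
    return [invalidate, archive]


def run(commands, matomo_dir):
    """Run each command from the Matomo install directory, stopping on failure."""
    for command in commands:
        subprocess.run(command, cwd=matomo_dir, check=True)


# Placeholder values; adjust them to your own installation
commands = build_commands(1, "2024-01-01,2024-01-31", "https://example.org/matomo/")
for command in commands:
    print(" ".join(command))
# run(commands, "/var/www/matomo")  # uncomment on the server itself
```

The GDPR deletion in step 1 still has to happen first; otherwise the rebuilt reports will include the bot visits again.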

Thanks for sharing your process. This indeed seems to be the best way to exclude the bots.

You’re welcome. Sadly, the bots are still getting through, even with the Tracking Spam Prevention plugin enabled …

It seems like Matomo used to be able to filter out and ignore bots, but bots have gotten so clever that they circumvent the old methods in Matomo, and even the Tracking Spam Prevention plugin.

So currently, it’s a game of Whac-A-Mole, deleting thousands of fake bot visits manually daily …

I hope Matomo can take a look at this, and maybe consider adding new methods to prevent bots from getting counted? See also Tracking Spam Prevention gets bypassed and Feature request: Exclude by metric.

From what I’ve seen in Matomo setups, what you’re describing is pretty classic “datacenter bot flood” rather than real user traffic.

Those signs you listed all line up:

  • huge spike + 0s avg visit duration

  • 99% bounce rate

  • identical/low-resolution screen size (800x600 is a big red flag)

  • tight clusters of IPs from similar ranges (looks like VPS/cloud providers rotating addresses)

So yeah, your suspicion is very likely correct.

What usually helps in cases like this

First thing I’d check is whether Matomo’s built-in bot filtering is fully enabled. In Matomo, there’s an option under settings to exclude known bots/spiders using the IAB/Spider & Bot list. It’s not perfect, but it does clean up a chunk of this noise.

Next step (more important in your case): IP / range blocking at the edge, not inside analytics.

Since you already spotted whole blocks like 172.121.x.x, 172.252.x.x, 45.206.x.x, that strongly suggests hosted infrastructure. In practice, I’ve had better results blocking at:

  • firewall level (server or provider firewall rules)

  • or Cloudflare (if you’re using it → “Bot Fight Mode” + rate limiting rules)

Blocking entire /16 ranges can be effective, but be a bit careful and confirm you’re not cutting off real traffic from those regions/ISPs.
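
Before pushing ranges into a firewall, it can help to test them against a sample of known-good visitor IPs. A minimal sketch with Python's standard `ipaddress` module follows; the three /16 ranges are simply the blocks named above written in CIDR form, and `is_blocked` is a hypothetical helper, not part of Matomo or any firewall.

```python
from ipaddress import ip_address, ip_network

# Assumption: the blocks mentioned above, expressed as /16 networks
BLOCKED_NETWORKS = [
    ip_network("172.121.0.0/16"),
    ip_network("172.252.0.0/16"),
    ip_network("45.206.0.0/16"),
]


def is_blocked(ip, networks=BLOCKED_NETWORKS):
    """Return True if the address falls inside any blocked network."""
    address = ip_address(ip)
    return any(address in network for network in networks)


print(is_blocked("172.121.28.204"))  # an address from the bot list
print(is_blocked("8.8.8.8"))         # an unrelated address
```

Feeding it the IPs of a few real customers or colleagues is a cheap way to confirm a /16 is not cutting off legitimate traffic.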

About “removing” the fake visits

Matomo won’t really let you surgically delete visits from reports in a clean way after the fact (at least not without GDPR log deletion or database-level cleanup, which I wouldn’t recommend unless you really know what you’re doing).

What most people do instead:

  • enable log data retention rules (Privacy settings)

  • or just accept that historical data is polluted and fix forward

  • optionally re-archive reports after cleanup so dashboards reflect new filtering

One more thing worth checking

Sometimes bots bypass “known bot” detection but still behave similarly. In those cases, I’ve had success adding:

  • minimum visit duration filters (ignore 0–1 sec sessions in reports)

  • blocking empty or suspicious user agents at server level

  • throttling abnormal request rates per IP
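
The per-IP throttling idea can be sketched as a sliding-window counter. This is only an illustration in Python; in practice it would live in the web server, firewall or CDN rules. The class name and the limits (30 requests per 60 seconds) are made-up assumptions.

```python
from collections import defaultdict, deque


class SlidingWindowThrottle:
    """Allow at most max_requests per IP within the last window_seconds."""

    def __init__(self, max_requests=30, window_seconds=60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip, now):
        hits = self._hits[ip]
        # Forget requests that have fallen out of the window
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


throttle = SlidingWindowThrottle(max_requests=2, window_seconds=10)
for second in (0, 1, 2, 11):
    print(second, throttle.allow("203.0.113.7", now=second))
```

Passing `now` in explicitly keeps the sketch easy to test; a real deployment would use a monotonic clock.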

Honestly, once you combine Matomo bot detection + edge blocking, the noise usually drops dramatically.

What you’re seeing isn’t uncommon lately—there’s been a noticeable increase in low-effort scraping bots hitting analytics tools directly rather than the site itself.


Thanks for your thorough comment, it’s really great.

It seems like Tag Manager could be needed for Matomo’s built-in bot filtering using the IAB/Spider & Bot list? I couldn’t find it … Maybe you can share the links to click, to find that option?

… because I don’t think I am using Tag Manager, and that could be the reason.

But it seems to me like the vast majority of bot visits are not getting counted, so I assume some kind of default bot blocking is active. I see hundreds of visits by Google, Anthropic, Bing, etc. per hour, and none of them are getting counted. I created these two GitHub issues as well:

… and here is my latest comment the other day:

The last few weeks, I have seen a lot of visits by Twitter/X bots (resolution 800x600) getting registered as regular visits:

69.12.58.84
69.12.59.77
69.12.59.101
69.12.57.71
69.12.58.91
69.12.57.97
etc.

https://www.abuseipdb.com/check/69.12.59.101

ASN: https://www.abuseipdb.com/check/AS63179

Out of 400 visits, they used 185 different IPs

Like I noted in Matomo and Tracking Spam Prevention plugin get bypassed #24406, it seems like the vast majority of crawlers and bots are not getting registered by Matomo by default, even without the Tracking Spam Prevention plugin, which is great.

Is it documented somewhere how Matomo distinguishes between real users and crawlers, and how Matomo chooses to register a visit, or not? I am trying to understand why 99% of bot visits are ignored, yet Twitter’s bot somehow gets treated as a regular user.

IP range blocking

About manual IP range blocking, I used to do that, just to save the server from overheating. But the IP ranges change, new ones arrive, etc., so I went for other, more dynamic solutions based on “visits over time” ratio throttling by ASN, which works really well.

In any case, the well-behaved bots were never counted as real visits, so in terms of Matomo and statistics, there was no problem.

As I see it, the task here is twofold:

  1. Figure out why Twitter/X visits are getting counted as regular visits, whereas Google/Anthropic/Bing/etc. are not. Is it because Twitter/X bots are using extreme levels of cloaking?
  2. Add support for custom blocking of bots, based on signals such as country, resolution, IP cluster and such. This could serve as a stopgap of last resort. (See Feature request: Exclude by metric)
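
To make the second point concrete, here is a rough sketch of what signal-based exclusion could look like. It does not describe anything Matomo offers today: the `Visit` record, the signal names, the two-signal threshold and the 69.12.56.0/22 range (inferred only from the six addresses above) are all assumptions for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass
class Visit:
    """Hypothetical visit record, not Matomo's actual schema."""
    ip: str
    resolution: str
    duration_seconds: int
    country: str


# Assumption: a range covering the 69.12.57-59.x addresses listed above
SUSPICIOUS_NETWORKS = [ip_network("69.12.56.0/22")]


def bot_signals(visit, blocked_countries=frozenset()):
    """Return the names of all signals that this visit triggers."""
    signals = []
    if visit.resolution == "800x600":
        signals.append("resolution")
    if visit.duration_seconds == 0:
        signals.append("zero_duration")
    if any(ip_address(visit.ip) in net for net in SUSPICIOUS_NETWORKS):
        signals.append("ip_cluster")
    if visit.country in blocked_countries:
        signals.append("country")
    return signals


def is_probable_bot(visit, threshold=2):
    """Flag a visit only when several signals agree, to limit false positives."""
    return len(bot_signals(visit)) >= threshold


print(is_probable_bot(Visit("69.12.59.101", "800x600", 0, "US")))
print(is_probable_bot(Visit("203.0.113.7", "1920x1080", 42, "DK")))
```

Requiring more than one signal is a deliberate choice here: a single 0s visit or an odd resolution can easily come from a real person.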

This topic has interested me for many years. Undetectable bots do exist. You can read my latest findings here: Matomo Tracking Bot Filter (PHP)
and here is a bot IP CIDR blacklist: Bot Blacklist (IP CIDR collection)