Since I want to track Google and Bing crawler access times and counts, I installed the BotTracker plugin.
Since Matomo v4, it no longer works out of the box, because Matomo now ships a robots.txt that denies access to bots: https://github.com/matomo-org/matomo/issues/17497
After removing that robots.txt manually, bots can generally be tracked, and this works for e.g. AhrefsBot, Baiduspider and YandexBot.
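To double-check whether a given robots.txt would keep a well-behaved crawler away from the tracker endpoint, a minimal Python sketch using the standard library's `urllib.robotparser` could look like this. The robots.txt content below is a hypothetical example (the file actually shipped with Matomo may differ), and `is_allowed` is just an illustrative helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content that blocks everything for all bots.
# Assumption: the real file shipped with Matomo may look different.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

# Tracker endpoint as it appears in the failing request.
TRACKER_URL = "https://our.domain.com/matomo/piwik.php"


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no network is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print("with robots.txt:   ", is_allowed(ROBOTS_TXT, "Googlebot", TRACKER_URL))
    print("without robots.txt:", is_allowed("", "Googlebot", TRACKER_URL))
```

This only models crawlers that honour robots.txt; it says nothing about why a crawler would skip a request that is not disallowed.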
Google and Bing successfully load piwik.js/matomo.js, but then fail to make the tracker endpoint request. The Google Mobile-Friendly Test tool reveals it: https://search.google.com/test/mobile-friendly
It says:
Page partially loaded
Not all page resources could be loaded. This can affect how Google sees and understands your page. Fix availability problems for any resources that can affect how Google understands your page.
1 page resource couldn’t be loaded
Resource: https://our.domain.com/matomo/piwik.php?action_name=<site_title>&idsite=1&rec=1&r=525912&h=7&m=3&s=1&url=<site_url>%2F&_id=d0bed9fe69afde68&_idn=1&_refts=0&send_image=0&cookie=1&res=412x732&pv_id=DAfyXy&pf_net=0&pf_srv=2&pf_tfr=0&pf_dm1=20
Type: Other
Status: Other error
The request is correct, with site title and site URL correctly URL-encoded. When I make this request manually in a browser, I can see it in Matomo.
There are no web server or PHP errors and no Matomo tracker errors or log entries about it, as if the request were never made at all.
Does anyone have an idea what the issue is and whether something can be done about it? Perhaps the Google and Bing crawlers refuse to perform requests with overly long query strings or certain characters?
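To test the query-string hypothesis, one option is to rebuild the tracking URL with fewer or shorter parameters and compare the results in the test tool. The following is a minimal Python sketch, not Matomo code: `build_tracking_url` and `describe_query` are hypothetical helper names, and the site URL value is a placeholder standing in for the real one.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Tracker endpoint as it appears in the failing request.
ENDPOINT = "https://our.domain.com/matomo/piwik.php"


def build_tracking_url(endpoint: str, params: dict) -> str:
    """Join the endpoint with URL-encoded tracking parameters."""
    return f"{endpoint}?{urlencode(params)}"


def describe_query(url: str) -> dict:
    """Report the query string length and the number of parameters."""
    query = urlsplit(url).query
    pairs = parse_qsl(query, keep_blank_values=True)
    return {"length": len(query), "param_count": len(pairs)}


if __name__ == "__main__":
    # Reduced parameter set; the url value is a placeholder, not the real site.
    minimal = {"idsite": 1, "rec": 1, "url": "https://example.com/"}
    url = build_tracking_url(ENDPOINT, minimal)
    print(url)
    print(describe_query(url))
```

If a shortened request like this shows up in Matomo when fetched by the crawler while the full one does not, that would point towards length or specific parameters; if neither shows up, the cause is more likely elsewhere.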