Archiving is very slow after update to Matomo 3.12.0 (Testing 1.13.0-b1)

We removed the custom segments defined on the two large volume sites and the archiving process is still taking days for these sites.

One thing we have found that dramatically increases archiving time is bot activity combined with segments. If a bot hits a high-traffic site and generates hundreds of pageviews in a single visit (we sometimes see more than 1,000 for the types of bots that hit our sites), the archive process for some reason spikes the CPUs (2-3x normal) for at least a day or two for that site with segments. This lasts until the activity drops out of the last2 period in the archive runs, provided the same type of bot behavior doesn't recur.

What we do is identify the bot by searching for visits with max actions >= the high number we see on a day's overview report, which gives us the partial IP. We then filter that IP out going forward, and sometimes use the GDPR tools to search for and remove that traffic. When we do, an archive run completes normally and within what we usually expect for CPU use. This may not be related, but I thought I'd share our experience.
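
In case it helps anyone, here is a rough Python sketch of the "find the bot by max actions" step. It assumes you have already exported the visits to a list of records; the field names `ip` and `actions`, the function name `suspect_ips`, and the sample data are all made up for illustration and are not Matomo's own API or schema.

```python
from collections import defaultdict


def suspect_ips(visits, min_actions):
    """Return {ip: highest action count} for visits at or above min_actions.

    `visits` is a hypothetical export: a list of dicts with an (anonymized)
    `ip` string and an `actions` count per visit.
    """
    worst = defaultdict(int)
    for visit in visits:
        if visit["actions"] >= min_actions:
            ip = visit["ip"]
            worst[ip] = max(worst[ip], visit["actions"])
    # Sort so the heaviest hitters come first.
    return dict(sorted(worst.items(), key=lambda kv: kv[1], reverse=True))


# Made-up sample data using documentation-reserved IP ranges.
visits = [
    {"ip": "203.0.113.0", "actions": 1240},
    {"ip": "198.51.100.0", "actions": 12},
    {"ip": "203.0.113.0", "actions": 870},
    {"ip": "192.0.2.0", "actions": 640},
]

result = suspect_ips(visits, min_actions=500)
print(result)
```

The threshold would be whatever "max actions" value stands out on the day's overview report; the IPs it returns are the candidates to exclude going forward.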


Could be connected to this indexing issue: MariaDB | Slow Query | High CPU Load · Issue #15588 · matomo-org/matomo · GitHub