Archiving is very slow - possibly related to the UsersFlow plugin?

Hi,
We have an issue where archiving takes many hours on each website we monitor, and it seems to get stuck, perhaps because of the UsersFlow plugin. Archiving is scheduled via the command line: it runs in a container in AWS ECS rather than as a cron job, but it runs the console core:archive command as documented here.

The logs look like this…

A lot of these:

DEBUG [2021-01-06 16:16:16] 134  Found archive with intersecting period with others in concurrent batch, skipping until next batch: [idinvalidation = .... , idsite = 1, period = day(2021-01-04 - 2021-01-04), name = done03b....bf.UsersFlow]

Occasionally interspersed with one of these:

DEBUG [2021-01-06 16:16:44] 134  Found duplicate invalidated archive , ignoring: [idinvalidation = ...., idsite = 1, period = day(2021-01-04 - 2021-01-04), name = done8c90...7b8.UsersFlow]

And, from time to time, a batch of these…

DEBUG [2021-01-06 16:33:05] 134  Found invalidation for segment that does not have auto archiving enabled, skipping: 3...65
DEBUG [2021-01-06 16:33:14] 134  Found invalidation for segment that does not have auto archiving enabled, skipping: 3...48
DEBUG [2021-01-06 16:33:23] 134  Found invalidation for segment that does not have auto archiving enabled, skipping: 3...85
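
In case it helps anyone triaging a similar log, here is a minimal Python sketch that tallies these DEBUG lines by message type and by the suffix of the archive name. It assumes the log format is exactly as quoted above (DEBUG, timestamp, process id, message); the function names and the labels are made up for illustration and are not part of Matomo.

```python
import re
from collections import Counter

# Matches the archiver DEBUG lines quoted above: "DEBUG [timestamp] pid  message".
LINE_RE = re.compile(r"^DEBUG \[[^\]]+\]\s+\d+\s+(?P<msg>.*)$")


def classify(line):
    """Return a short label for a DEBUG line, or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    msg = m.group("msg")
    if msg.startswith("Found archive with intersecting period"):
        return "intersecting"
    if msg.startswith("Found duplicate invalidated archive"):
        return "duplicate"
    if msg.startswith("Found invalidation for segment that does not have auto archiving"):
        return "segment_not_auto_archived"
    return "other"


def plugin_suffix(line):
    """Return the text after the last '.' in the 'name = ...' field, if present."""
    m = re.search(r"name = ([^\],]+)", line)
    if not m or "." not in m.group(1):
        return None
    return m.group(1).rsplit(".", 1)[1].strip()


def tally(lines):
    """Count (label, suffix) pairs over an iterable of log lines."""
    counts = Counter()
    for line in lines:
        label = classify(line)
        if label is not None:
            counts[(label, plugin_suffix(line))] += 1
    return counts
```

Feeding it the archiver output (for example `tally(open(path))`, where `path` is wherever your container logs end up) gives a quick view of which message dominates and whether one plugin name accounts for most of it.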

Matomo version is 4.0.5
UsersFlow plugin version is 4.0.2

Any info or help would be much appreciated. Any tips or advice on how to troubleshoot further would be great.
Thanks.

Hi
As a further update to this, I’ve been searching for any info that may help me work out what’s going on with our installation. I’ve come across this: https://github.com/matomo-org/matomo/issues/16689
…which mentions querying the matomo_archive_invalidations table.
So I’ve tried running select status, count(*) from matomo_archive_invalidations group by status;
…to see if that gives any clues. I get this:

status  count(*)
0       26892672
1       14

But I’m afraid I don’t know the significance of this, or whether it would cause archiving to run slowly. Do nearly 27 million rows with status=0 need some kind of pruning?

Any clues, help or advice would be much appreciated.
Thanks.

Hmmm, definitely this has some relation to https://github.com/matomo-org/matomo/issues/16689
I’ve looked at the earliest record (idinvalidation=1) in this table and here ts_invalidated=2020-12-03 14:28:52
This was when we upgraded to 4.0.3
We have since upgraded again to 4.0.5 but the issue remains.
:thinking:

Here are some row counts from the matomo_archive_invalidations table for today…

SELECT count(*) FROM matomo_archive_invalidations where left(ts_invalidated,10) = '2021-01-07' and name LIKE '%UsersFlow%';
Returns 1507301 rows, whereas:

SELECT count(*) FROM matomo_archive_invalidations where left(ts_invalidated,10) = '2021-01-07' and name NOT LIKE '%UsersFlow%';
Returns 23 rows.
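
For anyone wanting the same breakdown for every plugin at once, here is a rough Python sketch. It groups one day's invalidations by the suffix after the last '.' in name, and filters ts_invalidated with a range instead of left(), which should let the database use an index on that column if one exists. It uses an in-memory SQLite table as a stand-in so it can run anywhere; the helper names are hypothetical, only the table and column names come from the queries and logs above, and a MySQL driver would need `%s` placeholders and a cursor instead of `conn.execute`.

```python
import sqlite3
from collections import Counter
from datetime import date, timedelta

# Group by full name in SQL, then fold names into suffixes in Python.
QUERY = (
    "SELECT name, COUNT(*) FROM matomo_archive_invalidations "
    "WHERE ts_invalidated >= ? AND ts_invalidated < ? "
    "GROUP BY name"
)


def counts_by_suffix(conn, day):
    """Count invalidations per name suffix for one calendar day (YYYY-MM-DD)."""
    start = date.fromisoformat(day)
    end = start + timedelta(days=1)
    totals = Counter()
    for name, n in conn.execute(QUERY, (f"{start} 00:00:00", f"{end} 00:00:00")):
        # Names like "done<hash>.UsersFlow" end in a plugin name; others have no suffix.
        suffix = name.rsplit(".", 1)[1] if "." in name else "(no suffix)"
        totals[suffix] += n
    return totals


def make_demo_db(rows):
    """Build an in-memory stand-in table from (name, ts_invalidated) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE matomo_archive_invalidations ("
        "idinvalidation INTEGER PRIMARY KEY, name TEXT, ts_invalidated TEXT)"
    )
    conn.executemany(
        "INSERT INTO matomo_archive_invalidations (name, ts_invalidated) VALUES (?, ?)",
        rows,
    )
    return conn
```

Calling `counts_by_suffix(conn, "2021-01-07").most_common()` lists the suffixes from largest to smallest.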

:man_shrugging:

Update: we’ve had to remove the UsersFlow plugin - and since doing this, the matomo_archive_invalidations table is now stable. The archiver is able to keep this down to zero rows and archived reports are up to date. Everything appears back to normal.
If anyone has similar experiences with this plugin - or perhaps a better experience - let me know. Thanks.
:thinking:

Hi @Gareth, since version 4 some plugins schedule archiving of past data upon activation. UsersFlow is one of these, and if you have a lot of websites in your Matomo, a lot of invalidations will be created. The behavior is controlled by the [General] rearchive_reports_in_past_last_n_months INI config option. By default it’s set to 6 months; setting it to 0 disables the behavior. Can you see if setting this solves the problem? You should then be able to activate UsersFlow without it increasing the number of invalidations in the table.
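
To make that concrete, here is a small Python sketch that reads the option from INI text and falls back to the default of 6 described above. The sample INI content is made up for illustration (including the hosts), the function name is hypothetical, and the parsing is Python's configparser rather than Matomo's own config loader, so treat it only as a way to check what value a deployment would pick up.

```python
import configparser

DEFAULT_MONTHS = 6  # default mentioned in the reply above

# Illustrative config text only; a real config file will contain more sections.
SAMPLE_INI = """
[General]
rearchive_reports_in_past_last_n_months = 0
trusted_hosts[] = "a.example"
trusted_hosts[] = "b.example"
"""


def rearchive_months(ini_text):
    """Return the configured number of months, or the default if it is not set."""
    # strict=False tolerates repeated keys such as "trusted_hosts[]";
    # interpolation=None keeps '%' characters in values from being interpreted.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(ini_text)
    raw = parser.get(
        "General", "rearchive_reports_in_past_last_n_months", fallback=None
    )
    if raw is None:
        return DEFAULT_MONTHS
    return int(raw.strip().strip('"'))
```

With the sample text above, `rearchive_months(SAMPLE_INI)` gives 0, meaning rearchiving of past data on plugin activation is disabled.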

Hi @Gareth
To understand this better, could you let us know approximately how many websites there are in your Matomo, and how many segments?

Hi @diosmosis and @matthieu,
Thanks for the replies. Yeah, so we have 21 sites at the moment with 6 currently configured. There will be more coming on-line soon - some of those will have considerably more daily visits.
There are 25 segments at the moment.

@diosmosis OK, thanks, that’s useful to know. Actually UsersFlow had been active for some time (since last May). It began to “fail”, or at least the archiving process stopped keeping up, in late December. We have archiving scheduled every hour, so I guess something caused the invalidations to increase to the point where they couldn’t be cleared within an hour, and the number of rows in the table kept growing instead of being cleared. Do you know what scenarios could cause that?
We removed the plugin on the weekend - and now the table clears to zero rows after each archive run (which takes about 15 mins).
That rearchive config option sounds like one we’ll set before we add UsersFlow back in, but we’ll need to find a convenient time to reintroduce it and some method to monitor the effects.

(PS - I’ve added a separate post around automated monitoring here).

Ok, that’s far too few sites and segments to account for millions of invalidations. And if UsersFlow has been activated already it wouldn’t create new ones (unless something is triggering that logic). On deactivation the plugin deletes its pending invalidations, so clearly something is creating a lot of invalidations just for that plugin. I’m not sure what would do it, though. The only other way invalidations would be added is if someone added them manually via the InvalidateReports plugin or the invalidate-reports CLI command (or they were used in some kind of script). Even then, there shouldn’t be any duplicate invalidations inserted.

There is one way I can think of to test this. The invalidations aren’t inserted on demand, since that would slow the UI down; instead we schedule them via an option. So you could:

  1. activate UsersFlow
  2. check the row in the option table with option_name = 'ReArchiveList' (there should be one entry for UsersFlow, nothing more)
  3. run core:archive with the --force-date-range option set to a single day (this is so the process won’t try to handle every invalidation, it will just archive a single day).
  4. check the invalidations table to see how many invalidations were inserted. For rearchiving the last 6 months, it should add around 211 * 25 * 21 at most (where 211 is the number of day, week, month, year periods for the last six months; the calculation is not exact and not every segment may apply for every site). If it’s abnormally large, then there’s an issue in how we insert those invalidations.
  5. after core:archive is finished, the entry from ReArchiveList should be removed. If it’s still there, then that’s where the problem is.
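
The estimate in step 4 can be worked through as a quick Python sketch. The 211, 25 and 21 are the figures from this thread; the observed count is the status=0 total reported earlier, which covers all plugins, so the comparison is only indicative. The function name is made up for illustration.

```python
PERIODS_LAST_SIX_MONTHS = 211  # day, week, month, year periods, as quoted in step 4


def expected_upper_bound(sites, segments, periods=PERIODS_LAST_SIX_MONTHS):
    """Rough ceiling on invalidations from rearchiving one plugin."""
    return periods * segments * sites


bound = expected_upper_bound(sites=21, segments=25)  # 211 * 25 * 21 = 110775
observed = 26_892_672  # status=0 rows reported earlier in the thread
ratio = observed / bound  # how many times larger than the rough ceiling
```

With these numbers the ceiling is 110,775 rows, and the reported backlog is roughly 243 times larger than that, which is why the row count looks abnormal rather than just slow to drain.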

If nothing comes from these checks, you could monitor the ReArchiveList and the invalidations table. Though honestly, I’m unsure what would cause this problem.
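
As a starting point for that kind of monitoring, here is a minimal Python sketch of a growth check. It assumes you record the invalidations table row count once after each archive run (how you collect it is up to you); the function name and the threshold of three runs are arbitrary choices for illustration, not anything built into Matomo.

```python
def backlog_is_growing(samples, min_consecutive=3):
    """Return True if the row count rose on each of the last `min_consecutive` runs.

    `samples` is a list of row counts, oldest first, one per archive run.
    """
    if len(samples) < min_consecutive + 1:
        return False  # not enough history to judge
    recent = samples[-(min_consecutive + 1):]
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))
```

A healthy setup like the one described earlier, where the table drains to zero after each run, never trips this check, while a backlog that rises run after run does.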

I’ll also be trying to add some diagnostics to help monitor what’s in the invalidations table.

Hi @diosmosis
Thanks for the detailed info. Yes, we’ve been discussing an approach here for reactivating UsersFlow. It’s something we’ll do with some care & monitoring since the failed archiving / missing reports did cause some issues here the other week.
We don’t have the InvalidateReports plugin activated and I don’t believe anyone would have used the console CLI to invalidate them. But I’ll check.
I will try all the steps from your post ASAP - though first I want to set up some more detailed monitoring (perhaps with Prometheus or equivalent) so we can more accurately detect / measure any changes.
I will let you know what we find!
Thanks :+1: