We’ve been having issues with archiving being really slow, and I want to ask whether part of this is because archiving makes use of a cache, which is something that wouldn’t work with our current setup.
This may not be the cause of our issue, but it is something we could improve if a cache is involved.
Our archiving runs in a container that is spun up once an hour on its own EC2 instance; when archiving completes, the container is removed and a new one is created at the next hour. This means that if there is a cache, it is never reused, because each container starts fresh.
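For context, if the cache does turn out to matter, one way to let it survive container recreation might be to mount Matomo’s tmp/ directory (where its file-based caches live) from the EC2 host. This is only a sketch under assumptions: it assumes the official `matomo` Docker image with Matomo installed at /var/www/html, and the host path /opt/matomo-cache is made up for illustration.

```shell
# Sketch (assumptions noted above): persist Matomo's tmp/ directory,
# which holds its file caches, across hourly container runs by
# bind-mounting it from the EC2 host.
docker run --rm \
  -v /opt/matomo-cache/tmp:/var/www/html/tmp \
  matomo \
  php /var/www/html/console core:archive --url=https://example.org/ -vvv
```

With a mount like this, each hourly container would start with the previous run’s tmp/ contents instead of an empty directory, so any file-based cache state would carry over.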
The reason I ask about a cache is that the following appears when running archiving with the -vvv flag, mentioning “General tracker cache was re-created”:
DEBUG [2020-12-09 10:17:11] 169 Earliest created time of segment 'visitorType==returning' w/ idSite = 1 is found to be 2020-09-02. Latest edit time is found to be 2020-09-02.
DEBUG [2020-12-09 10:17:11] 169 process_new_segments_from set to beginning_of_time or cannot recognize value
DEBUG [2020-12-09 10:17:11] 169 General tracker cache was re-created.
This may be totally unrelated, and I don’t think it’s the root cause of our archiving issue, but if a cache is used it would be more efficient to reuse it rather than start with a fresh container each time. This is a fairly new Matomo implementation, so it will definitely need tweaking to get it right.
Is this approach of spinning up a fresh container each hour something that can work well?