Archive.php in Piwik 1.8.2 is running over the same sites when more than one instance is running in parallel


#1

Hi all. We are running a big instance of Piwik (67,000+ sites) on version 1.8.2. We found out it was not correctly configured for this amount of data, so we started running archive.php in the background and stopped the “on the fly” archiving process. Now the problem is that it takes a long time to run through each site.

So we tried to launch a second instance of archive.php, but it seems to run over the same sites as our first instance: it skips all previously processed sites, but then starts processing the one that is already being processed by the first instance. Since we are using an old version, this is probably an old bug. We saw the ticket “when archive.php is run multiple times, each concurrent run should archive different websites” (Issue #3405, matomo-org/matomo on GitHub), but it seems to be about versions above 2, and we cannot upgrade right now.
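To make the behaviour we were hoping for concrete, here is a minimal Python sketch of concurrent runs splitting the site list so that no two runs touch the same site. This is not Piwik code and not how archive.php works in 1.8.2; `partition_sites`, the modulo rule, and the worker count are all hypothetical, and whether archive.php in this version can be restricted to a given list of sites is not something confirmed here.

```python
from typing import Dict, List


def partition_sites(site_ids: List[int], num_workers: int) -> Dict[int, List[int]]:
    """Assign each site ID to exactly one worker using a stable modulo rule.

    Hypothetical helper for illustration only; it is not part of Piwik.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    # One empty shard per worker, keyed by worker index.
    shards: Dict[int, List[int]] = {worker: [] for worker in range(num_workers)}
    for site_id in sorted(site_ids):
        # The same site ID always maps to the same worker, so shards never overlap.
        shards[site_id % num_workers].append(site_id)
    return shards


if __name__ == "__main__":
    # Ten example site IDs split across three hypothetical archiver runs.
    for worker, sites in partition_sites(list(range(1, 11)), num_workers=3).items():
        print(worker, sites)
```

With a static split like this, each run only needs to know its own index and the total number of runs; the trade-off is that one run can end up with all the slow sites while the others sit idle.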

We may be able to move to version 1.8.3, but since our db is very big (20+ GB), moving to 1.8.4 may not be possible because of the schema changes. Is there some fix we can apply to make archive.php work? If not, is there any other way to speed up the archive.php process?

Thanks in advance


(Matthieu Aubry) #2

Upgrade to 2.2.0 ASAP, as we have made countless performance improvements.

For example: “Add possibility to run multiple archiver in parallel” (Issue #4903, matomo-org/matomo on GitHub).


#3

Hi rreyes1979

Don’t be too afraid of the schema changes. You can run the update from the command line and manually apply the db queries through your db management tool. That way, even if it takes a while, it should work fine.

I am currently running a Piwik instance with a db size of 35GB (28GB of which in log_link_visit_action tables) which I upgraded the manual way last year to version 1.12 and it took less than one hour to apply the schema changes.

Make a backup copy (dump) of your Piwik db first and then give it a try.

However, like Matt wrote, parallel running instances of archive.php aren’t supported prior to Piwik 2.1