I was wondering what happens if we import log files with the Python script import_logs.py while a scheduled task that cleans up old visitor logs is also running.

The cleanup of old logs takes about a week to finish here. We have about 1 billion records in our piwik_log_link_visit_action table, covering 45 days. We could reduce the retention by another 10 days, but not more.

So for now, we import log files, archive them, and run a cleanup once a month. While the cleanup is running, we delay importing and archiving.
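A minimal sketch of this kind of gating, assuming the cleanup job creates a lock file when it starts and removes it when it finishes. The lock-file path and the function name are hypothetical and not part of Piwik; the import and archive commands themselves are left out.

```python
import os
import time

# Hypothetical path: assumes the cleanup job creates this file when it starts
# and removes it when it finishes. Piwik does not provide this file itself.
CLEANUP_LOCK = "/tmp/piwik_cleanup.lock"


def wait_for_cleanup(lock_path=CLEANUP_LOCK, poll_seconds=60, max_wait_seconds=None):
    """Block while lock_path exists.

    Returns True once the lock file is gone, or False if max_wait_seconds
    elapsed first (None means wait indefinitely).
    """
    waited = 0
    while os.path.exists(lock_path):
        if max_wait_seconds is not None and waited >= max_wait_seconds:
            return False
        time.sleep(poll_seconds)
        waited += poll_seconds
    return True
```

A wrapper script would call wait_for_cleanup() first and only start the import and archive run when it returns True.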

So is it possible to continue importing new log files while the scheduled cleanup is still running? This would speed up our archiving a lot, and customers would not have to wait for a week once a month.

Has anybody tested this, or does anyone have an idea?

OK, now I am very disappointed. I have helped a lot: I did a lot of beta testing and ran multiple big archives to help Piwik, without asking for anything in return.

I think I will stop this, because the feedback and support from Piwik is below average. After one month I would expect at least one reply. Too bad…

But luckily for us, we have solved this issue ourselves: we have made a fix for Piwik that reduces the cleanup time by about 60%!

Because of the lack of response from Piwik, we have decided not to share the fix for now. Too bad for everybody here…


This is a community forum that is run by volunteers, who answer questions on their own time.
I understand that it’s unfortunate that your post didn’t yield any reactions, but I think that this is mostly because your issue affects only a small percentage of users (with large instances) and as a result most people (including me) simply can’t help because we have no experience with maintaining a large server instance.

Yeah, I have been contributing for the past 5 years! I have done my share already. Anyhow, I am still surprised by the low response. It sounds like this is a secret.

But here is the fix: "Improved archive cleanup with 60% in time!" by theyosh (Pull Request #11988, piwik/piwik on GitHub).

Have fun with it. We are running this in production and do not see any issues.

