How to change time of day for deleting old logs?

We’re using Piwik 1.7.1 in a somewhat large installation (about 400k actions/week), and of course our database is getting bigger and bigger. Initially we didn’t setup Piwik to delete old logs, and decided to do so before the database grow out of control. We then setup the software to delete logs 6 months old, 100k at a time, on a daily-basis. We started using Piwik on May 2011 so there’s already a lot of log entries worth do delete.

We then noticed that apparently Piwik is bringing the web server to an halt every time it start to clean up the logs. We can see in our monitoring software that a lot of piwik.php requests starts to pile up when the deletion is being run. This make Apache reach its MaxClient setting very quickly, which of course prevent legitimate users to access all sites hosted on the server until the clean up finishes. Not nice.

The actual problem is that Piwik is running the deletions at 9pm local time. This is the time of day that server access is at peak, thus the worst time to do such resource-intensive task. I searched the interface for a way to change the time of day the deletions occur to no avail. The hint that this may be the very cause of the problem is that “Next scheduled deletion in” always show the time remaining to the next 9pm.

Yesterday, before 9pm, I cheated a little by changing the value of the “lastDelete_piwik_logs” key in the options table directly using SQL. The UI did respond to the change (i.e. next scheduled deletion was the time remaining to 3am, since I changed “lastDelete_piwik_logs” do 3am yesterday) and everything went fine (no downtime yesterday). Today at 10pm, I was notified by our monitor software that the web server stopped responding at… ops… 9pm. Looking at monitor reports I saw lots of stuck piwik.php at time the server stopped. Went to the Piwik UI and to my dismay, the “lastDelete_piwik_logs” went back to display the time remaining to 9pm tomorrow, meaning the clean up did run at 9pm today. sigh

(BTW, It would be good if the “Last deletion was on:” also displayed the time of day the deletion took place, instead of only year/month/day.)

So my question is: how to configure Piwik so that old log deletions occur at a specific time of day, say 3am?

Anyone?

Sort of my server is being DOSed every day by Piwik… About 20 minutes of downtime, which is unacceptable. There’s no clean solution apart from hacking the sources?

Thanks for the message. Currently’ it’s not possible to change the time but we could add it. Please create a bug at: dev.piwik.org with the problem you experience and how changing the hour of execution would maybe fix it. we will answer in the ticket thanks!

Hello,

i am having the exacte same problem ;), it has been some time since this report and i am still running on 1.8.4, was the function to change the date/time implemented?

If not, according to the original poster, if i move the time by setting lastDelete_piwik_logs in the database, should this value not be moved forward by adding delete_logs_schedule_lowest_interval so the next interval should be on 3am as well, if lastDelete_piwik_logs was set to 3am.

BTW my time is 7am, how does piwik chose this time?

I set it to 1am now, and i guess i will know next week if it keeps it at 1 am :slight_smile:

Regards Mike

Mike,

I managed to workaround this issue by periodically running a script that resets the next time to run the deletions. The script is run by cron a few hours after the deletion starts, just to give Piwik time to do its jobs. In my case, I set the log deletion to occur at 3am, and run the script at 5h30am.

(BTW: It looks like the cleanup is set up to run at 00:00 UTC. Thats why it runs at 21h on my server, which default timezone is GMT -0300. I bet it chooses 7am for you because your TZ is +0700.)

Below is my good, little, and dirty script. Adjust “$run_at” to the hour of the day you want the deletion to happen. Don’t forget to specify your actual database name, login and password:

p.s.: It works for me, it may not work for you. Use at your own risk!


<?php

$run_at = 3;

$conn = mysql_connect('localhost', 'XXXXXXXX', 'XXXXXXX');
$conn or die();

mysql_select_db('piwik') or die();

$res = mysql_query("select option_value from `option` where option_name = 'TaskScheduler.timetable'");

$a = mysql_fetch_array($res);
$a or die();

$rec = unserialize($a[0]);

$ltime = localtime(time(), TRUE);

if ( $ltime['tm_hour'] >= $run_at )
     $ltime['tm_mday']++;
$ltime['tm_hour'] = $run_at;
$ltime['tm_min'] = $ltime['tm_sec'] = 0;

$utime = mktime($ltime['tm_hour'], $ltime['tm_min'], $ltime['tm_sec'], $ltime['tm_mon'] + 1, $ltime['tm_mday'], $ltime['tm_year'] + 1900);
assert($utime);

$rec['Piwik_PrivacyManager.deleteLogTables'] = $utime;

mysql_query("update `option` set option_value = '" . mysql_real_escape_string(serialize($rec)) . "'  where option_name = 'TaskScheduler.timetable'");

?>

Hello flaviovs,

thank you for the update, you are spot on, we are running the server GMT+7 , so yes piwik sets the time to UTC 0:00

With regards to your script, what version are you using? I am asking because the script you posted is modifying a parameter that does not seam to be in use in 1.8.4

Piwik_PrivacyManager.deleteLogTables is set to 1348012844 = Wed Sep 19 07:00:44 ICT 2012, so that has not been touched since i moved the server to a new home around this time.

I have however Piwik_PrivacyManager.deleteLogData, but that is always set to UTC 0:00 of the current date. The only date that correspondes to the date/time proposed by the purge log module is the option ‘lastDelete_piwik_logs’

Any advice about what version you use and if you had to make any code changes would be much appreciated. Thank you for taking the time to help me out.

Regards Mike

Dear Matt,

any update on a way to set the execution time manually, or a hint to have it set to 0:00 local time instate of UTC?

Regards Mike

Mike,

Yep, the script was from old 1.8.x days. The fact is that once Piwik catch up with deleting very old logs, it didn’t require much of the server afterward, so I didn’t bother to check if the script was still valid.

You may try to change try the variables you proposed, and check on the Piwik interface (check “Last deletion was on” on Setting > Privacy).

In fact, I have just checked my 1.10.1 installation and saw that Piwik is set to run the deletion on 00:00. sigh

I’ll try to take a look on the script tomorrow. In the meantime, we’ll be glad if you post your results.

BTW, for any Piwik developer reading this: we understand that fixing this problem, providing an UI for ir etc., is a lot of work. But could you please at least enlighten us by providing the correct variable that must be set? Thanks.

It would be very helpful to get an answer to this. Where exactly is the time to purge old log data stored, so that we can modified it?

After turning on the option to delete old log data, the setting said: “Next scheduled deletion in: 12 hours”, but 12 hours have passed and no log data has been deleted. How can I debug this if I can’t figure out what day/time Piwik should be deleting old logs?

Where exactly is the time to purge old log data stored, so that we can modified it?

stored as a serialized array in piwik_option table, see the record “TaskScheduler.timetable”

Thanks matt! Are there plans for making this configurable in the future?