Deleting old logs apparently not working

We’re using piwik 1.12 and had quite a large DB (over 30GB ). I finally got a chance to manually cleanup the logs by running queries like

DELETE log_visit, log_link_visit_action FROM log_visit INNER JOIN log_link_visit_action WHERE log_visit.idvisit = log_link_visit_action.idvisit AND visit_first_action_time <= “2013-02-04”;

which took quite a while, and then running OPTIMIZE on log_visit & log_link_visit_action which took each time nearly two hours.

The database size is now down to 24GB and I’ve set up a daily run of “delete old visitor logs” to delete logs more than 365 days old. This should have run some time late last night / early this morning (as I type it’s 09:40 and the admin now says that the next scheduled deletion is in 14 hours 24 min) but it doesn’t appear to have run because running

select min(visit_first_action_time) from log_visit;

gives me

±-----------------------------+
| min(visit_first_action_time) |
±-----------------------------+
| 2013-02-05 00:00:14 |
±-----------------------------+

Note that my last manual deletion deleted everything <= “2013-02-04”

What causes the piwik scheduled deletion to run when it should? We are running Piwik as a replacement for Urchin so we use no real time tracking - it only post processes log files, which is done daily at midnight after rotating the apache logs. We run import_logs.py for each of 6 websites, and then run archive.php --force-all-websites --force-all-periods.

So, if something in the real time tracking is used to kick off scheduled deletion of old log data, that might explain why it apparently didn’t happen.

Please try with 2.1 RC 301 Moved Permanently

we’ve made many improvements since your version and this may work

While I would like to upgrade to a newer version, I don’t want to do that immediately, so I had decided I’d turn off the “delete old visitor logs” option in Piwik and just run the above SQL query daily.

However, I hadn’t scheduled this yet and I just now had a look at the DB and saw that it appears that piwik’s scheduled deletion has worked, as min(visit_first_action_time) from log_visit is now a year ago. But what struck me much more is the huge reduction database size - down from 24GB to 3GB.

It seems that the the “delete old visitor logs” option in Piwik does rather more than the manual query I mentioned above.

I’m a little concerned that I may have deleted more than I really wanted to, though I do NOT have “Regularly delete old reports from the database” ticked , but I am able to go back and view old reports, so it would seem OK.

Can I be sure that I really didn’t need that 20 or so GB of data ?

I cant say if you’re using old version because there could be bug… we can only support latest version and even please use latest RC: 301 Moved Permanently