Manually purging large tables

I am currently using Piwik 1.3 and would like to upgrade to Piwik 1.5.1 in order to get the automatic purging amongst other features.

Unfortunately I have an 8GB Piwik database (3GB _log_visit and 3.6GB _log_link_visit_action tables) on a 10GB filesystem so I fear that there may not be enough space for the schema changes that 1.5.1 will need.

I would like to manually purge the historic data (leaving the last 60 days). Unfortunately the FAQ (Troubleshooting - Analytics Platform - Matomo) has been updated to reference the automatic purging that I don’t currently have.

Can anyone let me know the best way to manually purge the historic data so that I can then embark on the upgrade?

Many thanks in advance,

Neil.

DELETE FROM piwik_log_* WHERE server_time < ‘2011-06-25 00:00:00’

(replacing whatever field holds the DATE in the table you wish to purge)

I’m finding this to be better with respect to the duration of the queries (due to a lack of appropriate indexes on the datetime columns):


SELECT `idvisit` FROM piwik_log_visit WHERE '2011-01-01 00:00:00' > `visit_last_action_time` AND `idvisit` > 0 ORDER BY `idvisit` DESC LIMIT 1

DELETE LOW_PRIORITY QUICK IGNORE FROM piwik_log_conversion WHERE `idvisit` <= ?
DELETE LOW_PRIORITY QUICK IGNORE FROM piwik_log_link_visit_action WHERE `idvisit` <= ?
DELETE LOW_PRIORITY QUICK IGNORE FROM piwik_log_visit WHERE `idvisit` <= ?
DELETE LOW_PRIORITY QUICK IGNORE FROM piwik_log_conversion_item WHERE `idvisit` <= ?

Where ‘?’ is the visit ID returned by the first query.

This is how PrivacyManager handles things. Unfortunately the first query isn’t able to use an index so takes an age to run on a large table. I’ve had a look at the updates between 1.3 and 1.5.1 and can’t see an index being added to visit_last_action_time so this could be a problem for someone upgrading and then enabling the automatic purging. It also means that each time the purging runs, it’s performing a slow query.

It’s an acceptable tradeoff to run a slow query here, since adding an index would reduce performance overall, not worth it

See our new page, which lists the various limits on historical data, number of websites, users, database size, when using Piwik: Data Limits Analytics