Purging old logs manually


#1

I just recently activated automatic log purging for my piwik installation with a 5.5 GB database. I was wondering why it’s not really working, and the database size isn’t decreasing. Than I realized it’s because “only” a maximum of 1 million rows is being purged at a time. However, I have more than 40 million rows in the database. So it will take quiet a long time until the database is really, effectively purged.

is there any way I can speed up this process? Maybe some shell script for example? Or can someone provide the mysql queries to purge all at once?

Also, I was wondering how automatic log purging is actually triggered?


#2

FWIW, I setup the log tables log_visit and log_link_visit_action with mysql table partitions, so I just drop the partitions I no longer need. This is pretty fast :slight_smile:
(I have automatic log purging disabled)


#3

Thanks for the tip! Could you be a bit more precise on how to set this up?


#4

DISCLAIMER: only try this if you know what you are doing, and don’t blame me if it doesn’t work for you, deletes all your piwik data etc.

my process of migrating from single file tables to the partitioned ones was something like this:

[ol]
[li] create new tables with _new appended to the table name
[/li][li] dump data i want to keep into outfile
[/li][li] load infile the dump into the _new tables (note: having indexes already set made this faster then adding them afterwards)
[/li][li] repeat steps 2) and 3) with the new data added since the initial dump (initial dump/import took hours for me - high traffic sites)
[/li][li] take piwik offline
[/li][li] rename tables so piwik will use the partitioned ones
[/li][li] take piwik online
[/li][/ol]

step 6 is very quick on innodb, don’t know if it would be slow on myisam!

I created the log_visit table with this:


PARTITION BY LIST (MONTH(visit_first_action_time))
(PARTITION m01 VALUES IN (1) ENGINE = InnoDB,
 PARTITION m02 VALUES IN (2) ENGINE = InnoDB,
 PARTITION m03 VALUES IN (3) ENGINE = InnoDB,
 PARTITION m04 VALUES IN (4) ENGINE = InnoDB,
 PARTITION m05 VALUES IN (5) ENGINE = InnoDB,
 PARTITION m06 VALUES IN (6) ENGINE = InnoDB,
 PARTITION m07 VALUES IN (7) ENGINE = InnoDB,
 PARTITION m08 VALUES IN (8) ENGINE = InnoDB,
 PARTITION m09 VALUES IN (9) ENGINE = InnoDB,
 PARTITION m10 VALUES IN (10) ENGINE = InnoDB,
 PARTITION m11 VALUES IN (11) ENGINE = InnoDB,
 PARTITION m12 VALUES IN (12) ENGINE = InnoDB)

and the log_link_visit_action with:


PARTITION BY LIST (MONTH(server_time))
(PARTITION m01 VALUES IN (1) ENGINE = InnoDB,
 PARTITION m02 VALUES IN (2) ENGINE = InnoDB,
 PARTITION m03 VALUES IN (3) ENGINE = InnoDB,
 PARTITION m04 VALUES IN (4) ENGINE = InnoDB,
 PARTITION m05 VALUES IN (5) ENGINE = InnoDB,
 PARTITION m06 VALUES IN (6) ENGINE = InnoDB,
 PARTITION m07 VALUES IN (7) ENGINE = InnoDB,
 PARTITION m08 VALUES IN (8) ENGINE = InnoDB,
 PARTITION m09 VALUES IN (9) ENGINE = InnoDB,
 PARTITION m10 VALUES IN (10) ENGINE = InnoDB,
 PARTITION m11 VALUES IN (11) ENGINE = InnoDB,
 PARTITION m12 VALUES IN (12) ENGINE = InnoDB)

then when I want purge a month drop the partition for the month and create a new(empty) one (eg October):


ALTER TABLE traffic_log_visit DROP PARTITION m10;
ALTER TABLE traffic_log_visit ADD PARTITION (PARTITION m10 VALUES IN (10) ENGINE = InnoDB);

ALTER TABLE traffic_log_link_visit_action DROP PARTITION m10;
ALTER TABLE traffic_log_link_visit_action ADD PARTITION (PARTITION m10 VALUES IN (10) ENGINE = InnoDB);


(Matthieu Aubry) #5

Thanks for the report. I created a ticket: Delete logs should have option to delete all logs · Issue #2792 · matomo-org/matomo · GitHub


#6

Thanks metadude, thanks matt!

I did figure out now how to purge the logs manually by direct mysql query. For each table, I just left about 1M rows. Hope I didn’t broke anything, but it didn’t look like that.

However, the queries took quiet some time. For "piwik_log_link_visit_action " it took more than 30 minutes. That can’t be done without a cron script like the archive.sh one, I guess. So just including the option “all” in the setting won’t really help, unless you create such a script.

Oh, and while you are at it, please also include the table “piwik_log_action” in purging. It’s now by far the biggest table, and there is no option to purge it. Or is there any safe way to do this manually, without losing any statistical data?