Pruning Archive Tables


(timtrinidad) #1

This was touched upon at forum.piwik.org/index.php?showtopic=954 but never answered:

Is there any reason old data is not removed from the archive tables, especially after the month is over (when, theoretically, its data should no longer change)? We have old archives (with lower idarchive values) that are ignored in favor of fresher archive data, but they are still kept and take up DB space.

Thanks in advance,
Tim


(Gus) #2

Hi,

Data in piwik_log_visit can be deleted the day after it is created. Piwik does not perform this cleanup itself yet, but it should arrive in one of the coming versions.

Archiving records the data day by day, so you can manually delete your old logs.
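
For anyone who wants to try the manual cleanup, here is a rough sketch in Python. It runs against an in-memory SQLite table rather than a real Piwik database, and it makes assumptions: the table is called piwik_log_visit (default prefix), the timestamp column is called visit_last_action_time, and the helper delete_old_visits is made up for this example. Only run something like this on days that have already been archived.

```python
import sqlite3
from datetime import datetime, timedelta

def delete_old_visits(conn, now, keep_days):
    """Delete visits whose last action is older than keep_days before now.

    Hypothetical helper; the column name visit_last_action_time is an
    assumption about the log table layout.
    """
    cutoff = (now - timedelta(days=keep_days)).strftime("%Y-%m-%d %H:%M:%S")
    cur = conn.execute(
        "DELETE FROM piwik_log_visit WHERE visit_last_action_time < ?",
        (cutoff,),
    )
    conn.commit()
    return cur.rowcount

# Simplified stand-in for the real log table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE piwik_log_visit ("
    "idvisit INTEGER PRIMARY KEY, visit_last_action_time TEXT)"
)
conn.executemany(
    "INSERT INTO piwik_log_visit VALUES (?, ?)",
    [
        (1, "2010-04-01 10:00:00"),
        (2, "2010-04-14 23:59:59"),
        (3, "2010-04-15 00:00:00"),
        (4, "2010-04-21 08:30:00"),
    ],
)

# Keep the last 7 days relative to a fixed "now" so the run is repeatable.
deleted = delete_old_visits(conn, datetime(2010, 4, 22), keep_days=7)
kept = [r[0] for r in conn.execute(
    "SELECT idvisit FROM piwik_log_visit ORDER BY idvisit")]
print(deleted, kept)
```

Timestamps are compared as text here, which works because they are stored in a fixed-width "YYYY-MM-DD HH:MM:SS" format.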


(timtrinidad) #3

That should help. I was more worried about the archive_* tables, specifically the archive_blob_* tables. If we’re auto-archiving twice a day with 40MB of blobs being generated each run, each monthly blob table will end up at around 240MB. The problem is that we end up with a number of rows of stale data (e.g. several rows with the same name/date/period combo, only the latest of which is actually used).

Below is a table of row counts for “VisitTime_serverTime”.

name                  COUNT(*)  date1       period
VisitTime_serverTime  1         2010-04-01  1
VisitTime_serverTime  146       2010-04-01  3
VisitTime_serverTime  1         2010-04-02  1
VisitTime_serverTime  1         2010-04-03  1
VisitTime_serverTime  1         2010-04-04  1
VisitTime_serverTime  1         2010-04-05  1
VisitTime_serverTime  1         2010-04-05  2
VisitTime_serverTime  2         2010-04-06  1
VisitTime_serverTime  1         2010-04-07  1
VisitTime_serverTime  1         2010-04-08  1
VisitTime_serverTime  2         2010-04-09  1
VisitTime_serverTime  1         2010-04-10  1
VisitTime_serverTime  1         2010-04-11  1
VisitTime_serverTime  2         2010-04-12  1
VisitTime_serverTime  2         2010-04-13  1
VisitTime_serverTime  3         2010-04-14  1
VisitTime_serverTime  1         2010-04-15  1
VisitTime_serverTime  2         2010-04-16  1
VisitTime_serverTime  1         2010-04-17  1
VisitTime_serverTime  1         2010-04-18  1
VisitTime_serverTime  2         2010-04-19  1
VisitTime_serverTime  2         2010-04-20  1
VisitTime_serverTime  7         2010-04-21  1
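
In case it helps anyone reproduce counts like these, below is a minimal sketch of the kind of GROUP BY query involved. It uses Python with an in-memory SQLite table instead of a live Piwik database. The table name archive_blob_2010_04, the sample rows, and the helper count_copies are assumptions for illustration; a real blob table has more columns (a site id, for instance) that would probably need to be part of the grouping.

```python
import sqlite3

# Hypothetical, simplified stand-in for one monthly blob table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE archive_blob_2010_04 ("
    "idarchive INTEGER, name TEXT, date1 TEXT, period INTEGER)"
)
conn.executemany(
    "INSERT INTO archive_blob_2010_04 VALUES (?, ?, ?, ?)",
    [
        (1, "VisitTime_serverTime", "2010-04-13", 1),
        (5, "VisitTime_serverTime", "2010-04-13", 1),
        (2, "VisitTime_serverTime", "2010-04-14", 1),
        (6, "VisitTime_serverTime", "2010-04-14", 1),
        (9, "VisitTime_serverTime", "2010-04-14", 1),
        (3, "VisitTime_serverTime", "2010-04-15", 1),
    ],
)

def count_copies(conn, name):
    """Return (name, copies, date1, period) for each date/period combo."""
    return conn.execute(
        "SELECT name, COUNT(*), date1, period "
        "FROM archive_blob_2010_04 WHERE name = ? "
        "GROUP BY name, date1, period ORDER BY date1, period",
        (name,),
    ).fetchall()

# Any combo with a count above 1 carries stale copies.
for row in count_copies(conn, "VisitTime_serverTime"):
    print(row)
```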

(Matthieu Aubry) #4

Could you please try 0.6-rc1? The algorithm that deletes old archives has changed in that release. It will only work on future data. It deletes out-of-date archives every 24 hours. Please report whether 0.6-rc1 works fine for you, thanks.
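
For archive rows created before the upgrade, a manual cleanup is one option. The sketch below is not the algorithm used in 0.6-rc1; it is only one way to keep the newest copy (highest idarchive) of each name/date1/period combination, shown on an in-memory SQLite table. The table name, the sample rows, and the helper prune_stale are assumptions for illustration, and a real table may need more columns (such as a site id) in the comparison. Back up the table before trying anything like this on real data.

```python
import sqlite3

def prune_stale(conn, table):
    """Delete rows superseded by a newer copy of the same
    name/date1/period combination; return how many were deleted."""
    # `table` is interpolated into the SQL, so pass trusted names only.
    cur = conn.execute(
        f"DELETE FROM {table} WHERE idarchive < ("
        f"  SELECT MAX(newer.idarchive) FROM {table} AS newer"
        f"  WHERE newer.name = {table}.name"
        f"    AND newer.date1 = {table}.date1"
        f"    AND newer.period = {table}.period)"
    )
    conn.commit()
    return cur.rowcount

# Hypothetical, simplified stand-in for one monthly blob table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE archive_blob_2010_04 ("
    "idarchive INTEGER, name TEXT, date1 TEXT, period INTEGER)"
)
conn.executemany(
    "INSERT INTO archive_blob_2010_04 VALUES (?, ?, ?, ?)",
    [
        (1, "VisitTime_serverTime", "2010-04-14", 1),
        (6, "VisitTime_serverTime", "2010-04-14", 1),
        (9, "VisitTime_serverTime", "2010-04-14", 1),
        (3, "VisitTime_serverTime", "2010-04-15", 1),
    ],
)

deleted = prune_stale(conn, "archive_blob_2010_04")
remaining = [r[0] for r in conn.execute(
    "SELECT idarchive FROM archive_blob_2010_04 ORDER BY idarchive")]
print(deleted, remaining)
```

The comparison is done per combination rather than against a global list of "latest" idarchive values, since one archive run may be the newest for some names and stale for others.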


(hoapq) #5

[quote=timtrinidad @ Apr 22 2010, 01:34 PM]That should help. I was more worried about the archive_* tables, specifically the archive_blob_* tables. [...]

[/quote]

I am trying to understand Piwik, but I don’t have a populated database because I installed it on localhost. Could you share yours with me?

Please reply as soon as possible


(Matthieu Aubry) #6

The feature for automatically deleting old logs older than 7/30/N days is now available in Piwik, under Settings > Privacy > Delete old logs from the database.

This is available in the latest 1.5 RC release; check it out now and report if you have suggestions.