Auto archiving, old logs and database size

We have been using Piwik for a little over two years. We started tracking one website. Over the years, we reached a point where we were tracking seven sites. All the sites were low volume (less than 100 visits/day) except for one. The one “high” volume site averages close to 1,000 visits/day over the course of a month.

Our database has grown to about 500MB and I thought that it might be time to think about cleaning up the database to get it down to a more manageable size. As best as I can determine there are three key FAQs that talk to this:

When I enable “Purge old logs” in the Privacy settings, how do I make sure that no historical data is lost?
How do I delete all statistics for a given websites, or for all websites?
How to setup auto archiving of your reports?

If I missed the answers to the questions below, forgive me. If someone can point me in the right direction, I would be grateful.

  1. In the Delete old visitor logs Privacy setting, it says, “When you enable automatic log deletion, you must ensure that all previous daily reports have been processed, so that no data is lost.”

What does that mean? Does that mean that we have to run the archiving script as close as possible to the time that the purge happens? What time does the purge take place? If we run the archiving script a couple of times a day, are we good, or do we still run the risk of losing data? Or is that a warning for people who trigger the archiving from a browser?

  2. Our archive script takes 45-60 minutes to run for our nine sites. If we turn off archiving triggered from the browser and only run the script once a day, I assume that we will only be able to see the reports up to the last time the archive script was run (I’ll always be up to 24 hours behind). Is that right?

  3. Does the archive script re-create (drop/add) the piwik_archive_* tables each time it runs?

Thanks in advance for the help.

  1. You choose how often the purge is run, but in any case, if you run archive.php once a day it is safe to delete logs, since we only delete logs that are at least 7 days old.

  2. Yes, up to 24 hours behind, often less.

  3. No, it adds new pre-processed reports.
    PS: you should use archive.php, not archive.sh.
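
The reasoning behind answer 1 can be sketched as simple arithmetic. This is only an illustration, not Piwik code: the function name `purge_is_safe` and its parameters are made up for this example, and it assumes the purge only ever removes logs older than the configured minimum age.

```python
from datetime import timedelta


def purge_is_safe(archive_interval: timedelta,
                  min_log_age: timedelta,
                  run_duration: timedelta = timedelta(0)) -> bool:
    """Return True if every visit is archived before it can be purged.

    Hypothetical helper, not part of Piwik. A visit logged just after an
    archive run starts waits at most one interval, plus the time the next
    run takes, before it is included in a report. The purge is safe as
    long as that worst-case wait is shorter than the minimum log age.
    """
    worst_case_wait = archive_interval + run_duration
    return worst_case_wait < min_log_age


# Daily archiving, a one-hour run, and the 7-day minimum mentioned above.
print(purge_is_safe(timedelta(days=1), timedelta(days=7), timedelta(hours=1)))

# Archiving only every 10 days would leave logs purged before processing.
print(purge_is_safe(timedelta(days=10), timedelta(days=7)))
```

With a daily run the worst-case wait is about a day, well under the 7-day minimum, which is why running the archive script once a day is enough.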

Matt, thank you for the quick response again. I tried to run archive.php, but I received too many errors (I don’t remember if they were 404 or 500 internal server errors), so I tried archive.sh, which ran without error. So, is archive.sh not an option for auto archiving? Will only archive.php do?