On a site I work on, we have a privacy policy that says we will delete all personally identifiable information older than 12 weeks. I’m trying to make sure I’ve done this in Piwik, but I’m a bit confused by the docs.
In this FAQ, it says that the way to purge logs is by turning on the archive.sh script, AND running a custom SQL command [1].
That makes sense to me, but I have archiving set to run once a day, and looking at what the archive code does, it seems like the above SQL is completed by the archive.sh script anyway.
For example, when I look at my logs[2], they go back to the first of each month, and then end. And when I look at the code in archive.sh, it seems to indicate logging for the day, week, and month. And, I don’t have any other commands that would be truncating the logs at the first of each month.
Is this an error in the documentation, or am I missing something? If the archive.sh script is doing what I want it to, then I’m pleased as punch, and we should update the FAQ mentioned above. If not, then I’m really confused.
Help? Thoughts? Developers?
[1] DELETE piwik_log_visit, piwik_log_link_visit_action FROM piwik_log_visit INNER JOIN piwik_log_link_visit_action WHERE piwik_log_visit.idvisit = piwik_log_link_visit_action.idvisit AND visit_server_date <= CURRENT_DATE() - 30;
[2] SELECT visit_server_date FROM piwik_log_visit INNER JOIN piwik_log_link_visit_action WHERE piwik_log_visit.idvisit = piwik_log_link_visit_action.idvisit ORDER BY -visit_server_date;
Still no reply here, though I’ve noticed that 67 people have looked at this post. Any ideas?
My next step will be to file a bug report linking to this post, since I do think this is a documentation bug, but I don’t know the Piwik system nearly as well as others here, so I’d like to avoid putting my foot in my mouth if possible…
I’m still confused then. I am running the archive script, and when I look into my logs with:
SELECT visit_server_date FROM piwik_log_visit INNER JOIN piwik_log_link_visit_action WHERE piwik_log_visit.idvisit = piwik_log_link_visit_action.idvisit ORDER BY -visit_server_date;
I see that they have been truncated at the beginning of the month, which makes it seem like the archive.sh script is doing what the FAQ is saying I must do manually…
I tried looking through the source code to find where the archive script is, but I promptly got lost, since I don’t do PHP normally.
The feature of automatically deleting old older than 7/30/N days is now available in Piwik, under Settings > Privacy > Delete old logs from the database.