How to delete specific month of log data after import from GA


#1

W’ve been migrating a large e-commerce site from Piwik to GA using the google2piwik.py script. I know its no longer supported as of a few weeks ago but there is currently no other tool and the business requires this historical data. We’re importing 5 years of data and so far we’ve had many issues with the script hanging on days with a high spike in traffic (i.e. 500k visits) and other general bugs however we’re slowly progressing by importing the data one month at a time and then verifying everything.

However our most recent import for Dec-2012 has the number of visits doubled in Piwik. We’re not sure how that happened but we need to delete the data from the database and re-import it. We’ve looked at the tables but are not sure exactly how to delete this data. Can anyone give us a sql statement to delete a specific month’s data?


#2

Will the queries below work assuming the month with the duplicates is Dec 2012? We would really prefer not to have to start from scratch as we have already imported 2 years of data which has taken a few days.


delete from piwik_log_visit where visit_last_action_time LIKE '2012-12-01%';
delete from piwik_log_link_visit_action where server_time LIKE '2012-12-01%';

Any help would be GREATLY appreciated!


#3

Is this something we need to get paid support to answer?


(Matthieu Aubry) #4

yes you can safely delete the erroneous data from the piwik_log* tables