[SOLVED] Archiving old data only gets ~10k rows for prev 30 days


#1

Hi there, I’ve loaded over 150K rows of exported Google Analytics page views covering the past 13 months. I’ve tried every way I can think of to get the data to show on the dashboard, but it only shows the previous 30 days, about 10k rows.

I’ve run core:archive with values as high as --force-all-periods=31557600000 and --force-date-last-n=10000, but the log still reports only about 15k visits:

INFO CoreConsole[2014-11-10 17:04:25] START
INFO CoreConsole[2014-11-10 17:04:25] Starting Piwik reports archiving…
INFO CoreConsole[2014-11-10 17:04:30] Archived website id = 4, period = day, 15550 visits in last last100000 days, 276 visits today, Time elapsed: 4.680s
INFO CoreConsole[2014-11-10 17:04:32] Archived website id = 4, period = week, 15550 visits in last last100000 weeks, 276 visits this week, Time elapsed: 2.177s
INFO CoreConsole[2014-11-10 17:04:38] Archived website id = 4, period = month, 15550 visits in last last100000 months, 5888 visits this month, Time elapsed: 6.426s
INFO CoreConsole[2014-11-10 17:04:41] Archived website id = 4, period = year, 15550 visits in last last100000 years, 15550 visits this year, Time elapsed: 2.410s
INFO CoreConsole[2014-11-10 17:04:41] Archived website id = 4, 4 API requests, Time elapsed: 15.710s [1/1 done]

I can see the data in the log visits table but not in the UI. I’ve deleted the archive tables, multiple times.

I’ve tried running the archive from the command line and changing the ini parameters to archive on login.

The 150k rows that I am adding all have the same idvisit, config_id, location_ip, location_longitude, city, etc. Maybe that is a clue as to why it is ignoring the data older than 30 days.

regards,
Royce


#2

I think I figured out the solution. I only started using Piwik at the end of last month, so the ‘ts_created’ field in the piwik_sites table had a date of ‘2014-10-21’. I had to manually change that date to the earliest date in our log files, and that forced the archive to load all of our data!
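For anyone hitting the same thing, here is a minimal sketch of that fix in Python. It is only an illustration under stated assumptions: the table names `piwik_site` and `piwik_log_visit` and the column `visit_first_action_time` are assumed (check your own schema and table prefix), the sample dates are made up apart from 2014-10-21, and an in-memory SQLite database stands in for the real MySQL one so the snippet runs on its own. Back up your database before changing anything by hand.

```python
import sqlite3


def align_site_creation_date(conn, idsite):
    """Move ts_created back to the earliest logged visit for one site.

    Returns the earliest visit timestamp, or None if the site has no visits.
    Table and column names are assumptions; adjust them to your schema.
    """
    cur = conn.cursor()
    cur.execute(
        "SELECT MIN(visit_first_action_time) FROM piwik_log_visit WHERE idsite = ?",
        (idsite,),
    )
    earliest = cur.fetchone()[0]
    if earliest is None:
        return None  # nothing logged for this site, leave ts_created alone

    # Only ever move the creation date backwards, never forwards.
    cur.execute(
        "UPDATE piwik_site SET ts_created = ? WHERE idsite = ? AND ts_created > ?",
        (earliest, idsite, earliest),
    )
    conn.commit()
    return earliest


# Stand-in database with hypothetical sample rows (site id 4, as in the log above).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE piwik_site (idsite INTEGER PRIMARY KEY, ts_created TEXT);
    CREATE TABLE piwik_log_visit (
        idvisit INTEGER PRIMARY KEY,
        idsite INTEGER,
        visit_first_action_time TEXT
    );
    INSERT INTO piwik_site VALUES (4, '2014-10-21 00:00:00');
    INSERT INTO piwik_log_visit VALUES (1, 4, '2013-10-05 08:15:00');
    INSERT INTO piwik_log_visit VALUES (2, 4, '2014-11-10 17:00:00');
    """
)

print(align_site_creation_date(conn, 4))
```

After a change like this, the archive still has to be re-run (and old archive tables cleared, as I did) before the older data shows up in the UI.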

I spent 4 days trying many things to get this to work and I only found it when I resorted to using a debugger and stepping through the code while running the archive :wink:

regards,
Royce


(Matthieu Aubry) #3

Yes good point! Good find also… :slight_smile:


#4

Hi Matt, would you consider changing how this works? Is there a reason why it looks at the created date of the site rather than just processing all of the data in the visit table?

regards,
Royce