Duplicate entries in archive table


#1

Since I activated the auto-archive script, I have seen my database growing… I noticed that the archive_blob_2011_01 and archive_numeric_2011_01 tables seem to grow as if they were filled with duplicated data at every auto-archiving run…
I also activated the purge of old log_visit entries, so the log now starts at 30 March. Why does Piwik continue to duplicate (only) the 2011_01 entries?
What can I do to fix this problem?
Thank you


(Matthieu Aubry) #2

Duplicate records in archive_* tables should be cleaned up every few days. If you see older records, there may be a problem.


#3

For the other months whose logs have already been purged from log_visit (2010_12, 2011_02), the last archiving operation was on 30 July, the date when I first ran the archive.sh script.
But for 2011_01 I see other data archived after that date, starting on 27 September, and new data is added at every auto-archiving run. I haven’t made any changes myself to the server configuration (I’m on shared hosting) or to Piwik, so I have no idea why it’s duplicating data only for that month, or where it’s taking that data from, since log_visit doesn’t have any old data from January…
I’m going to manually delete all records in the 2011_01 archive tables that were created after 30 July. Is there something else I can purge to make sure Piwik doesn’t duplicate data anymore?

Thank you.

EDIT
Looking at the latest records in archive_numeric_2011_01, I see that the values of nb_visits and nb_actions for period=4, with start date 31/12/2010 and end date 1/1/2011, grow by 2 or 3 at every archive operation… that’s why the table is growing. However, I checked log_visit and there’s no entry with first_action or last_action = 1/1/2011… ??
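
For anyone who wants to check the same thing, here is a minimal sketch of how repeated rows for one period can be spotted. It runs against an in-memory SQLite table rather than a real Piwik database; the column names (idarchive, name, idsite, date1, date2, period, ts_archived, value) are my assumption of the archive_numeric layout, and the sample rows, dates and values are invented purely for illustration.

```python
import sqlite3

# In-memory stand-in for an archive table; NOT a real Piwik database.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE archive_numeric_2011_01 (
           idarchive INTEGER, name TEXT, idsite INTEGER,
           date1 TEXT, date2 TEXT, period INTEGER,
           ts_archived TEXT, value REAL)"""
)

# Invented sample rows: two copies of a yearly (period=4) metric written on
# different days, plus one monthly (period=3) row that has no duplicate.
sample_rows = [
    (1, "nb_visits", 1, "2011-01-01", "2011-12-31", 4, "2011-09-27 12:10:00", 19000),
    (2, "nb_visits", 1, "2011-01-01", "2011-12-31", 4, "2011-09-28 12:10:00", 19003),
    (3, "nb_visits", 1, "2011-01-01", "2011-01-31", 3, "2011-07-30 12:10:00", 500),
]
conn.executemany(
    "INSERT INTO archive_numeric_2011_01 VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    sample_rows,
)


def find_duplicates(connection):
    """Return (name, idsite, period, date1, date2, copies) for every metric
    that is stored more than once for the same site and period."""
    query = """
        SELECT name, idsite, period, date1, date2, COUNT(*) AS copies
        FROM archive_numeric_2011_01
        GROUP BY name, idsite, period, date1, date2
        HAVING COUNT(*) > 1
        ORDER BY copies DESC
    """
    return connection.execute(query).fetchall()


for row in find_duplicates(conn):
    print(row)
```

Grouping by metric name, site and period, and keeping only groups with more than one row, shows which period keeps being rewritten without the older copies being removed.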


#4

This may be related: I noticed that I started getting the “memory exhausted” error when the script tries to archive the year period.
I tried running the new PHP script, but I get the same error (so if one of the purposes of the new PHP script was to avoid the memory exhausted error, it doesn’t work…).
However, once again, I cannot figure out why this error (if my problem is related to it) would add new visits to the day 1/1/2011.


X-Powered-By: PHP/5.2.16
Content-type: text/html

[2011-10-23 12:10:01] [f7a66ef9] [8.56 Mb] NOTE: in 'reset' mode, we will process all websites with visits in the last 7 days 0 hours
[2011-10-23 12:10:01] [f7a66ef9] [8.70 Mb] ---------------------------
[2011-10-23 12:10:01] [f7a66ef9] [8.70 Mb] INIT
[2011-10-23 12:10:01] [f7a66ef9] [8.70 Mb] Querying Piwik API at: xxxx
[2011-10-23 12:10:03] [f7a66ef9] [8.84 Mb] Running as Super User: xxxxx
[2011-10-23 12:10:03] [f7a66ef9] [8.85 Mb] Notes
[2011-10-23 12:10:03] [f7a66ef9] [8.85 Mb] - Reports for today will be processed at most every 1800 seconds. You can change this value in Piwik UI > Settings > General Settings
[2011-10-23 12:10:03] [f7a66ef9] [8.85 Mb] - Reports for the current week/month/year will be refreshed at most every 6200 seconds
[2011-10-23 12:10:03] [f7a66ef9] [8.85 Mb] Will process 1 websites with new visits since 7 days 0 hours , IDs: 1
[2011-10-23 12:10:03] [f7a66ef9] [8.85 Mb] Segments to pre-process for each website and each period: none
[2011-10-23 12:10:03] [f7a66ef9] [8.85 Mb] ---------------------------
[2011-10-23 12:10:03] [f7a66ef9] [8.85 Mb] START
[2011-10-23 12:10:03] [f7a66ef9] [8.85 Mb] Starting Piwik reports archiving...
[2011-10-23 12:10:03] [f7a66ef9] [8.87 Mb] Archived website id = 1, period = day, Time elapsed: 0.730s
[2011-10-23 12:10:06] [f7a66ef9] [8.89 Mb] Archived website id = 1, period = week, 19437 visits, Time elapsed: 2.262s
[2011-10-23 12:10:10] [f7a66ef9] [8.89 Mb] Archived website id = 1, period = month, 19448 visits, Time elapsed: 4.004s
[2011-10-23 12:10:11] [f7a66ef9] [8.88 Mb] ERROR: Got invalid response from API request: xxxx?module=API&method=VisitsSummary.getVisits&idSite=1&period=year&date=last52&format=php&token_auth=xxxx&trigger=archivephp. Response was '<br /> <b>Fatal error</b>:  Allowed memory size of 67108864 bytes exhausted (tried to allocate 8208 bytes) in <b>/xxxx/piwik/core/DataTable.php</b> on line <b>1022</b><br /> '
[2011-10-23 12:10:11] [f7a66ef9] [8.88 Mb] Archived website id = 1, period = year, 0 visits, Time elapsed: 1.704s
[2011-10-23 12:10:11] [f7a66ef9] [8.87 Mb] Archived website id = 1, today = 32 visits, 4 API requests, Time elapsed: 8.701s [1/1 done]
[2011-10-23 12:10:11] [f7a66ef9] [8.87 Mb] Done archiving!
[2011-10-23 12:10:11] [f7a66ef9] [8.87 Mb] ---------------------------
[2011-10-23 12:10:11] [f7a66ef9] [8.87 Mb] SUMMARY
[2011-10-23 12:10:11] [f7a66ef9] [8.87 Mb] Total daily visits archived: 32
[2011-10-23 12:10:11] [f7a66ef9] [8.87 Mb] Archived today's reports for 1 websites
[2011-10-23 12:10:11] [f7a66ef9] [8.87 Mb] Archived week/month/year for 1 websites. 
[2011-10-23 12:10:11] [f7a66ef9] [8.87 Mb] Skipped 0 websites: no new visit since the last script execution
[2011-10-23 12:10:11] [f7a66ef9] [8.87 Mb] Skipped 0 websites day archiving: existing daily reports are less than 1800 seconds old
[2011-10-23 12:10:11] [f7a66ef9] [8.87 Mb] Skipped 0 websites week/month/year archiving: existing periods reports are less than 6200 seconds old
[2011-10-23 12:10:11] [f7a66ef9] [8.87 Mb] Total API requests: 4
[2011-10-23 12:10:11] [f7a66ef9] [8.87 Mb] done: 1/1 100%, 32 v, 1 wtoday, 1 wperiods, 4 req, 8702 ms, 1 errors. eg. 'Got invalid response from API request: xxxx?module=API&method=VisitsSummary.getVisits&idSite=1&period=year&date=last52&format=php&token_auth=xxxx&trigger=archivephp. Response was '<br /> <b>Fatal error</b>:  Allowed memory size of 67108864 bytes exhausted (tried to allocate 8208 bytes) in <b>/xxxx/piwik/core/Da'
[2011-10-23 12:10:11] [f7a66ef9] [8.87 Mb] Time elapsed: 8.702s
[2011-10-23 12:10:11] [f7a66ef9] [8.87 Mb] ---------------------------
[2011-10-23 12:10:11] [f7a66ef9] [8.87 Mb] SCHEDULED TASKS
[2011-10-23 12:10:11] [f7a66ef9] [8.87 Mb] Starting Scheduled tasks... 
[2011-10-23 12:10:12] [f7a66ef9] [8.87 Mb] done
[2011-10-23 12:10:12] [f7a66ef9] [8.86 Mb] ---------------------------
[2011-10-23 12:10:12] [f7a66ef9] [8.86 Mb] SUMMARY OF ERRORS
[2011-10-23 12:10:12] [f7a66ef9] [8.86 Mb] Error: Got invalid response from API request: xxxx?module=API&method=VisitsSummary.getVisits&idSite=1&period=year&date=last52&format=php&token_auth=xxxx&trigger=archivephp. Response was '<br /> <b>Fatal error</b>:  Allowed memory size of 67108864 bytes exhausted (tried to allocate 8208 bytes) in <b>/xxxx/piwik/core/Da
[2011-10-23 12:10:12] [f7a66ef9] [8.86 Mb] 1 total errors during this script execution, please investigate and try and fix these errors
[2011-10-23 12:10:12] [f7a66ef9] [8.86 Mb] ERROR: 1 total errors during this script execution, please investigate and try and fix these errors. First error was: Got invalid response from API request: xxxx?module=API&method=VisitsSummary.getVisits&idSite=1&period=year&date=last52&format=php&token_auth=xxxx&trigger=archivephp. Response was '<br /> <b>Fatal error</b>:  Allowed memory size of 67108864 bytes exhausted (tried to allocate 8208 bytes) in <b>/xxxx/piwik/core/Da
<br />
<b>Fatal error</b>:  1 total errors during this script execution, please investigate and try and fix these errors. First error was: Got invalid response from API request: xxxx?module=API&method=VisitsSummary.getVisits&idSite=1&period=year&date=last52&format=php&token_auth=xxxx&trigger=archivephp. Response was '<br />
<b>Fatal error</b>:  Allowed memory size of 67108864 bytes exhausted (tried to allocate 8208 bytes) in <b>/xxxx/piwik/core/Da in <b>/xxxx/piwik/misc/cron/archive.php</b> on line <b>179</b><br /> 


#5

Update:
I asked my host to raise the PHP memory limit to 128 MB, and now the auto-archiving script works.

However, I saw new data being added to the 2011_01 archive as before. BUT now I also saw a “strange” table, 2010_01: strange because in January 2010 my site didn’t exist… I realized that the new data added to the 2010_01 and 2011_01 archive tables isn’t related to the monthly statistics, but to the yearly statistics! (maybe clear to developers, but not to me)
That is why my 2011_01 table was filled at every cron job. I think the memory exhausted error let the script write the new data into the table, but made it crash before cleaning up the old records. I hope this won’t happen again with the new memory limit; I will check tomorrow.


(Matthieu Aubry) #6

The 2011_01 table is used to store the “Yearly” reports, so it’s expected to be there even though your site was created later on.
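
As a rough illustration of why a yearly report ends up in a January table, here is a small sketch. It assumes the archive table is chosen from the year and month of the period's start date; that rule and the function names are assumptions for illustration, not taken from the Piwik source.

```python
from datetime import date


def archive_table_suffix(period_start: date) -> str:
    """Hypothetical rule: the table suffix is the year and month of the
    first day of the period being archived."""
    return f"{period_start.year:04d}_{period_start.month:02d}"


def year_period_start(year: int) -> date:
    """A yearly period starts on 1 January of that year."""
    return date(year, 1, 1)


# A yearly report always starts in January, so it maps to the *_YYYY_01 table,
# whether or not the site had any visits in that month.
print(archive_table_suffix(year_period_start(2011)))
# A monthly report for February maps to its own month.
print(archive_table_suffix(date(2011, 2, 1)))
```

Under that assumption, a yearly period for 2010 would map to 2010_01 in the same way, even for a site that did not exist yet in January 2010.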


#7

I just checked and my Piwik is now working well.
A note for developers tracking down the infamous “PHP memory exceeded” error: it may be interesting to know that the error is triggered between the write of the new yearly data and the deletion of the old yearly data. In my case the yearly statistics were correctly computed and added to the 2011_01 archive tables, but the memory was exhausted before the corresponding old rows were deleted, causing the table to grow at every cron job.
Maybe the memory could be freed after writing the new data into the tables and before running the query that deletes the old data?
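
To make the failure mode concrete, here is a minimal simulation of the "write new rows, then delete old rows" sequence as I understand it from the behaviour above. It is not Piwik's code: the table is a simplified in-memory SQLite stand-in, the function names are hypothetical, and the "crash" is just an early return that stands in for the fatal memory error.

```python
import sqlite3


def archive_year(conn, value, crash_before_cleanup=False):
    """Write a fresh yearly row, then delete older rows for the same metric."""
    cur = conn.execute(
        "INSERT INTO archive_numeric (name, period, value) "
        "VALUES ('nb_visits', 4, ?)",
        (value,),
    )
    new_id = cur.lastrowid
    conn.commit()
    if crash_before_cleanup:
        return  # stands in for the fatal error: the cleanup below never runs
    conn.execute(
        "DELETE FROM archive_numeric "
        "WHERE name = 'nb_visits' AND period = 4 AND idarchive <> ?",
        (new_id,),
    )
    conn.commit()


def simulate(runs, crash):
    """Run the archiving step several times and return the final row count."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE archive_numeric ("
        "idarchive INTEGER PRIMARY KEY, name TEXT, period INTEGER, value INTEGER)"
    )
    for i in range(runs):
        archive_year(conn, 19000 + i, crash_before_cleanup=crash)
    return conn.execute("SELECT COUNT(*) FROM archive_numeric").fetchone()[0]


print("rows after 5 healthy runs:", simulate(5, crash=False))
print("rows after 5 crashing runs:", simulate(5, crash=True))
```

When the cleanup step is reached, only the newest row survives each run; when every run dies before it, one extra row is left behind per cron job, which matches the growth I was seeing.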