A question about archiving


#1

First, thanks for Piwik, a very nice product.
I recently started using Piwik for my website, and I set enable_browser_archiving_triggering = false.
But the data in Visitors -> Times -> “Visits per server time” is not correct:
some hours have no data.
[attachment 423 23.png]
But if I set
always_archive_data_period = 1;
always_archive_data_day = 1;
always_archive_data_range = 1;
and then run archive.sh, the data is correct.
[attachment 424 23-1.jpg]
I do not understand why, so I need some help. Thank you very much.


(Matthieu Aubry) #2

This is expected behavior: when you disable browser archiving, you need to run archive.sh to generate the statistics. See: How to Set up Auto-Archiving of Your Reports - Analytics Platform - Matomo


#3

I have run archive.sh, and the result is still the same.


#4

I think I have a similar problem after updating to 1.6. The only way to archive logs is through browser-based triggering. Running archive.sh gives me output that looks as if there is nothing to do:


Starting Piwik reports archiving...


Archiving period = day for idsite = 1...

Archiving period = week for idsite = 1...

Archiving period = month for idsite = 1...

Archiving period = year for idsite = 1...

Archiving for idsite = 1 done!

[...]

Archiving for idsite = 2 done!

[...]

Archiving period = year for idsite = 3...

Archiving for idsite = 3 done!
Reports archiving finished.
---------------------------
Starting Scheduled tasks...

No data available
Finished Scheduled tasks.


Browser-based triggering works all right, but turning it off and relying on an archive cron job fails. What is stranger, within a short time (maybe a day) the cron job processing changes all the most recent values to zeros. I am running the automatic log purge and archive.sh every hour.


(Matthieu Aubry) #5

Please check your error logs. I suspect you will find some errors, since the archive.sh output looks empty for some websites.


#6

matt, thanks for your reply. Please note that I edited and shortened ([…]) the archive.sh output above; it is the same for all three sites. It looks like there is nothing to archive (no XML summary). The script seems to work all right and finishes its job without any warnings or errors. I have checked the PHP error logs, but there is nothing in them about the Piwik archiving script.

Before updating from 1.5.0 (I think) to the latest version, everything was working perfectly. When updating, I ran the auto-upgrade from the browser to download the sources, executed the database upgrade script from the shell, and added the PHP_BIN variable to archive.sh (as usual). After that, the archiving script refused to cooperate with me any more*.

Of course, switching to browser-based archive triggering “solves” the problem.

*I am not sure, so do not take it as a serious lead, but I think that the very first run of archive.sh, directly after the update, took a lot of time (as always) and gave normal output (XML).

P.S./EDIT: I think that my problem may also be similar to this one: 301 Moved Permanently

P.P.S. The automatic log purge job seems to work (thanks for that :)), but after it finishes, the last two periods in every graph are zeroed.


#7

So strange... tco_tm’s problem looks like mine, but my problem affects only hour 23. It happened last night: only hour 23 has no data.
[attachment 428 23.jpg]

And I also have another question:
scheduled_tasks_min_interval = 3600 is set in global.ini.php.
Is it related to this setting:
[attachment 429 23-1.jpg]
??


(Matthieu Aubry) #8

Does the “Visitor log” work for the recent days?

Do you see entries in log_visit table?

Do you see entries for the recent days in piwik_numeric_2011_11 ?

It should work, but it might indeed be the same problem as in the other post.


(Matthieu Aubry) #9

[quote=moviethief]
And I also have another question:
scheduled_tasks_min_interval = 3600 is set in global.ini.php.
Is it related to this setting:
[attachment 429 23-1.jpg]
??[/quote]

When the UI is used, the config file setting is no longer used and is deprecated, so you can remove it and use the UI only.

As for your hour 23 last night, this is strange indeed. Do you see these visitors in the Visitor Log? In the piwik_log_visit table in the database? Be aware that times there are in UTC. Also, in Piwik, report times follow the timezone of your website, which might be one hour different from what you expected.
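
To make the UTC point concrete, here is a minimal Python sketch (not Piwik code) that converts one hour of a website's local day into the UTC range you would look for in the database. The UTC+8 offset, the example date, and the function name are assumptions chosen purely for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption for illustration only: the website timezone is UTC+8.
SITE_TZ = timezone(timedelta(hours=8))


def site_hour_to_utc_range(day, hour, site_tz=SITE_TZ):
    """Return the UTC start and end of one local hour of a website day."""
    start_local = datetime(day.year, day.month, day.day, hour, tzinfo=site_tz)
    end_local = start_local + timedelta(hours=1)
    return (start_local.astimezone(timezone.utc),
            end_local.astimezone(timezone.utc))


# Hypothetical example date; hour 23 is the hour reported as empty.
start_utc, end_utc = site_hour_to_utc_range(datetime(2011, 11, 22), 23)
print(start_utc.isoformat(), end_utc.isoformat())
# prints: 2011-11-22T15:00:00+00:00 2011-11-22T16:00:00+00:00
```

Under that assumption, visits shown as hour 23 in the report would carry timestamps between 15:00 and 16:00 UTC in the log table.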


#10

I now think that my problem is probably more similar to the other topic (not archiving at all, rather than not showing a specific hour's traffic). Sorry, moviethief. Should I continue posting there?

[quote=matt]
Does the “Visitor log” work for the recent days?
Do you see entries in log_visit table?
Do you see entries for the recent days in piwik_numeric_2011_11 ?[/quote]
What/where is “Visitor log” (oops :S :smiley: )?
My “piwik_log_visit” table is filled with new data as normal (so browser-triggered archiving can use it).
My “piwik_numeric_2011_11” table is not updated by the archive.sh script at all, but the browser-triggered job adds new records there.


#11

Thanks for your replies, matt and tco_tm :slight_smile:
For matt:
Does the “Visitor log” work for the recent days? Yes.
Do you see entries in log_visit table? Yes.
Do you see entries for the recent days in piwik_numeric_2011_11 ? Yes.

The entries are in the table, and the data in all tables is correct; the report just does not show hour 23.
I am in China and my timezone is UTC+8. Is there any relation between hour 23 and UTC?
When I set always_archive_data_period = 1; always_archive_data_day = 1; always_archive_data_range = 1; in global.ini.php
and then run archive.sh, the hour 23 data shows up, so I am back to what I described in my first post.
I will check the data tomorrow to see whether tonight's hour 23 is correct or not.

For tco_tm:
You are welcome. We both have problems with archiving, so maybe they can be resolved the same way :slight_smile:


(Matthieu Aubry) #12

tco_tm, the Visitor Log is described here: Real Time Analytics - Analytics Platform - Matomo

So you are using the archive.sh script? Can you try using archive.php (in the same folder)?


#13

Thanks, matt. I think that my “Visitor Log” is showing proper and current data (although it lacks some CSS styling compared to the official demo, which is strange).

I tried archive.php and it looks like it is processing logs (hurray); the output is below. So, should I switch from archive.sh to archive.php?

[2011-11-23 08:45:35] [c3a90d58] [8.24 Mb] NOTE: in 'reset' mode, we will process all websites with visits in the last 7 days 0 hours
[2011-11-23 08:45:36] [c3a90d58] [8.37 Mb] ---------------------------
[2011-11-23 08:45:36] [c3a90d58] [8.37 Mb] INIT
[2011-11-23 08:45:36] [c3a90d58] [8.37 Mb] Querying Piwik API at: http://dev.mazdaspeed.pl/stats/index.php
[2011-11-23 08:45:36] [c3a90d58] [8.51 Mb] Running as Super User: klub
[2011-11-23 08:45:36] [c3a90d58] [8.52 Mb] Notes
[2011-11-23 08:45:36] [c3a90d58] [8.52 Mb] - Reports for today will be processed at most every 1800 seconds. You can change this value in Piwik UI > Settings > General Settings
[2011-11-23 08:45:36] [c3a90d58] [8.52 Mb] - Reports for the current week/month/year will be refreshed at most every 3600 seconds
[2011-11-23 08:45:36] [c3a90d58] [8.52 Mb] Will process 3 websites with new visits since 7 days 0 hours , IDs: 1, 2, 3
[2011-11-23 08:45:36] [c3a90d58] [8.52 Mb] Segments to pre-process for each website and each period: none
[2011-11-23 08:45:36] [c3a90d58] [8.52 Mb] ---------------------------
[2011-11-23 08:45:36] [c3a90d58] [8.52 Mb] START
[2011-11-23 08:45:36] [c3a90d58] [8.52 Mb] Starting Piwik reports archiving...
[2011-11-23 08:45:36] [c3a90d58] [8.54 Mb] Archived website id = 1, period = day, Time elapsed: 0.502s
[2011-11-23 08:45:37] [c3a90d58] [8.55 Mb] Archived website id = 1, period = week, 360357 visits, Time elapsed: 1.057s
[2011-11-23 08:45:40] [c3a90d58] [8.55 Mb] Archived website id = 1, period = month, 1231509 visits, Time elapsed: 3.004s
[2011-11-23 08:45:47] [c3a90d58] [8.55 Mb] Archived website id = 1, period = year, 1231509 visits, Time elapsed: 6.997s
[2011-11-23 08:45:47] [c3a90d58] [8.53 Mb] Archived website id = 1, today = 108 visits, 4 API requests, Time elapsed: 11.561s [1/3 done]
[2011-11-23 08:45:47] [c3a90d58] [8.54 Mb] Archived website id = 2, period = day, Time elapsed: 0.163s
[2011-11-23 08:45:50] [c3a90d58] [8.55 Mb] Archived website id = 2, period = week, 3829413 visits, Time elapsed: 2.531s
[2011-11-23 08:46:22] [c3a90d58] [8.55 Mb] Archived website id = 2, period = month, 10872906 visits, Time elapsed: 32.452s
[2011-11-23 08:47:00] [c3a90d58] [8.54 Mb] ERROR: Got invalid response from API request: http://dev.mazdaspeed.pl/stats/index.php?module=API&method=VisitsSummary.getVisits&idSite=2&period=year&date=last52&format=php&token_auth=[...]&trigger=archivephp. Response was ''
[2011-11-23 08:47:00] [c3a90d58] [8.54 Mb] Archived website id = 2, period = year, 0 visits, Time elapsed: 38.073s
[2011-11-23 08:47:00] [c3a90d58] [8.54 Mb] Archived website id = 2, today = 1275 visits, 4 API requests, Time elapsed: 73.219s [2/3 done]
[2011-11-23 08:47:01] [c3a90d58] [8.58 Mb] ERROR: SQLSTATE[HY000]: General error: 2006 MySQL server has gone away

Fatal error: SQLSTATE[HY000]: General error: 2006 MySQL server has gone away in /var/www/pl.mazdaspeed.dev/http/stats/misc/cron/archive.php on line 179



(Matthieu Aubry) #14

Sounds good! archive.php is in beta but will eventually replace archive.sh. See: New optimized archive.php script for faster and optimized archiving when hundreds/thousands of websites · Issue #2327 · matomo-org/matomo · GitHub


#15

I think I have found the reason why hour 23 has no data.
I changed the Piwik data import from web requests to the Apache log, because the JS tracking writes a lot of information to the Apache access log.
I import the previous hour's data during the current hour, and I run archive.sh at one minute past the next hour. So at 0:00 the script processes the previous day's hour 22 data, and by 1:00 the script thinks the previous day is already done, so hour 23 stays at zero.
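
The timing problem described above can be sketched in a few lines of Python. This is only a simplified model under stated assumptions, not Piwik's actual archiving logic: the function name and the `import_minute` value are hypothetical (the exact import time is not given in this thread), and the model assumes that the first archive run after midnight finalizes the previous day and later runs do not reprocess it.

```python
from datetime import datetime, timedelta


def hours_missing_from_final_archive(day, import_minute, archive_minute):
    """Return the hours of `day` not yet imported when the day is archived
    for the first time after midnight.

    Assumed model (not Piwik's real code):
      * the log for hour H is imported at (H + 1):import_minute
      * the archiver runs every hour at archive_minute
      * the first run after midnight finalizes the previous day
    """
    midnight = day + timedelta(days=1)
    first_archive_run = midnight + timedelta(minutes=archive_minute)
    missing = []
    for hour in range(24):
        hour_end = day + timedelta(hours=hour + 1)
        imported_at = hour_end + timedelta(minutes=import_minute)
        if imported_at > first_archive_run:
            missing.append(hour)
    return missing


day = datetime(2011, 11, 22)  # hypothetical example date
# Import at minute 30 (assumed), archive at minute 1 (as in the post above).
print(hours_missing_from_final_archive(day, import_minute=30, archive_minute=1))
# prints: [23]
# Running the archiver after the import closes the gap in this model.
print(hours_missing_from_final_archive(day, import_minute=30, archive_minute=45))
# prints: []
```

In this model, scheduling the archive run after the hourly import completes is enough to include hour 23. Forcing the day to be reprocessed would have the same effect, which would be consistent with the earlier observation that hour 23 appeared once always_archive_data_day = 1 was set.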