OK, so I ran the following test.
I ran this command:
/usr/bin/python /var/www/html/piwik/misc/log-analytics/import_logs.py --url=https://w3stat.unil.ch/piwik/ /var/tmp/stats/app/access_test --idsite=564 --config=/var/www/html/piwik/config/config.ini.php --recorders=2 --log-hostname=www3.unil.ch --hostname=www3.unil.ch --enable-static --enable-bots --enable-http-errors --enable-http-redirects --enable-reverse-dns --strip-query-string --output=/var/log/piwik/test.out
You can have a look at the results for this site on our Piwik instance, https://w3stat.unil.ch/piwik, using piwik/debug4piwik as user/password.
You will see that the Piwik results are wrong both in the visitor log (some IPs are ignored) and in the Actions > Pages report.
I am really confused about that…
I get the same results for all my parsed log files. They all come from an Apache web server using the combined (NCSA…) format…
Below is the content of the access_test log file:
46.229.160.208 - - [22/Jan/2014:00:08:50 +0100] "GET /wpmu/dalai-lama/files/2013/04/dalai_lama_fr.pdf HTTP/1.1" 200 5754305 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9) Gecko"
83.139.189.139 - - [22/Jan/2014:00:08:53 +0100] "GET /wpmu/alumnil/tag/technology/ HTTP/1.0" 200 30463 "http://www3.unil.ch/wpmu/alumnil/tag/technology/" "Mozilla/5.0 (Windows NT 5.2; rv:17.0) Gecko/20100101 Firefox/17.0"
83.139.189.139 - - [22/Jan/2014:00:08:54 +0100] "GET /wpmu/alumnil/participez-a-la-construction-dun-nouvel-avenir-technologique-et-social/ HTTP/1.0" 200 34367 "http://www3.unil.ch/wpmu/alumnil/participez-a-la-construction-dun-nouvel-avenir-technologique-et-social/" "Mozilla/5.0 (Windows NT 5.2; rv:17.0) Gecko/20100101 Firefox/17.0"
83.139.189.139 - - [22/Jan/2014:00:08:55 +0100] "GET /wpmu/alumnil/participez-a-la-construction-dun-nouvel-avenir-technologique-et-social/index.php HTTP/1.0" 301 - "http://www3.unil.ch/index.php" "Mozilla/5.0 (Windows NT 5.2; rv:17.0) Gecko/20100101 Firefox/17.0"
83.139.189.139 - - [22/Jan/2014:00:08:55 +0100] "GET /wpmu/alumnil/participez-a-la-construction-dun-nouvel-avenir-technologique-et-social/index.php HTTP/1.0" 301 - "http://www3.unil.ch/index.php" "Mozilla/5.0 (Windows NT 5.2; rv:17.0) Gecko/20100101 Firefox/17.0"
83.139.189.139 - - [22/Jan/2014:00:08:56 +0100] "GET /wpmu/alumnil/participez-a-la-construction-dun-nouvel-avenir-technologique-et-social/index.php HTTP/1.0" 301 - "http://www3.unil.ch/index.php" "Mozilla/5.0 (Windows NT 5.2; rv:17.0) Gecko/20100101 Firefox/17.0"
150.82.33.22 - - [22/Jan/2014:00:09:05 +0100] "GET /wpmu/pgact/files/2011/02/DPPG_Meeting_2011_PROGRAM_A4_ohne-header.jpg HTTP/1.1" 200 106526 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:26.0) Gecko/20100101 Firefox/26.0"
130.223.16.101 - - [22/Jan/2014:00:09:14 +0100] "GET /wpmu/cinn/feed/ HTTP/1.1" 304 - "-" "Mozilla/5.0 (X11; Linux i686; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 Lightning/2.6.3"
65.55.24.218 - - [22/Jan/2014:00:09:14 +0100] "GET /wpmu/musique/feed/ HTTP/1.1" 200 54282 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
130.223.16.101 - - [22/Jan/2014:00:09:14 +0100] "GET /wpmu/allezsavoir/feed/ HTTP/1.1" 304 - "-" "Mozilla/5.0 (X11; Linux i686; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 Lightning/2.6.3"
65.55.24.218 - - [22/Jan/2014:00:09:15 +0100] "GET /wpmu/fae/fagenda/action~agenda/tag_ids~69971,69999,69988,8732/ HTTP/1.1" 200 73655 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
83.233.207.74 - - [22/Jan/2014:00:11:11 +0100] "GET /wpmu/esvdc/forums/users/foresvdc/ HTTP/1.1" 200 11085 "-" "Mozilla/5.0 (Windows NT 5.1; rv:8.0) Gecko/20100101 Firefox/8.0"
83.233.207.74 - - [22/Jan/2014:00:11:12 +0100] "GET /wpmu/esvdc/forums/forum/research-2/ HTTP/1.1" 200 13327 "-" "Mozilla/5.0 (Windows NT 5.1; rv:8.0) Gecko/20100101 Firefox/8.0"
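As a cross-check that does not depend on Piwik, here is a minimal sketch that counts requests per client IP and per UTC date for a combined-format log like the one above. The names `COMBINED` and `summarize` are my own and are not part of import_logs.py. It assumes Python 3, plain straight double quotes in the file, and an English/C locale for the month abbreviation.

```python
import re
import sys
from collections import Counter
from datetime import datetime, timezone

# NCSA combined format:
# ip ident user [time] "method path proto" status size "referer" "user-agent"
COMBINED = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def summarize(lines):
    """Return (requests per IP, requests per UTC date, unparsable line count)."""
    per_ip, per_utc_date, bad = Counter(), Counter(), 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        m = COMBINED.match(line)
        if m is None:
            bad += 1
            continue
        per_ip[m.group("ip")] += 1
        # %z understands offsets such as +0100; %b assumes English month names
        local = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        utc_day = local.astimezone(timezone.utc).date().isoformat()
        per_utc_date[utc_day] += 1
    return per_ip, per_utc_date, bad


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 check_log.py /var/tmp/stats/app/access_test
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        per_ip, per_utc_date, bad = summarize(fh)
    for ip, count in per_ip.most_common():
        print(ip, count)
    print("UTC dates:", dict(per_utc_date))
    print("unparsable lines:", bad)
```

The per-IP counts can then be compared by hand with the visitor log. Note that the timestamps above are shortly after midnight at +0100, so converted to UTC they fall on the previous day, 2014-01-21.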
And here is the content of /var/log/piwik/test.out:
0 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
Parsing log /var/tmp/stats/app/access_test…
Purging Piwik archives for dates: 2014-01-21
To re-process these reports with your new update data, execute the piwik/misc/cron/archive.php script, or see: How to Set up Auto-Archiving of Your Reports - Analytics Platform - Matomo for more info.
Logs import summary
13 requests imported successfully
2 requests were downloads
0 requests ignored:
0 invalid log lines
0 requests done by bots, search engines, ...
0 HTTP errors
0 HTTP redirects
0 requests to static resources (css, js, ...)
0 requests did not match any known site
0 requests did not match any requested hostname
Website import summary
13 requests imported to 1 sites
1 sites already existed
0 sites were created:
0 distinct hostnames did not match any existing site:
Performance summary
Total time: 0 seconds
Requests imported per second: 37.22 requests per second
Thanks a lot for your help…
Best regards