Apache CustomLog or import_logs.py Error


#1

I am running Piwik 1.9.2 on a RHEL 5.7 server running Apache.

I am trying to set up the Apache CustomLog that imports directly into Piwik, as described in this README. I am not sure whether I have a problem with my configuration or whether there is a bug in the Piwik import_logs.py script. After some poking around on the command line, it seems that the script works perfectly when it is given an entire file, but it crashes when I pipe a single line from a log file into it. I have included my command-line output below. Any help would be greatly appreciated, and please let me know if you need any additional information.

First, here is the first line of my log file, to show its format:


[katonj@mimir2:log-analytics ] $ head -1 boarddev-beta.teradyne.com.log
boarddev-beta.teradyne.com 131.101.52.31 - - [12/Nov/2012:11:16:24 -0500] "GET /boarddev/ HTTP/1.1" 200 10541 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.4 (KHTML, like Gecko) Chrome/22.0.1229.94 Safari/537.4"

Now, when I run the script as the Apache configuration suggests, I get the following (note: if I do not put the “-” at the end of the command, the line from the log file is ignored and the script simply outputs the README file):


[katonj@mimir2:log-analytics ] $ head -1 boarddev-beta.teradyne.com.log | ./import_logs.py  --add-sites-new-hosts --config=../../config/config.ini.php --url='http://boarddev-beta.teradyne.com/analytics/' -
0 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
Parsing log (stdin)...
Traceback (most recent call last):
  File "./import_logs.py", line 1462, in <module>
    main()
  File "./import_logs.py", line 1426, in main
    parser.parse(filename)
  File "./import_logs.py", line 1299, in parse
    file.seek(0)
IOError: [Errno 29] Illegal seek

Finally, if I pass the file itself to the script, I get the following, which shows that it happily processes the log file as long as it is given the entire file at once:


[katonj@mimir2:log-analytics ] $ ./import_logs.py  --add-sites-new-hosts --config=../../config/config.ini.php --url='http://boarddev-beta.teradyne.com/analytics/' boarddev-beta.teradyne.com.log
0 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
Parsing log boarddev-beta.teradyne.com.log...
Purging Piwik archives for dates: 2012-11-12
To re-process these reports with your new update data, execute the piwik/misc/cron/archive.php script, or see: http://piwik.org/setup-auto-archiving/ for more info.


Logs import summary
-------------------

    8 requests imported successfully
    0 requests were downloads
    0 requests ignored:
        0 invalid log lines
        0 requests done by bots, search engines, ...
        0 HTTP errors
        0 HTTP redirects
        0 requests to static resources (css, js, ...)
        0 requests did not match any known site
        0 requests did not match any requested hostname

Website import summary
----------------------

    8 requests imported to 1 sites
        1 sites already existed
        0 sites were created:

    0 distinct hostnames did not match any existing site:



Performance summary
-------------------

    Total time: 0 seconds
    Requests imported per second: 24.01 requests per second


(Matthieu Aubry) #2

It is possible there is a bug. Can you please report it in: Log analytics list of improvements · Issue #3163 · matomo-org/matomo · GitHub?


(Matthieu Aubry) #3

This bug has been fixed in: Merge pull request #294 from etmatrix/master · matomo-org/matomo@7513fef · GitHub

See ticket: import_logs give a IOError: [Errno 29] Illegal seek when receiving log from pipe · Issue #5254 · matomo-org/matomo · GitHub

Please test the patch and report back if you still have a problem.
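
For anyone hitting the same traceback: the error comes from calling seek(0) on stdin when stdin is a pipe, because a pipe cannot be rewound. The sketch below only illustrates that failure mode and one defensive way to handle it. It is not the code from the patch, and the helper name rewind_if_possible is hypothetical. It is written for Python 3, where the failure surfaces as an OSError subclass rather than the IOError shown above.

```python
import io
import os


def rewind_if_possible(stream):
    """Hypothetical helper: try to seek back to the start of a stream.

    Returns True if the stream was rewound, False if it does not
    support seeking (for example a pipe).
    """
    try:
        stream.seek(0)
        return True
    except (IOError, OSError):
        # On a pipe, seeking fails. Python 2 reports this as
        # "IOError: [Errno 29] Illegal seek"; Python 3 raises an
        # OSError subclass instead.
        return False


# An in-memory file behaves like a regular file and can be rewound.
regular = io.StringIO("line one\n")
regular.read()
rewound_file = rewind_if_possible(regular)

# The read end of a pipe behaves like piped stdin and cannot be rewound.
read_fd, write_fd = os.pipe()
os.write(write_fd, b"line one\n")
os.close(write_fd)
with os.fdopen(read_fd, "r") as pipe_reader:
    rewound_pipe = rewind_if_possible(pipe_reader)
    # The data is still readable after the failed seek.
    first_line = pipe_reader.readline()

print(rewound_file, rewound_pipe, repr(first_line))
```

The point of the sketch is that any code path which rewinds its input has to treat piped input as a special case, which matches the difference seen above between passing a file name and passing "-".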