Mooash
(James)
September 6, 2013, 12:49am
#1
Hi all,
I’m currently using the LogFormat below for my Apache logs:
LogFormat "%v %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" piwik
I know I can add %D to the end to get the page generation time, like so:
LogFormat "%v %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %D" piwik
But how do I then write the regex so that my Piwik import script imports it correctly? I know I can pass it through --log-format-regex="blah", but I’ve got no idea what the regex should look like. Can anyone help?
Cheers,
Mooash
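For the Apache format with %D appended, one possible regex is sketched below. It reuses the group names that appear in the reply further down this thread and adds a trailing generation_time_micro group, since %D is reported in microseconds and import_logs.py reads a group of that name. The sample log line and the APACHE_WITH_D name are made up for illustration, so treat this as a starting point rather than a tested configuration.

```python
import re

# Assumed regex for:
#   %v %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i" %D
# The last group is an assumption: %D is microseconds, so it is captured
# as generation_time_micro rather than generation_time_milli.
APACHE_WITH_D = (
    r'(?P<host>[\w\-\.]*)(?::\d+)? (?P<ip>\S+) \S+ \S+ '
    r'\[(?P<date>.*?) (?P<timezone>.*?)\] '
    r'"\S+ (?P<path>.*?) \S+" (?P<status>\S+) (?P<length>\S+) '
    r'"(?P<referrer>.*?)" "(?P<user_agent>.*?)" '
    r'(?P<generation_time_micro>\d+)'
)

# Hypothetical log line, invented for this example.
sample = (
    'example.org 192.0.2.10 - - [06/Sep/2013:00:49:00 +0000] '
    '"GET /index.html HTTP/1.1" 200 2464 "-" "Mozilla/5.0" 100250'
)

match = re.match(APACHE_WITH_D, sample)
fields = match.groupdict() if match else {}
print(fields.get('generation_time_micro'))
```

If the pattern matches your own log lines, the same string (wrapped in single quotes for the shell) can be passed to --log-format-regex.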
Ademaro
November 3, 2013, 9:28pm
#2
I have this nginx log configuration:
log_format piwik '$host $remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" $request_time';
The log output looks roughly like this:
my_tracked_domain.org 178.122.228.76 - - [03/Nov/2013:06:22:41 +0400] "GET /backend/service/index HTTP/1.1" 200 2464 "http://my_tracked_domain.org/backend/news/index" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36" 0.100
I first ran the following command:
python /var/www/s.rent-shop.org/misc/log-analytics/import_logs.py --url=http://s.rent-shop.org/ /var/www/s.rent-shop.org/tmp/logs/test777-31-01nov.log --recorders=4 --enable-http-errors --enable-http-redirects --enable-static --enable-bots --log-format-regex='(?P<host>[\w\-\.]*)(?::\d+)? (?P<ip>\S+) \S+ \S+ \[(?P<date>.*?) (?P<timezone>.*?)\] "\S+ (?P<path>.*?) \S+" (?P<status>\S+) (?P<length>\S+) "(?P<referrer>.*?)" "(?P<user_agent>.*?)" (?P<generation_time_milli>\S+)'
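Before running a long import, the regex can be checked against a single log line with Python's re module. The sketch below uses the regex from the command and the sample nginx line from this post; the NGINX_REGEX name is only for illustration. Note that the captured generation time comes back as the string "0.100", a value in seconds with a fractional part.

```python
import re

# Same pattern as passed to --log-format-regex, split across lines for readability.
NGINX_REGEX = (
    r'(?P<host>[\w\-\.]*)(?::\d+)? (?P<ip>\S+) \S+ \S+ '
    r'\[(?P<date>.*?) (?P<timezone>.*?)\] '
    r'"\S+ (?P<path>.*?) \S+" (?P<status>\S+) (?P<length>\S+) '
    r'"(?P<referrer>.*?)" "(?P<user_agent>.*?)" '
    r'(?P<generation_time_milli>\S+)'
)

# Sample line produced by the nginx log_format shown in this post.
line = (
    'my_tracked_domain.org 178.122.228.76 - - [03/Nov/2013:06:22:41 +0400] '
    '"GET /backend/service/index HTTP/1.1" 200 2464 '
    '"http://my_tracked_domain.org/backend/news/index" '
    '"Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) '
    'Chrome/30.0.1599.101 Safari/537.36" 0.100'
)

match = re.match(NGINX_REGEX, line)
fields = match.groupdict() if match else {}

# Print every captured group so a wrong or empty field is easy to spot.
for name in sorted(fields):
    print('%s = %r' % (name, fields[name]))
```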
But there was an error and I had to make a patch:
$ git diff misc/log-analytics/import_logs.py
diff --git a/misc/log-analytics/import_logs.py b/misc/log-analytics/import_logs.py
index 64f3738..8acf924 100755
--- a/misc/log-analytics/import_logs.py
+++ b/misc/log-analytics/import_logs.py
@@ -1539,7 +1539,7 @@ class Parser(object):
                 hit.length = 0
 
             try:
-                hit.generation_time_milli = int(format.get('generation_time_milli'))
+                hit.generation_time_milli = int(float(format.get('generation_time_milli')) * 1000)
             except BaseFormatException:
                 try:
                     hit.generation_time_milli = int(format.get('generation_time_micro')) / 1000
Now the import works fine for me (just slowly)…
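The reason the patch is needed can be reproduced in isolation: nginx writes $request_time in seconds with a fractional part (0.100 in the sample line), and int() cannot parse such a string directly. The sketch below uses a hypothetical helper, seconds_to_milli, which is not part of import_logs.py. It rounds instead of truncating, which is a small deviation from the patch above.

```python
def seconds_to_milli(raw):
    """Convert a $request_time string (seconds) to whole milliseconds.

    Hypothetical helper for illustration; import_logs.py has no such function.
    """
    return int(round(float(raw) * 1000))


# int() on the raw captured value fails, which matches the need for the patch.
try:
    int('0.100')
except ValueError as exc:
    print('int() rejects the value: %s' % exc)

print(seconds_to_milli('0.100'))
```

Rounding is used here because plain int() truncates, so a product that lands just below a whole number because of floating-point representation would lose a millisecond. For page generation times that difference is unlikely to matter much.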