Hi,
I am trying to set up log analytics for our IIS logs as well.
For historical reasons we have these fields in our logs:
#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Referer) cs-host sc-status sc-substatus sc-win32-status sc-bytes cs-bytes time-taken
As a result, import_logs.py does not recognize the log format, and with --log-format-name=iis I get this error: AttributeError: 'IisFormat' object has no attribute 'regex'.
I added this to import_logs.py:
_OWN_IIS_LOG_FORMAT = (
    r'(?P<date>^[\d+]+[-\d+]+[\s]+[:\d+]+) \S+ \S+ (?P<path>\S*) (?P<query_string>\S*) \S+ \S+ (?P<ip>([\d.]|[\S:])*) (?P<user_agent>\S+) (?P<referrer>\S+) (?P<host>\S+) (?P<status>\d+) \d+ \d+ (?P<length>\S+) \S+ \d+'
)
and added this line to FORMATS:
'iis2': RegexFormat('iis2', _OWN_IIS_LOG_FORMAT, '%Y-%m-%d %H:%M:%S'),
Then I run the import with --log-format-name=iis2. Unfortunately I don't know how to pass this regex through the --log-format-regex parameter instead.
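The pattern can at least be checked on its own, outside import_logs.py, before it is used with either option. The following is only a minimal sketch: the sample log line is made up to match the #Fields header above, and parse_line is a hypothetical helper, not something from import_logs.py.

```python
import re
from datetime import datetime

# Same pattern as _OWN_IIS_LOG_FORMAT above, written as a raw string.
OWN_IIS_LOG_FORMAT = (
    r'(?P<date>^[\d+]+[-\d+]+[\s]+[:\d+]+) \S+ \S+ (?P<path>\S*) (?P<query_string>\S*) '
    r'\S+ \S+ (?P<ip>([\d.]|[\S:])*) (?P<user_agent>\S+) (?P<referrer>\S+) (?P<host>\S+) '
    r'(?P<status>\d+) \d+ \d+ (?P<length>\S+) \S+ \d+'
)

# Made-up line that follows the field order of the #Fields header (assumption).
SAMPLE_LINE = (
    '2013-05-14 08:15:30 192.0.2.10 GET /index.php - 80 - 203.0.113.7 '
    'Mozilla/5.0+(Windows+NT+6.1) http://www.example.com/ www.example.com '
    '200 0 0 5120 312 46'
)


def parse_line(line):
    """Hypothetical helper: return the named groups as a dict, or None."""
    match = re.match(OWN_IIS_LOG_FORMAT, line)
    if match is None:
        return None
    fields = match.groupdict()
    # Same date format string as in the FORMATS entry above.
    fields['date'] = datetime.strptime(fields['date'], '%Y-%m-%d %H:%M:%S')
    return fields


fields = parse_line(SAMPLE_LINE)
print(fields['path'], fields['query_string'], fields['ip'], fields['status'])
```

If a real line from the log returns None here, the regex is the problem and not the way it is passed to the script.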
The other problem is that IIS writes a - (dash) into the "cs-uri-query" field when a page is requested without a query string. The page then shows up in Piwik as index.php?- , so I had to change this:
if hit.query_string and not config.options.strip_query_string:
    path += config.options.query_string_delimiter + hit.query_string
to this:
if hit.query_string and not config.options.strip_query_string and hit.query_string != "-":
    path += config.options.query_string_delimiter + hit.query_string
Another way would be to change the regex so that a lone dash is not captured as the query string.
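A rough sketch of that regex variant, reduced to the cs-uri-stem and cs-uri-query fields only. PATH_AND_QUERY and build_path are hypothetical names for illustration. With this pattern the query_string group is left unset (None) for a lone dash, and I have not checked whether import_logs.py copes with an unset group, so that part is an assumption.

```python
import re

# Simplified, hypothetical pattern covering only cs-uri-stem and cs-uri-query.
# The alternation consumes a lone "-" without capturing it, so the
# query_string group stays unset (None) when there is no query string.
PATH_AND_QUERY = re.compile(r'(?P<path>\S*) (?:-|(?P<query_string>\S+))$')


def build_path(fragment, delimiter='?'):
    """Hypothetical helper: join path and query string, skipping the dash."""
    match = PATH_AND_QUERY.match(fragment)
    if match is None:
        raise ValueError('fragment does not match: %r' % fragment)
    # An unset group comes back as None; treat it like an empty query string.
    query_string = match.group('query_string') or ''
    path = match.group('path')
    if query_string:
        path += delimiter + query_string
    return path


print(build_path('/index.php -'))
print(build_path('/index.php id=5'))
```

A query string that merely starts with a dash is still captured, because the lone-dash branch only matches when the dash is the whole field.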
I am not a Python developer, and I worry that my changes will be lost or break when Piwik is updated …
Jiri