Piwik Log Analysis


#1

Hi,

We have started testing the Piwik log analysis feature launched in version 1.8.2 of Piwik. We have tried some log analysis and, though we managed to get data, we face the following problems:

  1. Log analysis did not work with the W3SVC format of IIS logs; we had to convert them to the .log.ncsa format. Isn’t log analysis supported with the W3SVC format?
  2. Can we do log processing with load-balanced servers?
  3. Is reprocessing possible from a certain date if logs are available?
  4. Can we create custom reports like “visitor vs browser” or “visitor vs country”?
  5. Can we grant or remove access for new users? I mean, is there user management available?
  6. Can the available reports be scheduled automatically?

Please let us know the answers to the above queries. Also, is there any help documentation on the log analysis feature of Piwik?


#2

Have you seen the information at Log Analytics - Analytics Platform - Matomo? Many of your questions are already answered there. Other than that, you can look at the code directly; it’s a Python script that’s quite easy to understand.


#3

Hi aseques,

Consider me a noob, new to all this. Sorry to pester you, but I would really appreciate it if you could answer my questions here :). I did not find the documentation that descriptive.


#4

OK, let me answer some of the points (I don’t know about the others).

  1. Log analysis did not work with the W3SVC format of IIS logs; we had to convert them to the .log.ncsa format. Isn’t log analysis supported with the W3SVC format?
    The W3C format should be supported; see this for the format: (http://www.microsoft.com/technet/prodtechnol/WindowsServer2003/Library/IIS/bea506fd-38bc-4850-a4fb-e3a0379d321f.mspx?mfr=true). More info in the code quoted at the end.

  2. Can we do log processing with load-balanced servers?
    The whole database import goes through the Piwik API, and that is where the bottleneck is. If your Piwik install is already load balanced, you have the hard part done. (New to Piwik - Analytics Platform - Matomo)

  3. Is reprocessing possible from a certain date if logs are available?
    Not at the moment, but it is one of the pending issues; either someone steps in to implement it or you wait for the solution.

  4. Can we create custom reports like “visitor vs browser” or “visitor vs country”?
    That’s a Piwik task, not a log importer one, but it can be done via the API.

  5. Can we grant or remove access for new users? I mean, is there user management available?
    You will have to do that via the API too; it allows you to create, delete, and modify users.

  6. Can the available reports be scheduled automatically?
    I don’t know about this one.
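
For points 4 and 5, here is a rough sketch of what "via the API" can look like: Piwik API calls are plain HTTP requests to index.php with module=API and a method name. The host name, token, and user login below are placeholders, and the helper function `piwik_api_url` is made up for illustration. `UserCountry.getCountry` and `UsersManager.setUserAccess` are API methods as far as I know, but check the API reference of your Piwik version (the browser report lives under a different method name depending on the version).

```python
from urllib.parse import urlencode


def piwik_api_url(base_url, method, token_auth, **params):
    """Build a Piwik HTTP API URL (hypothetical helper, not part of Piwik)."""
    query = {
        'module': 'API',
        'method': method,
        'format': 'JSON',
        'token_auth': token_auth,
    }
    query.update(params)
    return base_url.rstrip('/') + '/index.php?' + urlencode(query)


BASE = 'http://piwik.example.com'  # placeholder host
TOKEN = 'YOUR_TOKEN_AUTH'          # placeholder token

# Point 4: visits broken down by country for the current month.
country_report = piwik_api_url(
    BASE, 'UserCountry.getCountry', TOKEN,
    idSite=1, period='month', date='today')

# Point 5: give an existing user "view" access to site 1.
grant_view = piwik_api_url(
    BASE, 'UsersManager.setUserAccess', TOKEN,
    userLogin='newuser', access='view', idSites=1)

print(country_report)
print(grant_view)
```

Fetching either URL (with a browser, curl, or urllib.request) performs the call; the sketch only builds the URLs so it can be tried without a running Piwik.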

A typical workflow can be followed here: Log Analytics - Analytics Platform - Matomo
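
As a rough illustration of that workflow, the sketch below assembles the command line for the import script. The script path, Piwik URL, site id, and log file names are placeholders; `--url` and `--idsite` are options of import_logs.py, and the script accepts several log files in one run, which is one possible way to feed it the logs from each load-balanced web server.

```python
import shlex


def import_logs_command(script_path, piwik_url, id_site, log_files):
    """Return the argv list for one import run (hypothetical helper)."""
    cmd = ['python', script_path,
           '--url=' + piwik_url,
           '--idsite=' + str(id_site)]
    cmd.extend(log_files)
    return cmd


cmd = import_logs_command(
    '/path/to/piwik/misc/log-analytics/import_logs.py',  # placeholder path
    'http://piwik.example.com',                          # placeholder URL
    1,
    ['web1/ex120815.log', 'web2/ex120815.log'])          # placeholder logs

# Print the command as it would be typed in a shell.
print(' '.join(shlex.quote(part) for part in cmd))

# To actually run it:
# import subprocess
# subprocess.check_call(cmd)
```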

This is a code snippet for the W3C format:

class IisFormat(object):

    def check_format(self, file):
        line = file.readline()
        if not line.startswith('#Software: Microsoft Internet Information Services '):
            file.seek(0)
            return
        # Skip the next 2 header lines.
        for i in range(2):
            file.readline()
        # Parse the 4th line (the '#Fields: ' header) to build the regex.
        full_regex = []
        line = file.readline()
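        # Raw strings keep the regex backslashes intact.
        # 'date' opens the (?P<date>...) group and 'time' closes it, so the
        # two fields must appear next to each other in the log header.
        fields = {
            'date': r'(?P<date>^\d+[-\d+]+',
            'time': r'[\d+:]+)',
            'cs-uri-stem': r'(?P<path>/\S*)',
            'cs-uri-query': r'(?P<query_string>\S*)',
            'c-ip': r'(?P<ip>[\d*.]*)',
            'cs(User-Agent)': r'(?P<user_agent>\S+)',
            'cs(Referer)': r'(?P<referrer>\S+)',
            'sc-status': r'(?P<status>\d+)',
            'sc-bytes': r'(?P<length>\S+)',
            'cs-host': r'(?P<host>\S+)',
        }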
        # Skip the '#Fields: ' prefix (9 characters).
        line = line[9:]
        for field in line.split():
            try:
                regex = fields[field]
            except KeyError:
                # Unknown field: match it but do not capture it.
                regex = r'\S+'
            full_regex.append(regex)
        return RegexFormat('iis', ' '.join(full_regex), '%Y-%m-%d %H:%M:%S')
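
To see what that regex ends up matching, here is a small standalone sketch (Python 3) that rebuilds the same field-to-regex mapping and applies it to one log line. The `#Fields:` header and the log line are made-up samples, and `build_iis_regex` is a helper written for this example, not a function from the import script.

```python
import re
from datetime import datetime

# Same field-to-regex mapping as in the snippet above.
IIS_FIELDS = {
    'date': r'(?P<date>^\d+[-\d+]+',
    'time': r'[\d+:]+)',
    'cs-uri-stem': r'(?P<path>/\S*)',
    'cs-uri-query': r'(?P<query_string>\S*)',
    'c-ip': r'(?P<ip>[\d*.]*)',
    'cs(User-Agent)': r'(?P<user_agent>\S+)',
    'cs(Referer)': r'(?P<referrer>\S+)',
    'sc-status': r'(?P<status>\d+)',
    'sc-bytes': r'(?P<length>\S+)',
    'cs-host': r'(?P<host>\S+)',
}


def build_iis_regex(fields_line):
    """Turn a '#Fields: ...' header line into one space-separated regex."""
    names = fields_line[len('#Fields: '):].split()
    # Unknown fields are matched with \S+ but not captured.
    return ' '.join(IIS_FIELDS.get(name, r'\S+') for name in names)


# Made-up sample header and log line.
fields_line = ('#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query '
               's-port cs-username c-ip cs(User-Agent) sc-status')
log_line = ('2012-08-15 17:00:00 10.10.28.140 GET /index.html - 80 - '
            '10.10.28.1 Mozilla/5.0 200')

pattern = re.compile(build_iis_regex(fields_line))
match = pattern.match(log_line)

# The date group spans both the date and time columns.
timestamp = datetime.strptime(match.group('date'), '%Y-%m-%d %H:%M:%S')
print(match.group('path'), match.group('ip'), match.group('status'), timestamp)
```

Note that the 'date' fragment opens the capture group and the 'time' fragment closes it, so a header that lists date without time right after it would produce an invalid regex.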