New user - confused about Log Analytics

mdegalli · May 16, 2017, 1:05am

I’ve installed Piwik and have it up and running on my Linux server, and I’ve imported the log files per the documentation. I am confused about how Piwik will now track users in real time. The docs don’t seem to mention this.

The docs mention “archiving” and setting up a cron job to do that … but what does “archiving” do? Does it pull in the new data? How does Piwik get new data and keep the reporting up to date?

Thanks for any answers or pointers to the documentation!

-MD

SteveG · May 16, 2017, 10:15am

Hi there,

If you choose Log Analytics instead of tracking users with tracking pixels, you need to set up a cron job that automatically imports your latest log files.

Because those log files will always be somewhat out of date, real-time analytics won’t be possible.

mdegalli · May 16, 2017, 2:42pm

Thanks for the response!

I understand it won’t be perfectly real-time, but how would I import the logs daily without producing duplicates?

What does archiving do? The docs suggest running a cron job hourly … but if nothing is changing (unless logs are imported at least hourly), what is the point of archiving?

Thanks again!
-MD

SteveG · May 16, 2017, 3:31pm

Archiving builds the reports from the raw tracked data. With log import, running it hourly doesn’t make sense, because no new data arrives between imports.
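
For log import, a rough sketch of a cron entry that runs archiving once a day after the import, rather than hourly. The file name, the run time, the "apache" user, the PHP path, and the placeholder paths and URL are all assumptions to adapt to your setup; core:archive is the console command Piwik provides for archiving.

```
# /etc/cron.d/piwik-archive (hypothetical file name)
# Run archiving once a day at 05:00, assuming the daily log import has finished by then.
# The user "apache", /usr/bin/php, /path/to/piwik and the URL are placeholders.
0 5 * * * apache /usr/bin/php /path/to/piwik/console core:archive --url=https://example.org/piwik/ > /dev/null
```

If the import runs at a different time on your server, move the archiving run so that it follows it.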

To avoid duplicates, you need something like automatic log rotation. Each time the log has been rotated, you can run the log importer. That ensures each entry is processed only once.

mdegalli · May 16, 2017, 9:38pm

So, here’s what I did … I’m using CentOS / Apache … in case anyone else can benefit or improve on my method:

In my /etc/logrotate.d directory, I edited the httpd config file as follows:

/var/log/httpd/*log {
    missingok
    notifempty
    sharedscripts
    delaycompress
    postrotate
        /sbin/service httpd reload > /dev/null 2>/dev/null || true
    endscript
    # Import the current access log once, before any file is rotated
    firstaction
        python /var/www/[my domain here]/piwik/misc/log-analytics/import_logs.py --url=https://[my domain here]/piwik --idsite=1 --recorders=4 --enable-http-errors --enable-http-redirects --enable-static --enable-bots /etc/httpd/logs/access_log
    endscript
}

I will rotate the logs via /etc/cron.daily. This should be good enough for now. If anyone has a better way, let me know!
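
One thing to double-check: as far as I understand it, the job in /etc/cron.daily only runs logrotate once a day, and how often a file actually gets rotated is set by a frequency directive (the global default in /etc/logrotate.conf, which I am assuming is weekly on a stock CentOS install). A minimal sketch of the same stanza with a daily directive added, everything else unchanged:

```
/var/log/httpd/*log {
    # Rotate every day instead of following the global default (assumed weekly)
    daily
    missingok
    notifempty
    sharedscripts
    delaycompress
    # postrotate and firstaction blocks exactly as in the config above
}
```

Running logrotate -d /etc/logrotate.d/httpd does a dry run and prints what would be rotated, which should confirm whether the directive is picked up.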

Thanks,
-MD