Importing remote logs to Piwik

I want to confirm the correct way to import remote logs into Piwik.

As I understand it, Piwik uses a Python script to parse the logs and import them into Piwik. Am I correct to assume that I can run this script on the hosting servers and import their logs into a Piwik instance on another server?

This isn’t specifically mentioned in the instructions/FAQs, so I wanted to confirm before I begin testing. Since Piwik’s base URL is a required parameter, I assume importing remote logs should be OK.

Thanks!

Hi,

I don’t have much experience with log imports, but I think what you are trying to do will work.
You can find more information here:

You must specify your Piwik URL with the --url argument. The script will automatically read your config.inc.php file to get the authentication token and communicate with your Piwik install to import the lines. The default mode will try to mimic the Javascript tracker as much as possible, and will not track bots, static files, or error requests.

But I think you’ll need to use the token_auth parameter, since the script cannot read config.inc.php from a remote Piwik server. It doesn’t seem to be well documented, but you can find information about it in the related GitHub issue and in the output of the --help parameter.
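
For what it’s worth, here is a minimal sketch of what a remote import call could look like when wrapped in Python. The --url and --token-auth options are the ones discussed above; the script path, URL, token, site ID, and log path are placeholders I made up, and --idsite is only needed if you want to force a specific site. Please check --help on your version before relying on it.

```python
import subprocess


def build_import_command(script_path, piwik_url, token_auth, log_file,
                         site_id=None, interpreter="python"):
    """Build the argument list for running import_logs.py against a remote Piwik.

    All values passed in are supplied by the caller; nothing is read from
    config.inc.php, which is why the token is given explicitly.
    """
    cmd = [
        interpreter,
        script_path,
        "--url=" + piwik_url,
        "--token-auth=" + token_auth,
    ]
    if site_id is not None:
        # Optional: force all imported hits into one site ID.
        cmd.append("--idsite=" + str(site_id))
    cmd.append(log_file)
    return cmd


def run_import(cmd):
    """Run the importer and raise CalledProcessError if it exits non-zero."""
    subprocess.check_call(cmd)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    command = build_import_command(
        "misc/log-analytics/import_logs.py",
        "https://piwik.example.com",
        "YOUR_TOKEN_AUTH",
        "/var/log/apache2/access.log.1",
        site_id=1,
    )
    print(" ".join(command))
```

Building the command as a list (rather than one shell string) avoids quoting problems if the log path or token contains unusual characters.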

Thanks Lucas. This is very helpful. I’ll run some tests and hopefully get it working. I’ll try to contribute to the instructions on GitHub once I get it to work.

I use Piwik log-analytics with my Apache web server.

Here are a few things I considered that may be useful to you:

Where to put the Piwik server?
I could put it on the same server as the web server. That would be more secure, because log importing would happen internally and there would be no need to open Piwik endpoints to the net. However, my web server has limited IOPS and memory, and Piwik can put extra load on it, since some of its MySQL queries are rather heavy.

So I decided to host Piwik on a separate server.
What, then, is the proper way to process logs when the web server and Piwik are on different hosts?
My first thought was to run a cron script on the web server that processes the local log file and sends the data to the remote Piwik. The problem is that if anything fails (connection lost, Piwik down, etc.), that day’s data is lost and has to be recovered manually. I didn’t find a good retry mechanism for Piwik log file processing.

So, my current solution:

  1. The Piwik server has a nightly cron job that MOVES (not copies) the log files (all except the live one) via rsync from the web server to a “processing” folder on the Piwik server. If anything fails one day, two or more files will be moved the next day.
  2. The cron job then processes (imports) all logs in the “processing” folder and moves them into a “backup” folder. That processing is done locally, so it’s secure and reliable.
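
The two steps above can be sketched roughly like this in Python. This is not my actual cron script, just an illustration under some assumptions: the folder names, the remote source pattern, and the `import_one` callback (which would wrap the real import_logs.py call) are all hypothetical. The rsync option --remove-source-files is what gives the “move, not copy” behaviour.

```python
import shutil
from pathlib import Path


def rsync_move_command(remote_source, processing_dir):
    """Build an rsync command that moves files instead of copying them.

    --remove-source-files deletes each source file only after it has been
    transferred successfully, so a failed night leaves the files in place.
    """
    return ["rsync", "-a", "--remove-source-files",
            remote_source, str(processing_dir) + "/"]


def process_pending_logs(processing_dir, backup_dir, import_one):
    """Import every file in processing_dir, moving each to backup_dir on success.

    import_one is a caller-supplied function (hypothetical here) that takes a
    Path and returns True if the import succeeded. On the first failure we
    stop, so the failed file and any later ones stay in processing_dir and
    are retried, in order, on the next run.
    """
    processing_dir = Path(processing_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    imported = []
    pending = sorted(p for p in processing_dir.iterdir() if p.is_file())
    for log_file in pending:
        if not import_one(log_file):
            break
        shutil.move(str(log_file), str(backup_dir / log_file.name))
        imported.append(log_file.name)
    return imported
```

A nightly job would first run the rsync command (for example with `subprocess.check_call`), using a source pattern that matches only rotated logs such as `access.log.*` so the live log is left alone, and then call `process_pending_logs`. Replaying a bad day is just a matter of moving the file from “backup” back into “processing”.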

I had an issue where the import job crashed on some days:

But with this scheme it was easy for me to replay the logs.