Bulk Log import with record_statistics=0 not working


#1

Hi - we have been trying to set up Piwik 2.2.1 as described here: How do I configure Piwik Tracking for high reliability? - Analytics Platform - Matomo

Our setup uses nginx, and we have a script that parses the log files and replays the relevant GET requests for piwik.php using import_logs.py with --replay-tracking.

The problem is that if record_statistics is set to 0, as suggested in the documentation, import_logs.py does not write anything to the database. With record_statistics=1, the log import works just fine.

This can also be seen if we try something like:


curl -i -X POST -d '{"requests":["?idsite=2&url=http://example.org&action_name=Test bulk log Pageview&rec=1","?idsite=2&url=http://example.net/test123.htm&action_name=Another bulk page view&rec=1"]}' http://xx.xxx.xxx.xxx/piwik.php

and we get a response of


<pre>DEBUG Piwik\Common[2014-05-07 22:00:30] [0231e] The request is invalid: empty request, or maybe tracking is disabled in the config.ini.php via record_statistics=0</pre>

Once we set record_statistics = 1 it all works fine.
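For reference, the same bulk request can be sketched in Python. This is only a minimal sketch of the payload shape used in the curl call above: build_bulk_payload and post_bulk are hypothetical helper names (not part of Piwik or import_logs.py), the tracker URL is whatever your piwik.php endpoint is, and the Content-Type header is an assumption. Unlike the curl example, urlencode percent-encodes the parameter values.

```python
import json
import urllib.request
from urllib.parse import urlencode


def build_bulk_payload(hits):
    """Return the JSON body for a bulk tracking POST to piwik.php.

    Each hit is a dict of tracking parameters. It is serialized as a
    query string with a leading "?", as in the curl example.
    """
    requests = ["?" + urlencode(hit) for hit in hits]
    return json.dumps({"requests": requests})


def post_bulk(tracker_url, payload):
    """POST the payload and return the response body as text.

    Hypothetical helper; the Content-Type header is an assumption.
    """
    req = urllib.request.Request(
        tracker_url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8", errors="replace")


hits = [
    {"idsite": 2, "url": "http://example.org",
     "action_name": "Test bulk log Pageview", "rec": 1},
    {"idsite": 2, "url": "http://example.net/test123.htm",
     "action_name": "Another bulk page view", "rec": 1},
]
payload = build_bulk_payload(hits)
print(payload)
```

Calling post_bulk with the piwik.php URL and this payload against a node with record_statistics=0 should return the same "request is invalid" debug message shown above.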

Now, the question is: is this a bug in 2.2.1, or is this expected behaviour, in which case the documentation should be updated to reflect it?

Thanks!


(Matthieu Aubry) #2

Thank you for the report. That’s a documentation bug indeed. I’ve updated the doc at: How do I configure Piwik Tracking for high reliability? - Analytics Platform - Matomo

[quote=doc]

On the Piwik tracking server(s), disable synchronous tracking in the config file: in config.ini.php, below the [Tracker] section, set: record_statistics=0

[…]

To import the log file, in the Piwik server running the import, edit config.ini.php, and below the [Tracker] section, set:

record_statistics=1
[/quote]

Is it clearer now?
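To check which role a given node is configured for, something like the following could be run against its config.ini.php. This is a rough sketch, not an official tool: tracking_enabled is a hypothetical helper, the sample config strings are illustrative, and treating a missing record_statistics key as "enabled" is an assumption about Piwik's default.

```python
import configparser


def tracking_enabled(config_text):
    """Return True unless [Tracker] record_statistics is set to 0."""
    # strict=False tolerates repeated keys; interpolation=None keeps
    # any "%" characters in values from being interpreted.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(config_text)
    # Assumption: if the key (or section) is absent, tracking is on.
    value = parser.get("Tracker", "record_statistics", fallback="1")
    return value.strip().strip('"') != "0"


# Illustrative fragments only; a real config.ini.php contains more.
TRACKING_NODE = "[Tracker]\nrecord_statistics = 0\n"
IMPORT_NODE = "[Tracker]\nrecord_statistics = 1\n"

print(tracking_enabled(TRACKING_NODE))
print(tracking_enabled(IMPORT_NODE))
```

The node that runs import_logs.py with --replay-tracking is the one where this needs to report that tracking is enabled.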


#3

Great - thanks for the speedy response and update to documentation!

Am I right in assuming that this setup implies multiple Piwik nodes, of which only one actually has record_statistics enabled, and that this host is not hit by the tracker on the sites?

If we use this setup on just one Piwik node, it means that the moment we switch on record_statistics we would not only be dealing with the increased load of importing and archiving, but would also be receiving statistics in real time from the sites that we track.