Is it possible to run import_logs.py on a different machine and import the results into Piwik? I have a couple of years' worth of logs I want to parse.
Of course. Here is my crontab example:
2 1-23/4 * * * /mnt/import_logs.py --url=http://piwik_server_ip/piwik/ --idsite=1 --recorders=2 --token-auth=my_token --enable-static --enable-bots --enable-http-errors --enable-http-redirects /var/log/apache2/access.log > /dev/null
It parses the Apache log every 4 hours.
--token-auth: you can find your token under the “API” link.
--idsite: the website’s ID, as listed under Websites Management.
I use scp to copy import_logs.py from the Piwik server to the target server. Make sure you have Python 2 installed there.
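For anyone unsure when the crontab line above actually fires, here is a minimal sketch that expands the minute field `2` and the hour field `1-23/4`. The helper `expand_cron_field` is a hypothetical name written for this post, not part of cron or Piwik, and it only handles the simple list, range, and step forms used here.

```python
def expand_cron_field(field, lo, hi):
    """Expand one cron field (e.g. '1-23/4') into a sorted list of integers.

    Hypothetical helper: supports '*', 'a', 'a-b', comma lists, and '/step'.
    """
    values = set()
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_text = part.split("/")
            step = int(step_text)
        if part == "*":
            start, end = lo, hi
        elif "-" in part:
            start_text, end_text = part.split("-")
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        values.update(range(start, end + 1, step))
    return sorted(values)


# Fields taken from the crontab entry: minute "2", hour "1-23/4".
minutes = expand_cron_field("2", 0, 59)
hours = expand_cron_field("1-23/4", 0, 23)
run_times = ["%02d:%02d" % (h, m) for h in hours for m in minutes]
print(run_times)
```

So the job starts at two minutes past the hour, six times a day, beginning at 01:02.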
Thanks, that did the trick. I didn’t know about the --token-auth= option. Unfortunately, I’m very disappointed in how slow the script seems to be. The script I ran is:
python import_logs.py --url=https://www.example.com/piwik access.log --token-auth=my_token --idsite=site_id --enable-reverse-dns --recorders=2 --exclude-path=/w/* --exclude-path=/blog/* --exclude-path=/piwik/* --exclude-path=/forums/*
I initially ran with the --dry-run option and got an average of 700 records/sec. The script is currently running at 9 records/sec, and going down. Neither machine has any other load to speak of right now. I’m guessing this is because of --enable-reverse-dns? I used that option because I wasn’t sure if the MaxMind database information would be applied to the imported log data.
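To put those two rates in perspective, here is a rough back-of-the-envelope sketch. The rates (700 and 9 records/sec) come from the runs above; the total of 1,000,000 log lines is only a placeholder assumption, since the real line count is not given. Note that a dry run sends nothing to Piwik, so the gap covers both the tracking requests and the reverse DNS lookups, and these numbers alone do not say how much each contributes.

```python
def ms_per_record(records_per_sec):
    """Average wall-clock time spent on one record, in milliseconds."""
    return 1000.0 / records_per_sec


def estimated_hours(total_lines, records_per_sec):
    """Hours needed to import total_lines at a constant rate."""
    return total_lines / float(records_per_sec) / 3600.0


# Placeholder assumption: the real number of log lines is unknown.
HYPOTHETICAL_LINES = 1000000

for label, rate in (("dry run", 700), ("real import", 9)):
    print("%s: %.1f ms per record, %.1f hours for %d lines" % (
        label,
        ms_per_record(rate),
        estimated_hours(HYPOTHETICAL_LINES, rate),
        HYPOTHETICAL_LINES,
    ))
```

At 9 records/sec each record costs roughly 110 ms, which is in the range of a network round trip rather than log parsing.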
Remove --enable-reverse-dns and use Piwik 2.1 for better speed.