Clarity on Import Log Analytics Python FAQ

Hi Matomo friends, I was reading the Log Analytics FAQ ( https://matomo.org/faq/log-analytics-tool/ ) and need some clarification. Suppose I have a centralized Matomo instance on its own servers and a client website (hosted elsewhere) that wants to use Log Analytics to track activity instead of the generated JS block. It isn’t clear to me from the FAQ which requirements apply to the client system and which apply to my central Matomo instance. Please confirm I have this right:

Requirements for central Matomo instance:

  • Matomo instance up and running
  • Unique website profile with siteID=X
  • Python ?

Requirements for remote client website:

  • Full Matomo software distribution installed (with PHP?)
    OR just the matomo-log-analytics package found here: https://github.com/matomo-org/matomo-log-analytics ?
  • Python (version to match the Matomo base version)
  • Execution of the “import_logs.py” script via cron or another scheduler
  • Execution of the script to include the identity of the central Matomo: “--url” parameter
  • Execution of the script to include the unique siteID prebuilt on the central Matomo instance: “--idsite” parameter
  • Execution of the script to include authentication for the central Matomo instance: “--token-auth” parameter

Thanks for your help!

Hi,

I think you understood it correctly.
The server that hosts Matomo needs nothing apart from the normal Matomo setup (no Python).

The server(s) holding the log files don’t need any Matomo setup or PHP at all, just Python and the log analytics Python script (you can also get the script from https://github.com/matomo-org/matomo-log-analytics/).

The only difference is that you need to specify the --token-auth parameter. When the Python script runs on the Matomo server itself, it can fetch the token automatically.
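
For anyone wiring this up, here is a minimal sketch of what the call from a remote log server might look like. It only assembles and prints the command so you can check it before running it. The checkout path, URL, site ID and token below are placeholders I made up for illustration, not values from a real setup.

```shell
#!/bin/sh
# Sketch: meant for the remote log server; Matomo itself lives elsewhere.
# Every value below is a placeholder (assumption), adjust to your environment.
MATOMO_URL="https://EXAMPLE.com/matomo"               # central Matomo instance
ID_SITE="14"                                          # site ID created on the central instance
TOKEN_AUTH="REPLACE_WITH_TOKEN"                       # needed because we are not on the Matomo server
IMPORTER="/opt/matomo-log-analytics/import_logs.py"   # hypothetical checkout location
ACCESS_LOG="/var/log/apache2/access.log"

# Assemble the command as a string so it can be inspected first.
IMPORT_CMD="python3 $IMPORTER --url=$MATOMO_URL --idsite=$ID_SITE --token-auth=$TOKEN_AUTH $ACCESS_LOG"

# Print only; nothing is imported by this sketch.
echo "$IMPORT_CMD"
```

Once the printed line looks right, it can be run by hand or placed in a cron job on the log server.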

One day in the future the Python script might be replaced with a PHP script so that Matomo doesn’t depend on Python at all.

Thank you for the response Lukas. If anyone else who has run this setup in their shop could confirm, it would be welcome.
Thanks,

Maybe off topic a bit, but if anyone ends up here looking for a basic example of syntax, I am parking some notes here.

How to import the logs to a test site
I tested these here, just now.

  1. Check which site IDs exist. Use your Matomo interface or look in the db*; either is fine.
    :gear: >> Measurables >> Manage

  2. In the Matomo interface, click :gear: >> Measurables >> Manage >> new measurable. Make a new site called “testing site for me” or whatever.

  3. Run your python log importer. Use the site ID from Step 2. See that I added an option for which dates to import**.

  4. In your Matomo interface look at the reports data for that test site.
    In Visitors >> Overview there should be nothing but zeros, because we haven’t run core:archive to build the reports yet.
    In Visitors >> Visits Log all of the data should show up. If the import worked, the data will be here.

  5. Not needed if you succeeded in Step 4, but if you want to look in the db too, you could run this:
    SELECT COUNT(*) AS hits, idsite, MIN(server_time) as oldest, MAX(server_time) AS newest FROM matomo_log_link_visit_action GROUP BY idsite;
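
Since Step 4 mentions that the Overview reports stay at zero until core:archive has run, here is a rough sketch of that command. It assumes Matomo lives at the same path as in my import example and that the test site ID is 14; both are placeholders. Depending on your setup you may need to run it as the web server user. The sketch only prints the command.

```shell
#!/bin/sh
# Sketch: build the archiving command for the test site.
# Path, URL and site ID are placeholders (assumptions).
MATOMO_DIR="/var/www/EXAMPLE.com/matomo"
MATOMO_URL="https://EXAMPLE.com/matomo"
ID_SITE="14"

# core:archive is run through Matomo's console script in the Matomo root.
ARCHIVE_CMD="php $MATOMO_DIR/console core:archive --url=$MATOMO_URL --force-idsites=$ID_SITE"

# Print only; run the printed line yourself when ready.
echo "$ARCHIVE_CMD"
```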

* The query to see site IDs (it lists only sites that already have tracked actions)
SELECT idsite FROM matomo_log_link_visit_action GROUP BY idsite;
** The Python options I used to run the import in Step 3
sudo python3 /var/www/EXAMPLE.com/matomo/misc/log-analytics/import_logs.py --url=https://EXAMPLE.com/matomo --idsite=14 --recorders=4 --enable-http-errors --enable-http-redirects --debug-tracker --exclude-older-than="2024-10-31 01:01:01 +0500" --token-auth=????????????????93c /var/log/apache2/access.log

Here is that same syntax broken out by line, with an --exclude-newer-than option added:

sudo python3 /var/www/EXAMPLE.com/matomo/misc/log-analytics/import_logs.py \
  --url=https://EXAMPLE.com/matomo \
  --idsite=14 \
  --recorders=4 \
  --enable-http-errors \
  --enable-http-redirects \
  --debug-tracker \
  --exclude-older-than="2024-10-31 00:00:01 +0500" \
  --exclude-newer-than="2024-11-04 00:00:01 +0500" \
  --token-auth=???3c \
  /var/log/apache2/access.log

The +0500 part is the offset of your server time zone from UTC, written as hours and minutes.
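
If you are not sure which offset to put in the date options, the server can tell you. This is a small sketch, assuming a system where `date +%z` prints the offset in +HHMM form (GNU and BSD date both do). Note that it reports the offset in effect right now, which can differ from the offset on the log dates if daylight saving time changed in between. The date itself is just the example value from above.

```shell
#!/bin/sh
# Ask the server for its current UTC offset, e.g. +0500.
UTC_OFFSET="$(date +%z)"

# Combine it with an example cutoff date (placeholder value).
OLDER_THAN="2024-10-31 00:00:01 $UTC_OFFSET"

# Print the option ready to paste into the import command.
echo "--exclude-older-than=\"$OLDER_THAN\""
```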