How is information moved?

How is information moved about within the architecture of PIWIK? I am completely new to it and learning on the fly.

I have two theories and am not sure which one is right, or whether either is.

  1. Say you have three servers: A, B, and C.
  2. Servers A and B both receive hits from an outside source.
  3. A and B send the hits to the third server, C (the MySQL server).
  4. The cron job (not sure whether this is hosted by both A and B or not) grabs the hit information from the MySQL database, transforms it, and sends it off to PIWIK for display.

OR

  1. There are three servers: A, B, and C.
  2. When a hit comes in, the server pushes it to a file (I don't know what this file would be; a tmp file?).
  3. Cron grabs the hit info from the file and sends it to MySQL. One of the servers hosting PIWIK then grabs the information from the MySQL server and displays the graphs.

Neither of these may be how PIWIK actually works at that level, but any help would be awesome. To sum it all up: we have virtual machines running PIWIK and the data is becoming corrupted, and we don’t know how PIWIK fundamentally works. What we would like to set up:

Two web servers, A and B, with PIWIK and Apache installed. Each has config files that point to a third server, the DB, which houses all the information. Again, I'm not sure how it passes the info around if you have multiple servers.

Any help would be appreciated! Feel free to PM me

Piwik stores ALL data in the database. So you have a master DB that has all data and system status.
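To make that flow concrete, here is a minimal sketch of the idea, not Piwik's actual code. It assumes an in-memory SQLite database standing in for the central MySQL server, and the table and function names (`hits`, `archive`, `track_hit`, `run_archiving`, `report`) are hypothetical. It shows web servers writing hits straight into the shared database, a cron-style job aggregating them inside that same database, and any server reading the aggregated rows for display.

```python
import sqlite3

# One shared database stands in for the central MySQL server (hypothetical schema).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE hits (site_id INTEGER, url TEXT, server TEXT)")
db.execute("CREATE TABLE archive (site_id INTEGER PRIMARY KEY, hit_count INTEGER)")


def track_hit(server, site_id, url):
    """A web server (A or B) writes the hit straight into the shared database."""
    db.execute("INSERT INTO hits VALUES (?, ?, ?)", (site_id, url, server))


def run_archiving():
    """The cron job aggregates raw hits into report rows in the same database."""
    db.execute("DELETE FROM archive")
    db.execute(
        "INSERT INTO archive SELECT site_id, COUNT(*) FROM hits GROUP BY site_id"
    )


def report(site_id):
    """Any web server can render a report by reading the aggregated rows."""
    row = db.execute(
        "SELECT hit_count FROM archive WHERE site_id = ?", (site_id,)
    ).fetchone()
    return row[0] if row else 0


track_hit("A", 1, "/home")
track_hit("B", 1, "/about")
track_hit("B", 2, "/home")
run_archiving()
print(report(1), report(2))
```

In this model nothing is pushed back to the web servers and no intermediate file is involved: the servers only read from and write to the one database.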

The only file that must be replicated across all Piwik servers is config/config.ini.php.
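One way to check that the copies really match is to compare the `[database]` section on each server. The sketch below assumes the file is INI-style with a `[database]` section containing keys such as `host`, `username`, and `dbname`. The sample contents and the helper names (`database_settings`, `same_database`) are made up for illustration, and in practice the text would be read from each server's copy of the file.

```python
import configparser


def database_settings(config_text):
    """Parse the [database] section of a config.ini.php-style file."""
    # strict=False tolerates repeated keys; interpolation=None leaves '%' alone.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(config_text)
    return dict(parser["database"])


def same_database(text_a, text_b):
    """True when both config texts carry identical [database] settings."""
    return database_settings(text_a) == database_settings(text_b)


# Hypothetical file contents for servers A and B.
server_a = '''
[database]
host = "db.example.internal"
username = "piwik"
dbname = "piwik"
'''
server_b = server_a.replace("db.example.internal", "localhost")

print(same_database(server_a, server_a))  # identical copies
print(same_database(server_a, server_b))  # B points at a different host
```

Note that `configparser` keeps the surrounding quotes in the values, which is fine for a comparison like this one.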

Otherwise, everything else is in the database (including sessions), so there is no need to share other files between the various Piwik servers.

Matt,

Thanks for answering back. To go further with this, and please correct me if I have things wrong: the sessions or “hits” reach the PIWIK web server and are pushed directly to the DB server that the config file points to. The data is then taken out of the DB server by the cron job and pushed back to the PIWIK web servers for display? If this is the case, can we have both A and B run cron jobs to balance out the load? How can you set up two different cron jobs? Would you just have server A handle X sites and server B handle the other X?

I found session data at the path below. Is it right for this to be on one of the PIWIK servers, or should I have found it on the DB server?

/htdocs/piwik/tmp/sessions

Piwik session data is stored in the DB as of the 1.5 release.

For help with the archive.sh cron job, see "How to Set up Auto-Archiving of Your Reports" (Analytics Platform, Matomo) and the discussion in "archive.sh could execute in multithreaded mode for better performance" (Issue #2563, matomo-org/matomo on GitHub), etc.
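As for the idea raised earlier of having server A archive some sites and server B the rest, the split itself can be sketched as below. This is only an illustration of the partitioning: `split_sites` is a hypothetical helper, and whether the archive script in a given Piwik version can be restricted to a list of sites is an assumption to verify against the documentation and issue mentioned above.

```python
def split_sites(site_ids, servers):
    """Assign site IDs to servers round-robin so each gets a similar share."""
    assignment = {server: [] for server in servers}
    for index, site_id in enumerate(sorted(site_ids)):
        assignment[servers[index % len(servers)]].append(site_id)
    return assignment


# Hypothetical site IDs and server names.
plan = split_sites([1, 2, 3, 4, 5], ["A", "B"])
print(plan)  # {'A': [1, 3, 5], 'B': [2, 4]}
```

Each server's cron job would then be responsible only for its own list, so no site is archived twice.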

Thanks for the useful information.

Do mind that database sessions will be reverted; cf. "Piwik_Session: restore support for file-based sessions" (Issue #2602, matomo-org/matomo on GitHub).

If you need database sessions, please consult "Installation" (Analytics Platform, Matomo).