Matomo log import very slow since update to Matomo 4

Hello,

We are using a cron job to upload log files from our Nginx reverse proxy to Matomo.

After we upgraded Matomo to version 4, the log import via the import_logs.py script is extremely slow and caused high CPU load on our database server. Before the update everything worked fine.

When running an import, only every 30 to 80 seconds records are written. The remaining time we only see 0 records/sec. Like here:

3350 lines parsed, 274 lines recorded, 2 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 2 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 2 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 2 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 2 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 2 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 2 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 2 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 2 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 2 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 2 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 2 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 2 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 2 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 2 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 2 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 2 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 2 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 2 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 2 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 2 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 2 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 2 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 1 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 1 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 1 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 1 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 1 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 1 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 1 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 274 lines recorded, 1 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 775 lines recorded, 5 records/sec (avg), 501 records/sec (current)
3350 lines parsed, 775 lines recorded, 5 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 775 lines recorded, 5 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 775 lines recorded, 5 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 775 lines recorded, 5 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 775 lines recorded, 5 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 775 lines recorded, 5 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 775 lines recorded, 5 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 775 lines recorded, 5 records/sec (avg), 0 records/sec (current)
3350 lines parsed, 775 lines recorded, 5 records/sec (avg), 0 records/sec (current)

We also tried to restore the database to a new server with MySQL Version 8.0.23, but there we could see the same behavior.

On a new, empty database the import of the logfiles works like a charm.

Here is the command we use to upload our log files:
sudo python3 /var/www/matomo/misc/log-analytics/import_logs.py --url=http://<site-URL> --add-sites-new-hosts --recorders=4 --force-lowercase-path <log file>
We also tried to use different values for recorders, also with no difference.

When monitoring the MySQL database we saw the following queries during the import which where slow and could cause the problem:

Info about the system:

MySQL:

  • Version: 5.7.26
  • CPU: 4Cores
  • MEM: 32GB
  • Table size:
    – piwik_log_link_visit_action: ~5,8GB
    – piwik_log_visit: ~150MB
    – piwik_archive_blob_XXXX: ~80MB
    – piwik_archive_numeric_XXXX: ~25MB

Matomo Server:

  • Matomo Version: 4.2.1
  • OS: Ubuntu 20.04.2 LTS
  • CPU: 2 Cores
  • MEM: 4GB
  • PHP: 7.4.3
  • Python: 3.8.5
  • Webserver: Apache/2.4.41
    The System check shown no errors.

Log file:

  • Type: Nginx access log file
  • Size: ~330 MB, ~1000000 lines

Can someone please advise us on how to fix this problem?

Thank you

Hello,

we found a solution for the problem with slow performance during the import.

We updated the index ‘index_idsite_idvisitor’ of the table ‘piwik_log_visit’ with the following SQL statement:

ALTER TABLE `piwik_log_visit`
DROP INDEX `index_idsite_idvisitor` ,
ADD INDEX `index_idsite_idvisitor` (`idsite` ASC, `idvisitor` ASC, `visit_last_action_time` ASC);

This increased our import performance from 4-5 records/sec to ~450 records/sec.

The query_time for the select statement that caused the slow performance (see screenshot in the previous post) decreased from ~0,28 sec to ~0,0002 sec.

1 Like

Hi,
Out of curiosity could please explain how you arrived to this particular statement? Did you try making changes to mysql conf etc?

Hi,
we used the slow query log to monitor what happens when Matomo imports data, and so we found the Query (the ‘select’ posted above), which slowed the process down.
With this knowledge, we tried out some changes to the MySQL conf but could not solve it this way.
So we started looking into the Query itself to optimize it.

Hi,

This might be related to

This might help

To workaround this performance issue set [Tracker]enable_userid_overwrites_visitorid=0 in your config/config.ini.php if userId is used. If userId is not used, there is currently no workaround.