Matomo + MariaDB Galera Cluster Cross Colo (East/West Coast), Op Timeout/DeadLock

DavidDPD · March 13, 2023, 6:07pm

Now that it is finally running under load, in an East-West Coast USA MariaDB Galera Cluster, I pretty regularly get:

[Warning] WSREP: Failed to report last committed 66266a7c-f62f-11ec-86bf-bfd29572294b:33450999, -60 (Operation timed out)

(According to MariaDB/Galera … this is OK: it is just a warning, the operation is supposed to be retried, and the retry must succeed, since no error is logged.)

The config is

Baremetal/“internal” cloud (Xen XCP-NG), own colos.
** Cogent Transit, 10Gbps, over IPSEC VPN via Juniper SRX
** ping over IPSEC ~ 70 ms
MariaDB (10.6.10) with Galera Cluster 26.4.12
** five nodes: two WEST, three EAST
** Using HAProxy to distribute load, but by connections, so 90%+ of DB connections go to node1.
** FreeBSD 13-stable (as a jail) on bare metal
Web servers (6x, three each coast)
** FreeBSD 13-stable
** running under Xen (XCP-NG)

At some point, some process blocks, holds a lock, or runs long, and ultimately I get:

[Warning] Aborted connection 0 to db: 'unconnected' user: 'unauthenticated' host: 'connecting host' (Too many connections)

And the entire cluster locks up.

The workaround right now, which has been running for about 7-8 days without issue, is that I redirected all the traffic to the three EAST Coast web servers, so all inserts happen on the EAST Coast DB nodes. This is obviously not a solution.

I suspect this is a lock or block in …
console core:archive

As of writing this, I had core:archive running in both colos (and thus against different DB nodes) every 5 minutes. It is possible that some of these jobs ran longer than 5 minutes and overlapped. core:archive may not have locking, so it may need a guard that ensures only one core:archive process is running at a time.
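
A minimal sketch of such a single-instance guard, assuming a small Python wrapper launched from cron instead of calling the console directly. The lock path, the script name and the install path in the usage note are hypothetical, and this is not something Matomo ships. It uses a non-blocking `flock`, so a run that starts while the previous one is still going simply skips. Note that the lock is local to one host: it would not stop the two colos from overlapping, which would need a shared lock or scheduling the archive from one colo only.

```python
import fcntl
import os
import subprocess
import sys

# Hypothetical lock file location; any path writable by the cron user works.
LOCK_PATH = "/tmp/matomo-archive.lock"


def run_exclusive(cmd, lock_path=LOCK_PATH):
    """Run cmd only if nobody else on this host holds the lock.

    Returns the command's exit code, or None if the run was skipped
    because another process already holds the lock.
    """
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        try:
            # Exclusive + non-blocking: fail immediately instead of queueing up.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return None
        return subprocess.run(cmd).returncode
    finally:
        # Closing the descriptier releases the lock, even if the command failed.
        os.close(fd)


if __name__ == "__main__" and len(sys.argv) > 1:
    code = run_exclusive(sys.argv[1:])
    sys.exit(0 if code is None else code)
```

From cron this could look like `*/5 * * * * python3 archive_guard.py php /path/to/matomo/console core:archive`, where both `archive_guard.py` and the Matomo path are placeholders.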

I’m also looking at possibly moving to plugin-QueuedTracking, though I would love for it to be RabbitMQ-based, with active pub/sub instead of polling/batch.

Is anyone else using Matomo + MariaDB Galera Cluster across medium- to high-latency connections? Any tips or hints, or could you share your config/setup?

Thanks.

heurteph-ei · March 27, 2023, 1:00pm

Hi @DavidDPD
I personally don’t use MariaDB.
But to take load off the DB server, I would suggest using the Queued Tracking plugin, as you mentioned. Also upgrade to the latest version of Matomo, as performance improvements are made from time to time…