Just recently we noticed that the network link of our Redis instance can become utilized with more than 100 MB/s of outgoing data versus only ~2 MB/s of incoming data. The Redis instance becomes a bottleneck in the system.
If we move from the shared Redis cache to a file-based cache backend, the link saturation immediately drops significantly.
Is it normal that Matomo needs to cache this much data, and are there any suggestions for improving this setup?
What are the drawbacks of using file caches on every load-balanced tracking request server, against the recommendations of the FAQ?
Would it help to chain in the file cache (array -> file -> redis-server) in the chained setup?
Yes, that's possible. You can chain in the file cache; however, the problem is that you then need to synchronize the deletion of the file caches across servers. With several servers, the file cache would be deleted on one server but not on the others. I'd recommend getting in touch with www.innocraft.com, where they possibly have some features around that.
Matomo itself deletes these caches regularly. There are many different caches with different TTLs: some have a TTL of 5 minutes, some 1-4 hours, etc. They also get deleted based on certain actions in the UI, and if the files are then not deleted across all servers, some random problems will very likely occur. Yes, tracking also uses this cache (especially tracking; it is important for fast tracking), as do the reporting parts.
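For reference, a chained setup like the one described above would go into config/config.ini.php roughly as follows. This is only a sketch based on Matomo's documented [Cache], [ChainedCache], and [RedisCache] settings; the host, port, and database values are placeholders to replace with your own:

```ini
[Cache]
backend = chained

[ChainedCache]
; order matters: fastest backend first, each miss falls through to the next
backends[] = array
backends[] = file
backends[] = redis

[RedisCache]
host = "127.0.0.1" ; placeholder, point at your shared Redis instance
port = 6379
timeout = 0.0
database = 14
```

With this chain, most reads are served from the in-process array cache or the local file cache, and Redis is only hit on a local miss. That is exactly why it reduces outgoing Redis traffic, and also why stale files on individual servers become a problem.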
Our m5.large Redis instance at AWS is transferring, at peak, 500 GB/hour (10 GB/minute, ~170 MB/second). At the same time, the number of read-only operations (get, hget, scard, lrange, etc.) is around 9,000 ops/sec.
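If you want to cross-check the CloudWatch network metrics against Redis' own counters, the INFO stats section exposes the matching throughput figures (the host below is a placeholder):

```bash
# Compare Redis' own view of traffic and command rate with CloudWatch
redis-cli -h your-redis-host INFO stats \
  | grep -E 'instantaneous_(ops_per_sec|input_kbps|output_kbps)'
```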
We run Matomo in an HA, multi-AZ setup where it is split into the following components/containers, which scale independently:
tracker api - 1-32 instances; receives all tracker API (matomo.js and matomo.php) requests and puts them into Redis (QueuedTracking plugin)
web ui - handles all non-tracker-API requests
process containers - 1 to 16 instances, based on the current load (QueuedTracking plugin)
archive container - 1 instance, running the core:archive command every 1800 s (see the sketch below)
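The archive container boils down to a simple loop around the console command (a sketch; the install path and instance URL are placeholders):

```bash
#!/bin/bash
# Entrypoint of the archive container: re-run archiving every 1800 s.
while true; do
  php /var/www/matomo/console core:archive --url=https://matomo.example.com/
  sleep 1800
done
```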
Currently this is a killer cost: cross-AZ traffic is $40-50 a day, and we currently have only one site on which we are testing Matomo.
The quick fix for us is to run everything in one AZ to limit the cross-AZ transfer cost, but we would like to understand the mechanics of the Redis cache better, since this won't scale for us long term.
@fkaufmann did you find a solution that was sufficient for your workload and requirements?
@thomas_matomo is there anything around the Redis cache that can be improved?
We have been able to pinpoint this as an issue with QueuedTracking of some sort.
When we activate the plugin, we see an uneven distribution of network bytes in vs. out on our Redis instance: 310 MB/min of traffic is going out of Redis, while only 10 MB/min is going in. During our peaks the values are 7.82 GB/min out vs. 250 MB/min in. That is a 31x difference between out and in traffic, which doesn't make sense.
We will continue to debug this issue and report back if we identify anything. Our initial research points to a high number of GET operations on items managed by the RedisCache backend in Matomo core, issued just after a tracking request is LPUSHed to Redis.
If we change the Cache => backend type in Matomo core to "file", we don't see this traffic pattern at all (as expected). But it is unclear to us why there need to be so many GET operations against Redis just after an LPUSH.
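One way to quantify the GET-to-LPUSH ratio is a short MONITOR capture (a sketch; MONITOR has significant overhead, so only run it briefly, and the host is a placeholder):

```bash
# Capture ~10 seconds of commands hitting Redis, then count them by type.
redis-cli -h your-redis-host MONITOR > monitor.log &
MONPID=$!
sleep 10
kill "$MONPID"

# The 4th field of each MONITOR line is the command name, e.g. "GET" or "LPUSH"
awk '{print $4}' monitor.log | sort | uniq -c | sort -rn
```

The same log also shows which cache keys are fetched after each LPUSH, which helps identify what the RedisCache reads per tracking request.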
Our config for RedisCache is as follows:

```ini
backend = chained
```
Tore, we had a similar challenge, and we managed to get higher throughput with the help of 16 worker queues, each processing 100 requests. Refer to this thread for more details; a sketch of such a worker setup follows below.
It also depends on where you run the cron jobs (queue workers) that process the queues. If you run them on the same machine as the DB server, they will impact the MySQL processing speed; instead, it is recommended to have a separate server dedicated to processing the cron jobs.
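As mentioned above, a dedicated processing host would start one worker per queue, roughly like this (a sketch; it assumes QueuedTracking is configured with 16 queues, and the install path is a placeholder):

```bash
#!/bin/bash
# Start one QueuedTracking worker per queue (queue ids 0-15) and wait for all.
for QUEUE_ID in $(seq 0 15); do
  php /var/www/matomo/console queuedtracking:process --queue-id="$QUEUE_ID" &
done
wait
```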
Processing of the queue is not the problem. Our issue is that Matomo requests 31x more data from Redis when inserting a tracking request into Redis: a tracking request of e.g. 10 KB turns into 10 KB written to Redis and 310 KB read from Redis, which IMO doesn't make sense for a single tracking request written to the queue.