I know that it says you can use it. But the problem is that once you go that route, you need to be very aware of how the tmpfs is used. We used the tmpfs initially too, as described in the manual.
The tmpfs can be used to speed things up by keeping them in RAM, that is for sure. However, when running concurrent archivers and concurrent requests per website, you can run out of tmpfs space. So either you monitor that very closely, or you run fewer concurrent archivers.
tmpfs is used to store intermediate results from queries, for example parts of queries that need joining. However, if you have, let’s say, 4 concurrent archivers running and 5 website requests per archiver, then you can have 20 concurrent queries running, all using the same tmpfs to “speed up” your work.
When the tmpfs runs low on space, your queries will start to wait for each other. You can even end up deadlocking your queries completely, so that the archivers just stop and wait forever.
Yes, I have seen all of these situations occur. So I went back to temporary storage on a high-speed SSD, rather than a tmpfs in 64GB of memory. However, if you have plenty of memory, then please use RAM. But remember to monitor your tmpfs usage in that case. It should never be more than 80% full at any point.
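To make the 80% rule concrete, here is a minimal sketch of a usage check you could run from your monitoring. The function names and the example mount point are my own assumptions, not anything Matomo or the database ships with; point it at wherever your database's temporary directory is actually mounted.

```python
import shutil

# Threshold taken from the advice above: the tmpfs should not reach 80% full.
THRESHOLD = 0.80


def fraction_used(total_bytes, used_bytes):
    """Return used space as a fraction between 0.0 and 1.0."""
    if total_bytes <= 0:
        return 0.0
    return used_bytes / total_bytes


def tmpfs_fraction_used(mount_point):
    """Read the current usage of the filesystem that holds mount_point."""
    usage = shutil.disk_usage(mount_point)  # named tuple: total, used, free
    return fraction_used(usage.total, usage.used)


def is_over_threshold(fraction, threshold=THRESHOLD):
    """True once usage has reached the warning threshold."""
    return fraction >= threshold


# Example (hypothetical mount point, adjust to your setup):
#   if is_over_threshold(tmpfs_fraction_used("/var/lib/mysql-tmp")):
#       print("tmpfs is getting full, reduce concurrent archivers")
```

Run it every minute or so while the archivers are busy; a single check at a quiet moment tells you nothing about the peaks.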
Also, if you have high-volume traffic, the queries simply take up a lot of tmpfs space; in that case you really do not want a memory-based tmpfs either. Even a single archiver with a single query could potentially fill up your tmpfs.
And once that happens, the only fast way to recover is to restart your database.
So my advice: do not use tmpfs unless you have a lot of memory to spare and closely monitor your tmpfs usage.
The options you ask about:
“It is very important to me if there is an option to create an archive with these options for only one website ID and not for all.”
Yes, it is possible to set up crontabs to archive just a single siteid at a time. This command does the trick:
Read this too: How to Set up Auto-Archiving of Your Reports (Matomo User Guide)
This gives you full control over which siteids are run. I recommend setting up a shell script that executes them in order. Do not try to schedule each siteid as its own cron entry: you would be assuming that after a certain period of time the previous archiver has finished its work, and you can never assume that. Sometimes queries just take a very long time; it depends entirely on the number of segments and the volume of data that needs processing.
Do schedule the archiver work on a regular basis. Right now I use a cronjob on our archiver server that runs the processing of ALL siteids; it is launched every 15 minutes. The console command only starts more archivers if the maximum number is not already running.
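The "run them in order" idea could look roughly like this, sketched in Python rather than shell. Treat it as an illustration under assumptions: the console path is hypothetical, and I am assuming the `core:archive` command with its `--force-idsites` option, so check `./console core:archive --help` on your version (older versions may also need `--url`).

```python
import subprocess

# Hypothetical install path, adjust to your own Matomo location.
CONSOLE = "/var/www/matomo/console"


def build_archive_command(site_id, console=CONSOLE):
    """Build the archiver command line for exactly one siteid."""
    # Assumption: core:archive accepts --force-idsites=<id>.
    return ["php", console, "core:archive", f"--force-idsites={site_id}"]


def archive_sites_in_order(site_ids, console=CONSOLE, run=subprocess.run):
    """Archive each siteid one after another and collect the exit codes.

    subprocess.run blocks until the command has finished, so the next
    siteid only starts once the previous archiver is really done,
    however long its queries take.
    """
    exit_codes = {}
    for site_id in site_ids:
        completed = run(build_archive_command(site_id, console), check=False)
        exit_codes[site_id] = completed.returncode
    return exit_codes


# Example: archive_sites_in_order([3, 1, 7])
```

The point of the design is that nothing here relies on timing. The order is enforced by waiting for each process to exit, not by guessing how long it will take.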
The options we use right now:
- concurrent-archivers = 4
- concurrent-requests-per-website = 5
It is important to realize that this means you can have up to 4 * 5 = 20 queries hitting your database at the same time. This means your database tuning needs to be done well. This is where a master-slave setup comes into play. You do not want your ingest traffic and archiver queries running on the same database; that is asking for performance problems.
So if you are doing high volume, you do want to split the database into incoming write traffic on the master database and queries on a read slave database. Then on the read database you run as many io_read_threads as you can. Match them to the number of CPUs.
So let’s say you have a slave running on 32 CPUs and you run 20 queries at a time. Remember that every query needs a single CPU to perform optimally. Then you might want to set up 20 io_read_threads and 12 io_write_threads. Now, keep in mind that the MariaDB / MySQL thread scheduler needs some threads of its own. But you want the io_wait to be as low as possible on your database.
Then, about the limits from the FAQ: yes, if you raise those limits to the suggested 5000, that will cause a lot of processing. Also remember that when you do so, the same limits apply to your segments as well. You might want to experiment a little with lower limits.
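The arithmetic behind those numbers, as a small sketch. The function names are made up for illustration, and the split rule (one read thread per concurrent query, the remaining CPUs for write threads) is only my rule of thumb from above, not an official tuning formula. I am also assuming that the thread settings meant here are InnoDB's `innodb_read_io_threads` and `innodb_write_io_threads`.

```python
def max_concurrent_queries(concurrent_archivers, requests_per_website):
    """Worst-case number of archiver queries hitting the database at once."""
    return concurrent_archivers * requests_per_website


def split_io_threads(cpus, concurrent_queries):
    """Split CPUs into (read_threads, write_threads).

    One read thread per concurrent query, capped at the CPU count;
    whatever is left over goes to write threads. This ignores the
    threads the database scheduler needs for itself, so treat the
    result as a starting point for experiments.
    """
    read_threads = min(concurrent_queries, cpus)
    write_threads = cpus - read_threads
    return read_threads, write_threads


queries = max_concurrent_queries(4, 5)    # 4 archivers * 5 requests each
reads, writes = split_io_threads(32, queries)
print(queries, reads, writes)             # prints: 20 20 12
```

If you change either archiver option, rerun the numbers before touching the database config, because the worst case grows as a product and not as a sum.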
I have no clue if you can run these limits for a single siteid… maybe @matthieu could comment on that himself.