Is the MyRocks MariaDB storage engine a good follow-up for TokuDB?

To lower the storage requirements for large Matomo installations, matomo.org has an FAQ on using TokuDB as an alternative to the default storage engine in MariaDB/MySQL (InnoDB). (https://matomo.org/faq/how-to-install/faq_20200/)

However, TokuDB was marked as deprecated quite some years ago. (https://mariadb.com/kb/en/tokudb/)

The successor to TokuDB is MyRocks. But reading through the MariaDB release notes, the InnoDB storage engine has also had some updates over the last few years. I cannot find any other reference to TokuDB in the Matomo source code, so I assume that changing the database type in config.ini.php from InnoDB to another engine is passed on directly to the database creation calls.

Is anybody using MyRocks today? There was one similar question on this forum in 2019, without any replies. (MyRocks as database?)

It concerns me a bit that matomo.org is still suggesting TokuDB when it was deprecated at least 2-3 years ago…

Hi @fredvd
Maybe @blankko1 did some tries and could share his experience?

We are using Percona Server 5.7 with the TokuDB engine, which is supported until 2024.

There is nothing to do on the Matomo side; all the configuration is on the MySQL side (install the TokuDB engine, force the engine on tables).
Edit: based on the FAQ 20200 you linked, I see that the engine can now be configured on the Matomo side. Back when TokuDB was released I don’t think this was a possibility, so we used the Percona Server feature to force the engine used on tables. I wouldn’t use any other server in production by the way, whether MariaDB or MySQL.

I have not used MyRocks but would be interested to know how it performs. Several years ago when we moved to TokuDB, the storage and performance benefits were massive, on the order of several tens of percent, on a very busy instance.
It is true that InnoDB has made progress as well, but I am not sure we’re talking about several tens of percent.

So unfortunately I cannot help you on MyRocks but the experience with TokuDB was very positive.


I tried playing around with MyRocks for Matomo ~6 months back. But I actually saw an increase in disk writes after switching.
I did some research and quickly found out that there were ~50 performance tuning variables you could and should tweak in MyRocks to make it suit your needs. It became obvious that it needed a lot more time and dedication to get running nicely.
So I switched back to InnoDB for the time, as I didn’t have the understanding or the time to learn how to tune MyRocks.

However the storage savings were significant. I don’t remember the exact improvement, but I remember being very impressed.

It seems strange that there is so little (in fact no) information or publicly documented experience on running Matomo on MyRocks.


Okay, I actually used this occasion to try switching again, this time on MariaDB 10.9 instead of 10.8.
I also spent some more time getting familiarised with the most essential config variables.

The instance that I switched over had a DB size on disk of 38 GB. After the switch it shrank to 23 GB.
The first rule of optimising MyRocks is to not use case-insensitive collations, so converting all tables from utf8mb4_general_ci to utf8mb4_bin took almost a day of trial, error, and waiting.
After the conversion the database on disk is now 18 GB.
The memory usage allowed for MariaDB with InnoDB was 36 GB, and it used all of it. After the switch and 24 hours of uptime, it is now consuming 8 GB, while being allowed to use 32 GB for rocksdb_block_cache_size.
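
To put the disk numbers above in relative terms, here is a small sketch of the arithmetic. It only uses the three sizes quoted in this post; the function and variable names are mine, purely for illustration.

```python
def percent_saved(before_gb: float, after_gb: float) -> float:
    """Return the relative size reduction in percent."""
    return (before_gb - after_gb) / before_gb * 100


# Sizes on disk as reported in the post (GB).
innodb_gb = 38        # InnoDB, utf8mb4_general_ci
rocksdb_ci_gb = 23    # MyRocks, still utf8mb4_general_ci
rocksdb_bin_gb = 18   # MyRocks, after converting to utf8mb4_bin

engine_saving = percent_saved(innodb_gb, rocksdb_ci_gb)
collation_saving = percent_saved(rocksdb_ci_gb, rocksdb_bin_gb)
total_saving = percent_saved(innodb_gb, rocksdb_bin_gb)

print(f"Engine switch alone:  {engine_saving:.1f}% smaller")
print(f"Collation conversion: {collation_saving:.1f}% smaller again")
print(f"Overall:              {total_saving:.1f}% smaller")
```

So roughly 39% came from the engine switch alone, and just over half of the original footprint was gone after the collation change.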

The large date range requests in Matomo have become a tiny bit slower (~5%), and CPU consumption related to the tracking events has increased by 20-25%, which can probably be explained by the more intense compression performed during inserts.

I have not tweaked any column family compression strategies. But it might be possible to get the CPU consumption back down to previous levels, if you use some less intensive compression strategy. However this is actually perfect for our use case, so I won’t spend more time tweaking it for now.

I’m gonna give this some months and see how it performs, and if no weird behaviour pops up, we’ll probably stick with it, and roll it out to more instances.

The conversion to a binary collation made me a bit nervous, but it seems Matomo handles it like a champ (which makes me wonder why a binary collation wasn’t chosen by Matomo to begin with. No matter what engine, a binary collation must be much more efficient for doing comparisons in the SQL layer).
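
For anyone wanting to try the same conversion, here is a rough sketch of how the per-table statements could be generated. It is not what I ran verbatim: the helper name and the example table names are only illustrative, and it assumes every table can be converted with a plain `ALTER TABLE`, which glosses over the trial and error mentioned above. Take a backup and test on a copy first.

```python
def conversion_statements(tables, engine="ROCKSDB",
                          charset="utf8mb4", collation="utf8mb4_bin"):
    """Build ALTER TABLE statements that switch the storage engine and
    then convert each table to a binary collation.

    Hypothetical helper: it only builds SQL strings, it does not connect
    to or modify any database.
    """
    statements = []
    for table in tables:
        statements.append(f"ALTER TABLE `{table}` ENGINE={engine};")
        statements.append(
            f"ALTER TABLE `{table}` "
            f"CONVERT TO CHARACTER SET {charset} COLLATE {collation};"
        )
    return statements


# Example table names assuming the default "matomo_" prefix.
example_tables = ["matomo_log_visit", "matomo_log_link_visit_action"]

for statement in conversion_statements(example_tables):
    print(statement)
```

The output can be reviewed by hand and fed to the `mariadb` client one table at a time, which makes it easier to see which table a failure belongs to.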


In case anyone finds this post, in their search for RocksDB/MyRocks answers, I’m gonna give a quick update on my experience.

I have had to switch back to InnoDB. The reason being that Matomo issues a butt-ton of GET_LOCK commands, which seem to be kryptonite to RocksDB. It stalls completely, trying to acquire these locks for Matomo. The locks have a set timeout of 60 seconds, so during these 60 seconds any other query trying to access the same resources will also just stall.
I have no insight into why these user locks are necessary. Maybe they have been implemented due to some method of interacting with Postgres or InnoDB, but they kill any chance of using Matomo with RocksDB.
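
To illustrate the stall pattern (not the RocksDB internals, which I have not dug into), here is a toy in-process analogy of a named lock with a timeout. `GET_LOCK(name, timeout)` in MariaDB returns 1 when the lock is acquired and 0 when the timeout expires; the `NamedLocks` class below is a simplified stand-in I made up for illustration, with no database involved, and the lock name is hypothetical.

```python
import threading
import time


class NamedLocks:
    """Toy stand-in for server-side user locks. Not MariaDB code."""

    def __init__(self):
        self._locks = {}
        self._guard = threading.Lock()

    def get_lock(self, name, timeout):
        """Return 1 if the named lock was acquired, 0 on timeout."""
        with self._guard:
            lock = self._locks.setdefault(name, threading.Lock())
        return 1 if lock.acquire(timeout=timeout) else 0

    def release_lock(self, name):
        """Return 1 if a held lock was released, 0 otherwise (simplified)."""
        with self._guard:
            lock = self._locks.get(name)
        if lock is None or not lock.locked():
            return 0
        lock.release()
        return 1


locks = NamedLocks()
lock_name = "matomo.example"  # hypothetical lock name

first = locks.get_lock(lock_name, timeout=0.2)   # acquired immediately

start = time.monotonic()
second = locks.get_lock(lock_name, timeout=0.2)  # blocks until the timeout
waited = time.monotonic() - start

released = locks.release_lock(lock_name)
third = locks.get_lock(lock_name, timeout=0.2)   # acquired again after release

print(first, second, released, third, f"waited {waited:.2f}s")
```

Scale the 0.2 seconds up to the 60 second timeout mentioned above and it becomes clear why everything queued behind such a lock appears to hang.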

It really is too bad, because the disk space and IO usage with RocksDB is absolutely awesome. In a cloud world, it would also save a small fortune in IO related costs.

This chart from Percona’s testing back in 2018 should speak for itself.

I hope the project will at some point start looking into making RocksDB a viable option, as it seems like a match made in heaven and only a matter of time before RocksDB starts grabbing more InnoDB market share. Just not at the moment, due to the user-acquired locks that seem to be scattered throughout the Matomo code base.