Read vs Execute DB Connections


#1

A great feature might be to allow editing the config for separate DB connections for Writes (Create, Insert, Update, etc…) from Reads (Select).

We have a couple of servers set up to handle the DB with replication. Unfortunately, our (pseudo) Slave still needs write permissions to handle things like session storage, dashboard changes, etc… In the interim, I removed the tables that the UI updates from the Master to avoid breaking replication.

Would be great to have something like the following in the config:

[database_write]
host = "piwik_master"
username = "user"
password = "password"
dbname = "piwik"
tables_prefix = "piwik_"
adapter = "PDO_MYSQL"
port = 3360
charset = "utf8"

[database_read]
host = "piwik_slave"
username = "user"
password = "password"
dbname = "piwik"
tables_prefix = "piwik_"
adapter = "PDO_MYSQL"
port = 3360
charset = "utf8"

If I get around to a clean solution, I'll submit it - just thought I'd get the suggestion out there in case others run into similar situations.
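To make the suggestion concrete, here is a minimal sketch of how the two proposed sections could drive connection selection. It assumes the section and key names from the config above; the verb-based routing rule is a simplified assumption, not Piwik's actual behavior.

```python
# Route a SQL statement to the read or write connection settings,
# parsed from an INI config shaped like the proposal above.
from configparser import ConfigParser

# Statements whose leading verb mutates data go to the write connection.
WRITE_VERBS = {"insert", "update", "delete", "create", "alter", "drop", "replace"}

def connection_settings(sql: str, config: ConfigParser) -> dict:
    """Return the config section to use for this statement's connection."""
    verb = sql.lstrip().split(None, 1)[0].lower()
    section = "database_write" if verb in WRITE_VERBS else "database_read"
    return dict(config[section])

config = ConfigParser()
config.read_string("""
[database_write]
host = piwik_master
port = 3360

[database_read]
host = piwik_slave
port = 3360
""")

print(connection_settings("SELECT * FROM piwik_log_visit", config)["host"])
# piwik_slave
print(connection_settings("UPDATE piwik_user SET login = 'x'", config)["host"])
# piwik_master
```

A real implementation would also need to pin a session to the write connection after a write, so reads don't race replication lag.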


(Matthieu Aubry) #2

Thanks for starting the discussion - it is certainly interesting. The most advanced doc we have so far is: New to Piwik - Analytics Platform - Matomo

Would you like to contribute an improvement to this FAQ and add your setup, which I think is different? Are there special MySQL settings to exclude some Piwik tables from replication? Could this be a feature of Piwik, or at least be documented?


#3

Hey Matt

I’ll take a closer look at the docs you linked and see what we’d be able to contribute. I’m a little hesitant to recommend our configuration only because it’s fairly new and I don’t think I’ve gathered enough data to justify it yet. I believe there is a lot of value in Piwik allowing multiple database connections, and in turn replication, but I’d like to build a better proof of concept first.

My goal was to transition from our legacy in-house tracking platform to Piwik, which is far more robust. I needed to allow for tracking of 30+ sites and 500k or more page views a day, with the real challenge of keeping reporting as close to real-time as possible. We are hosting with Amazon and already had a couple of dedicated database servers to work with, so I set up three databases: a Master, an Archive, and a Slave. The idea was to separate external (tracking) connections from the overhead caused by the archiving process, and both of those from our own internal connections.

I created a solution yesterday for switching between a Master and Slave (or defaulting to the standard database connection if a master is not specified). So far, it appears to be working - however, the two things I am not sure about are how it affects Profiling and how it affects the event hooks. Can you have more than one object tied to a single hook (i.e. Reporting.createDatabase)? As for Profiling, I believe those functions would need to be rewritten to also check whether a master DB was set, and then either display each database connection independently or aggregate the Profiling data.
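The switching-with-fallback logic described above could be sketched roughly like this. The section names (`database_master`, `database_slave`) are illustrative assumptions, not Piwik's actual config keys.

```python
# Hedged sketch: prefer a dedicated master/slave section when one is
# configured, and fall back to the standard [database] section otherwise.
from configparser import ConfigParser

def pick_section(config: ConfigParser, is_write: bool) -> str:
    """Choose the config section for a query, with a safe default."""
    if is_write:
        # Default to the standard connection when no master is specified.
        return "database_master" if config.has_section("database_master") else "database"
    return "database_slave" if config.has_section("database_slave") else "database"

cfg = ConfigParser()
cfg.read_string("[database]\nhost = piwik_db\n")
print(pick_section(cfg, is_write=True))   # database  (no master configured)

cfg.read_string("[database_master]\nhost = piwik_master\n")
print(pick_section(cfg, is_write=True))   # database_master
```

The same default-to-standard rule keeps a single-server install working with an unmodified config file.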


(Matthieu Aubry) #4

It’s OK if profiling does not work for now. As for the hook question, let me know if you have any more specifics.


(Jon) #5

Hi,

just found your message about splitting reads and writes across different servers.
Although it’s an old topic, do you still use this setup?
We have a three-server cluster with one master.
Our idea is to send all write queries to the master and all read queries to a load balancer in front of the three servers.
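That routing idea can be sketched in a few lines. The host names below are placeholders for your master and load balancer endpoints, and the verb check is a simplified assumption (a real router must also keep transactions and post-write reads on the master).

```python
# Hedged sketch: writes go to the single master, reads go to a load
# balancer that fronts the three servers. Hosts are placeholders.
WRITE_HOST = "piwik_master"        # the one master in the cluster
READ_HOST = "piwik_read_balancer"  # load balancer over the three servers

# Leading verbs that are safe to serve from a read replica.
READ_VERBS = {"select", "show", "describe", "explain"}

def host_for(sql: str) -> str:
    """Pick a host by inspecting the statement's leading verb."""
    verb = sql.lstrip().split(None, 1)[0].lower()
    return READ_HOST if verb in READ_VERBS else WRITE_HOST

print(host_for("SELECT * FROM piwik_log_visit"))   # piwik_read_balancer
print(host_for("INSERT INTO piwik_log_visit VALUES (1)"))  # piwik_master
```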

Any help is welcome!

Thank you in advance