I’m experimenting with Piwik for use in a multi-domain, high-traffic production environment. I’ve seen very little discussion of what makes ‘high-traffic’ with regard to Piwik and/or how it performs under pressure.
We serve upwards of 6M page views per day. End-user experience is critical - any noticeable slowdown in site performance is unacceptable.
Any comments that anyone has to further the discussion are appreciated.
UPDATE: see the follow-up post for more info about scalable, high-traffic Piwik performance.
I am preparing an implementation plan for Piwik. The site serves millions of page views every day, so performance is my main concern.
Any suggestions on performance would be appreciated. Thanks.
1 - We run sites in a load-balanced environment. It occurred to me that the installer writes files. Is that limited to /config/config.ini.php and /config/manifest.inc.php?
2 - Writing flat files to the server (./tmp/sessions/, ./tmp/templates_c/) is out of the question. Has anyone developed a work-around for this?
3 - I’d estimate that, with our current traffic patterns (and looking at how many inserts are generated for each ‘visit’), piwik_log_visit/link_visit_action/action tables will grow to somewhere around 20-25M rows for a month. I read elsewhere in the forums that those tables don’t flush. Is that something that we’ll have to implement ourselves?
re: tmp/sessions - that’s a web server configuration issue; in your environment, you’d likely look at changing PHP’s session handler (the default is “files”) to something like memcached. In a future version (post-1.0), we may implement a database-backed store for the session handler.
I wrote the cache-on-disk code in such a way that it should be easy to add a memcache datastore (i.e. create a new class that implements the interface). I’m pretty sure adding memcache support for cached tracking data would be cheap to develop (a couple of days at most).
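To illustrate the "new class that implements the interface" idea, here is a minimal sketch in Python rather than Piwik's PHP. Every name in it (`CacheStore`, `FileCacheStore`, `InMemoryCacheStore`, `load_tracker_settings`) is hypothetical and not taken from the Piwik code base, and the in-memory class only stands in for a real memcache client.

```python
import json
import os
from abc import ABC, abstractmethod
from typing import Any, Optional


class CacheStore(ABC):
    """Hypothetical storage interface; these are not Piwik's real class names."""

    @abstractmethod
    def get(self, key: str) -> Optional[Any]:
        """Return the cached value, or None when the key is absent."""

    @abstractmethod
    def set(self, key: str, value: Any) -> None:
        """Store a JSON-serialisable value under the key."""


class FileCacheStore(CacheStore):
    """Keeps each entry as a JSON file on local disk (the flat-file approach)."""

    def __init__(self, directory: str) -> None:
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, key: str) -> str:
        return os.path.join(self.directory, key + ".json")

    def get(self, key: str) -> Optional[Any]:
        try:
            with open(self._path(key), encoding="utf-8") as fh:
                return json.load(fh)
        except FileNotFoundError:
            return None

    def set(self, key: str, value: Any) -> None:
        with open(self._path(key), "w", encoding="utf-8") as fh:
            json.dump(value, fh)


class InMemoryCacheStore(CacheStore):
    """Stand-in for a memcache-backed store; a real one would wrap a client library."""

    def __init__(self) -> None:
        self._data = {}

    def get(self, key: str) -> Optional[Any]:
        return self._data.get(key)

    def set(self, key: str, value: Any) -> None:
        self._data[key] = value


def load_tracker_settings(store: CacheStore, site_id: int) -> dict:
    """Read-through lookup that works with any CacheStore implementation."""
    key = f"tracker_site_{site_id}"
    cached = store.get(key)
    if cached is None:
        # Placeholder for the database query a real tracker would run here.
        cached = {"site_id": site_id, "excluded_ips": []}
        store.set(key, cached)
    return cached
```

The calling code depends only on the interface, so swapping the disk store for a shared one is a change at the point where the store is constructed.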
We do have ten Memcache(d) servers out there… I have a couple of programmers looking at moving sessions to those.
Nice, let us know your plans regarding this; this should be in Piwik core :) (ensuring we will maintain it, anyone can benefit, etc.)
"Did anyone have any opinions on how Piwik is going to function with piwik_log_* tables where rows number in the tens of millions?"
It is basically down to MySQL performance at this scale. Considering you can safely purge data older than 31 days, this should not be a problem in your case. To my knowledge, the highest-traffic Piwik user publicly known is Maciej, with 20M pages per month (http://lists.piwik.org/pipermail/piwik-hackers/2010-April/000859.html); he doesn’t purge the DB yet.
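A purge of rows older than 31 days could be sketched roughly as below. This is an illustration only: it uses Python's built-in sqlite3 as a stand-in for MySQL, the table and column names (`log_visit`, `visit_last_action_time`) are assumed for the example, and the helper names are made up. Deleting in small batches keeps each transaction short so the tracker's inserts are not blocked for long; on MySQL the same effect is usually achieved with `DELETE ... LIMIT n`. The related action tables would need a matching purge.

```python
import sqlite3
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"


def make_demo_db(now, days=60):
    """Build an in-memory table with one visit per day for the last `days` days."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE log_visit ("
        " idvisit INTEGER PRIMARY KEY,"
        " visit_last_action_time TEXT)"
    )
    rows = [((now - timedelta(days=age)).strftime(FMT),) for age in range(days)]
    conn.executemany(
        "INSERT INTO log_visit (visit_last_action_time) VALUES (?)", rows
    )
    conn.commit()
    return conn


def purge_old_visits(conn, now, retention_days=31, batch_size=1000):
    """Delete visits older than the retention window, one small batch at a time.

    Returns the total number of rows deleted.
    """
    cutoff = (now - timedelta(days=retention_days)).strftime(FMT)
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM log_visit WHERE rowid IN ("
            " SELECT rowid FROM log_visit"
            " WHERE visit_last_action_time < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()  # commit per batch so locks are released between batches
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total
```

Run from a nightly cron job, something like this would keep the log tables at roughly one month of data.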
My definition of easy is somewhere between a one-line fix and something I can code in an hour with one hand, while holding an infant in the other arm. ;)
If you recall, as part of http://dev.piwik.org/trac/ticket/806, I tested Piwik with the session save handler set to files, sqlite, or memcache. These are configured outside of Piwik.
Also, I didn’t test mm (memory mapped) as it doesn’t support Windows, is no longer being maintained, and reportedly has concurrency issues.
I’m also running Piwik with load-balanced PHP backends and have had no problems using only local storage.
The only problem right now is sessions, which require me to log in twice, but simply using memcached for session storage might solve this.
I know, your requirement is different but just wanted to throw in another possibility.
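The "log in twice" symptom follows from each backend keeping its own session files. A toy simulation in Python shows the effect; the `Backend` class, the backend names, and the strict round-robin balancing are all assumptions for illustration, and the shared dict stands in for a memcached-backed session store.

```python
from itertools import cycle


class Backend:
    """A PHP backend reduced to its session store (a plain dict here)."""

    def __init__(self, name, session_store):
        self.name = name
        self.sessions = session_store

    def login(self, session_id, user):
        self.sessions[session_id] = user

    def is_logged_in(self, session_id):
        return session_id in self.sessions


def simulate(backends, session_id="abc123", requests=4):
    """Log in on the first backend, then send follow-up requests round-robin.

    Returns one boolean per follow-up request: was the session recognised?
    """
    round_robin = cycle(backends)
    next(round_robin).login(session_id, "admin")
    return [next(round_robin).is_logged_in(session_id) for _ in range(requests)]


def local_backends():
    # Each backend has its own store, like the default "files" handler.
    return [Backend("php-1", {}), Backend("php-2", {})]


def shared_backends():
    # Both backends read and write the same store, like a shared memcached.
    shared = {}
    return [Backend("php-1", shared), Backend("php-2", shared)]
```

With local stores, every request that lands on the other backend sees no session; with the shared store, every request is recognised.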