I run analytics for a very high-traffic website. We have a large, in-house data warehouse and reporting environment, but development of new metrics is slow. We need an alternative that provides near-real-time metrics for newly launched site features while we wait to build them into our system of record. An acquaintance at another, even higher-traffic website uses Piwik for this and recommended it to me. He didn’t know how much storage they use, because their Piwik use is split up department by department rather than deployed across the whole site.
I’d really appreciate it if anyone could share a rule of thumb for storage as a function of page views, based on their own experience. For example, divide your total MySQL space by your total page views to date, or just give a ratio in GB per million page views.
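To make the ratio I'm asking for concrete, here is a minimal Python sketch of the arithmetic. The function names are my own and the figures in the comments are placeholders, not measurements from any real Piwik install. The byte count could come from summing `data_length` and `index_length` in MySQL's `information_schema.tables` for your Piwik schema.

```python
def gb_per_million_pv(total_db_bytes: int, total_page_views: int) -> float:
    """Return storage used per million page views, in GB (1 GB = 1e9 bytes)."""
    if total_page_views <= 0:
        raise ValueError("total_page_views must be positive")
    gigabytes = total_db_bytes / 1e9
    millions_of_views = total_page_views / 1e6
    return gigabytes / millions_of_views


def projected_gb(ratio_gb_per_mm: float, expected_page_views: int) -> float:
    """Scale a GB-per-million-PV ratio linearly to an expected page-view count."""
    return ratio_gb_per_mm * expected_page_views / 1e6


# Hypothetical placeholder figures, for illustration only:
# a 5 GB database that has tracked 10 million page views so far.
ratio = gb_per_million_pv(5_000_000_000, 10_000_000)
estimate = projected_gb(ratio, 200_000_000)
```

This assumes storage grows linearly with page views, which may not hold if your install archives or purges old logs, so I'd treat any number from it as a rough estimate.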
Also, I couldn’t find much about running Piwik on Amazon EC2. Is anyone trying that?
Many thanks in advance!