Requirements for High Traffic Site

Hi!

We are testing Piwik on a few of our websites, which currently get about 10k visits per day.

The archiving cron job currently consumes about 1 GB of RAM for processing these websites, and the amount of RAM it needs seems to grow steadily. (We increase the limit every two weeks…)
I followed the tips in the Piwik High-Traffic FAQ, but the memory consumption still seems pretty massive.

If Piwik satisfies our needs, we are thinking about using it on a few more websites, which would result in approx. 20k visits per day.

Are there any best-practice configurations or hardware recommendations for tracking that many users? Can anybody share a configuration that works for a website at this scale? (I’ve read the forum post about high-traffic websites, but it only covers the raw data, not the server configurations.)

Also, I am unsure about one point:

Does deleting old logs lower the memory consumption of the archiver script?
I’ve read on the forum that this will not help, though the FAQ states that deleting them is recommended.

If old logs are deleted automatically, what data will be lost?

I am thankful for any help,

cheers,

mf

PS:

More Detail on current visitor-logs:
Currently: ~10k Visitors / day, ~30k Actions / day
Since Mid-January 2012: 352300 visits, 1032926 actions.
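
For a rough sense of scale, the totals above can be turned into an actions-per-visit ratio and projected onto the planned 20k visits per day. This is only a back-of-the-envelope sketch: it assumes the ratio stays the same on the additional websites, which nothing in this thread confirms, and the function name is made up for illustration.

```python
# Totals quoted above (since mid-January 2012).
VISITS_TOTAL = 352_300
ACTIONS_TOTAL = 1_032_926

# Average number of actions per visit so far (a little under 3).
ACTIONS_PER_VISIT = ACTIONS_TOTAL / VISITS_TOTAL


def project_actions_per_day(visits_per_day, ratio=ACTIONS_PER_VISIT):
    """Estimate daily actions, ASSUMING the actions-per-visit ratio is unchanged."""
    return visits_per_day * ratio


if __name__ == "__main__":
    print(f"actions per visit: {ACTIONS_PER_VISIT:.2f}")
    print(f"projected actions/day at 20k visits: {project_actions_per_day(20_000):.0f}")
```

With hourly archiving, the same numbers also give the average batch per run: about 10,000 / 24 visits today, and twice that at 20k visits per day.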

What is your configuration right now?

  • Public cloud? Dedicated hosting?
  • Do the web server and the database share memory?
  • Are you running Piwik with apache2 or nginx?
  • Do you archive every hour?

With very little tuning, Piwik will scale well beyond your 20k visits.

If you run the archive script every hour, 1 GB seems a bit high to me; there might be something else going on.
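
One way to check whether the archiver really peaks around 1 GB is to measure it directly rather than inferring it from the configured limit. The following is a minimal sketch, assuming a Unix-like server with Python available; `peak_child_rss_mb` is a hypothetical helper, not part of Piwik, and the command you pass in is whatever your cron job already runs.

```python
import resource
import subprocess
import sys


def peak_child_rss_mb(command):
    """Run `command` and return (exit_code, peak resident memory in MiB).

    The figure is the peak of the largest single child process that has
    been waited for, not the sum over all processes.
    """
    completed = subprocess.run(command, check=False)
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # ru_maxrss is reported in KiB on Linux but in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return completed.returncode, peak / divisor
```

Calling it with the archive command as a list of arguments (the same program and arguments used in the crontab entry) and logging the result on each hourly run would show whether the peak really grows over time.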

I am VERY sorry for my late reply; I was really busy over the last few weeks.

Let me answer the questions above:

What is your configuration right now?

  • Intel Xeon L5640 @ 2.27GHz
  • 6 GB RAM

  • Public cloud? Dedicated hosting?
  • Do the web server and the database share memory?

We run Piwik, the database, and some (not all) of our websites on the same dedicated server.

  • Are you running Piwik with apache2 or nginx?

We run it on apache2. Performance in the web interface is good, if that information matters.

  • Do you archive every hour?

Yes, we run it every hour.

We are thinking about moving Piwik to a dedicated server (if necessary), so it would be good to have a known “working” configuration for our sites. Though if it is possible to keep Piwik on the same machine, that would surely be better.

Maybe somebody can also give me a (possible) answer to my questions above:

Does deleting old logs lower the memory consumption of the archiver script?
I’ve read on the forum that this will not help, though the FAQ states that deleting them is recommended.

If old logs are deleted automatically, what data will be lost?

cheers,

mf

Does deleting old logs lower the memory consumption of the archiver script?
It speeds up the MySQL queries a bit, but it is not critical.

If old logs are deleted automatically, what data will be lost?
The old log data is lost, but the archived reports are kept if you have set up the automatic archiving script. See the delete logs FAQ.