10,000 websites and 1,000,000 pageviews a day


(SimonK) #1

Hi Everyone,

We are currently looking at Piwik as a possible replacement for the logfile/webalizer solution that we use to provide stats for around 10,000 websites, which together generate over 1 million pageviews a day.

Has Piwik been tested with this volume of sites/traffic and what sort of hardware could we expect to need to support growth up to at least double those figures?
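For a rough sense of scale, the daily figures above can be turned into an average request rate. This is only a back-of-the-envelope sketch: it assumes one tracking request per pageview and traffic spread evenly over the day, so real peaks would be higher. The function name `average_rate` is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(pageviews_per_day: int) -> float:
    """Average requests per second, assuming one request per pageview
    and perfectly even traffic over 24 hours (an assumption)."""
    return pageviews_per_day / SECONDS_PER_DAY


current = average_rate(1_000_000)   # the figure quoted in this post
doubled = average_rate(2_000_000)   # the "at least double" growth target

print(f"current: {current:.1f} req/s, doubled: {doubled:.1f} req/s")
```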

Any help and advice greatly appreciated!

Regards

Simon


(Matthieu Aubry) #2

I know Piwik has been tested with up to 300k page views and was working OK on a high-end server, but I think 1M might be too much. The main problem will be memory consumption during archiving. There is a ticket to fix this at http://dev.piwik.org/trac/ticket/766, and this one should also help a lot with your setup: http://dev.piwik.org/trac/ticket/1227

I would suggest that you build up traffic slowly (e.g. starting with 100k) and maybe try to make the change in #1227 (and submit the patch) if you can. We are very interested in your experience. Also, making Piwik work at this scale is definitely part of our Piwik 1.0 roadmap, but we haven’t yet started stress testing at this scale ourselves.


(Matthieu Aubry) #3

I committed the fix for ticket 1227, which should help a lot with your setup because you have thousands of websites. Let us know how it’s working for you.


(SimonK) #4

Hi Matthieu,

We’re still exploring this and don’t have anything running yet, but thanks for getting back to me so quickly!

Do you know what sort of spec server was used for the 300k pageviews installation?

Was it running comfortably or reaching the limits of the hardware?

What’s the main bottleneck? Memory, CPU or Disk?

We’re definitely interested in taking this further and would be happy to share our experiences with using the software on this sort of scale!

Regards

Simon


(Matthieu Aubry) #5

The first bottleneck has always been the memory used during the archiving process (PHP memory). The patch to archive.sh should help a lot with this. After that, depending on traffic, other bottlenecks might appear, such as tracking getting slower. We haven’t run tests, so I can’t say for sure; the best way would be to test it. Let us know how it’s working, and when problems appear we’ll definitely want to help fix them!
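To illustrate why archiving site by site can ease memory pressure, here is a minimal sketch of the general idea: each website is processed in its own short-lived process, so memory is released between sites instead of accumulating across thousands of them. This is not the actual archive.sh patch. `archive_command` is a hypothetical stand-in that only prints a message; a real setup would call whatever per-site archiving command the installation provides.

```python
import subprocess
import sys


def archive_command(site_id: int) -> list[str]:
    # Hypothetical placeholder: a trivial child process that prints a line.
    # It stands in for a real per-site archiving command (an assumption).
    return [sys.executable, "-c", f"print('archived site {site_id}')"]


def archive_sites(site_ids):
    """Run one child process per site and return the IDs that failed."""
    failed = []
    for site_id in site_ids:
        result = subprocess.run(
            archive_command(site_id), capture_output=True, text=True
        )
        if result.returncode != 0:
            failed.append(site_id)
    return failed


if __name__ == "__main__":
    # Site IDs here are made up for the example.
    print(archive_sites([1, 2, 3]))
```

Collecting the failed IDs rather than stopping at the first error means one problematic site does not block archiving for the rest.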


(derSpinner) #6

SimonK,

Over the past day (yesterday), summary statistics for the sites tracked by our Piwik:

Visits: 77,385
Pageviews: 699,475

Works fine.
Memory usage is approximately 2.5 GB.

CPU: not critical. A lot of spare capacity remains.
Memory: you need plenty of it.


(Matthieu Aubry) #7

derSpinner, was 2.5GB the memory usage with the new archive.sh patch? I think the new patch http://dev.piwik.org/trac/ticket/1227 would lower this requirement. Let us know!


(derSpinner) #8

Thank you. I have downloaded the file and uploaded it to the server, but I can only report results on Monday. :(


(derSpinner) #10

For some reason, the new script does not process the site with ID = 1.
It starts “counting” straight from the second ID.
What could the problem be?


(Matthieu Aubry) #11

See High traffic Piwik FAQ

