Logging db queries / issues with lots of sites?

Hey folks,

After good results with some sites, I added the remaining 5,800 sites to Piwik.
That doesn’t add many visits, just a lot of minor “sites”.
Since then, the database server has been under high load whenever the hourly archive cron job runs.

Is there any way to log only the queries run during archiving, to see what’s going on?
Is a large number of sites known to cause problems for archiving?

By the way, this is on 0.5.4. I will try 0.5.5 tonight.

Thanks in advance,

Thomas

Looking at the top output, I found that it’s PHP causing the high load: the process runs for 25 minutes and uses 8GB of memory.
Is that expected? Is there anything I could do about it?

I just upgraded to 0.5.5 and the same issue remains.
The funny thing is that archiving makes the MySQL database run at around 1k queries per second.

I still doubt that this is the intended behaviour.
Shouldn’t it only process new visits and add those to the database? To me it looks like it’s going through all days and months of all existing sites.
I haven’t looked at the code yet, but I might once I find time for it.

I’m open to any suggestions regarding this issue.

thomas

Thomas,
We haven’t “officially” tested Piwik with more than 1,000 websites yet, though I know some users have done it. The high memory usage in PHP is caused by memory leaks during archiving; see http://dev.piwik.org/trac/ticket/766

In particular, we should provide a way to trigger each website’s archiving in a separate HTTP request; the archive.sh job should be updated to do this. I created a ticket at: http://dev.piwik.org/trac/ticket/1227
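A minimal sketch of the idea, not the actual archive.sh change: launch one short-lived process per website and period, so whatever memory a run leaks is released when that process exits. It uses separate command-line processes rather than HTTP requests. The script name `archive_one_site.php` and its `--idsite` / `--period` flags are hypothetical placeholders, not real Piwik options, and the site IDs are made up.

```python
import subprocess
from typing import Callable, Iterable, List, Sequence

# Hypothetical command template. The script name and flags are placeholders
# for illustration only; the real archive.sh invocation may look different.
DEFAULT_TEMPLATE = (
    "php",
    "archive_one_site.php",
    "--idsite={site_id}",
    "--period={period}",
)


def build_commands(
    site_ids: Iterable[int],
    periods: Sequence[str] = ("day", "week", "month", "year"),
    template: Sequence[str] = DEFAULT_TEMPLATE,
) -> List[List[str]]:
    """Return one argv list per (site, period) pair."""
    return [
        [part.format(site_id=site_id, period=period) for part in template]
        for site_id in site_ids
        for period in periods
    ]


def run_commands(
    commands: Iterable[List[str]],
    runner: Callable = subprocess.run,
) -> List[List[str]]:
    """Run each command in its own process, one after another.

    Every command starts a fresh process, so memory leaked while archiving
    one site cannot accumulate across sites. Returns the commands that
    exited with a non-zero status so they can be retried or reported.
    """
    failed = []
    for argv in commands:
        result = runner(argv)
        if result.returncode != 0:
            failed.append(argv)
    return failed
```

Calling `run_commands(build_commands([1, 2, 3]))` would run twelve separate processes in sequence (three sites times four periods). The `runner` parameter exists only so the loop can be exercised without PHP installed.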

Stay tuned on the ticket. I’ll probably take a look at it very soon, and you can then try the change out and see if it fixes the issue. Making Piwik work with your kind of setup is very important for us!

Matthieu,

Thanks a lot. I subscribed to that ticket!
Triggering multiple HTTP requests could help a lot in my environment: PHP is distributed over multiple servers (with nginx as a frontend running on another server), so the archiving would be spread across multiple servers too.

Regards,

thomas

Actually, it won’t trigger multiple HTTP requests, my bad, as the script is using PHP CLI.

Thomas,
I committed the patch for the ticket to trigger multiple requests for each website. Can you try it out on your setup and see how it helps? http://dev.piwik.org/trac/changeset/2025

Let me know in the ticket comments if that solved it for you. Thanks.