High traffic Piwik server

Hello Piwik team,

First of all, I would like to thank the Piwik team for their great work.

We are having serious performance issues with our Piwik installation:

  • We can't see graphs in the “All Websites” listing. We get an error in the browser console that we can't fix: “Resource interpreted as Image but transferred with MIME type text/html:”

  • We are not able to filter the “All Websites” listing by a date range that spans days in different months. We get an “Oops… there was a problem during the request. Maybe the ser…” error. In fact, we have never used the “Date Range” option, because we always get the same red message.

Our technical info:

  • More than 100 sites

  • Database usage:

    • Report tables: Total 126.5 G 12.4 G 160,593,894
    • Reports: Total 100,204,527 45.8 G
    • Trackers: Total 1.8 G 811.7 M 17,242,763
    • Metric tables: Total 134 M 264.9 M 989,557
    • Metrics: Total 795,592 361.8 M
    • Other tables: Total 795,592 361.8 M

Something looks strange in the report tables: two months (2014_01 and 2014_06) are far larger than the others:

pwk_archive_blob_2013_11 403 M 48.7 M 749,407
pwk_archive_blob_2013_12 412 M 47.7 M 961,210

pwk_archive_blob_2014_01 73.5 G 6.9 G 73,876,651

pwk_archive_blob_2014_02 475 M 51.8 M 1,196,791
pwk_archive_blob_2014_03 596 M 63.8 M 1,409,588
pwk_archive_blob_2014_04 534 M 56.8 M 1,126,969
pwk_archive_blob_2014_05 562 M 59.8 M 1,336,949

pwk_archive_blob_2014_06 42 G 4.1 G 65,830,672

pwk_archive_blob_2014_07 1.9 G 269 M 1,365,290
pwk_archive_blob_2014_08 2 G 212 M 2,771,651

Is there any script to simplify/clean those months?
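
To make the size difference concrete, here is a minimal sketch that parses the listing above and flags the tables whose first size column is far above the median. It assumes the three columns are data size, index size and row count (the listing does not label them), and the 10x threshold is an arbitrary choice for illustration. It only reads the pasted text; it does not touch the database or clean anything.

```python
import re
from statistics import median

# Table listing copied from the post above. The column meanings
# (data size, index size, row count) are an assumption.
LISTING = """\
pwk_archive_blob_2013_11 403 M 48.7 M 749,407
pwk_archive_blob_2013_12 412 M 47.7 M 961,210
pwk_archive_blob_2014_01 73.5 G 6.9 G 73,876,651
pwk_archive_blob_2014_02 475 M 51.8 M 1,196,791
pwk_archive_blob_2014_03 596 M 63.8 M 1,409,588
pwk_archive_blob_2014_04 534 M 56.8 M 1,126,969
pwk_archive_blob_2014_05 562 M 59.8 M 1,336,949
pwk_archive_blob_2014_06 42 G 4.1 G 65,830,672
pwk_archive_blob_2014_07 1.9 G 269 M 1,365,290
pwk_archive_blob_2014_08 2 G 212 M 2,771,651
"""

UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
ROW = re.compile(
    r"^(\S+)\s+([\d.]+)\s+([KMG])\s+([\d.]+)\s+([KMG])\s+([\d,]+)$"
)


def parse_listing(text):
    """Return (table, first_size_bytes, second_size_bytes, rows) tuples."""
    parsed = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue  # skip blank or unrecognised lines
        name, v1, u1, v2, u2, count = match.groups()
        parsed.append((
            name,
            float(v1) * UNITS[u1],
            float(v2) * UNITS[u2],
            int(count.replace(",", "")),
        ))
    return parsed


def find_outliers(rows, factor=10):
    """Names of tables whose first size column exceeds factor x the median."""
    typical = median(row[1] for row in rows)
    return [row[0] for row in rows if row[1] > factor * typical]


outliers = find_outliers(parse_listing(LISTING))
print(outliers)
```

With the figures above, this singles out the same two months we noticed by eye, while the 2014_07 and 2014_08 tables stay under the threshold.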

  • Number of daily page views: ~ 15.5 M

  • The archived logs go back to 2010

  • The server OS is Ubuntu 14.04, on an 8 core Xeon E5-1620 with 64 GB RAM and a 600 GB SSD in hardware RAID 1.

  • We are using PHP 5.5.16 and MySQL 5.6.19, and the database tables are InnoDB.

  • Our cron job archives using the updated Piwik command line tool, console core:archive, and it takes 4 minutes

  • We currently have the latest version of Piwik: 2.5
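
For a sense of scale, the daily page view figure above can be turned into an average tracking rate. This is only a rough sketch: it assumes one tracking request per page view and traffic spread evenly over the day, so real peaks will be higher.

```python
# Daily page views as stated in the list above (approximate).
DAILY_PAGE_VIEWS = 15_500_000
SECONDS_PER_DAY = 24 * 60 * 60


def average_requests_per_second(daily_page_views):
    """Average rate, assuming one request per page view and even traffic."""
    return daily_page_views / SECONDS_PER_DAY


average_rate = average_requests_per_second(DAILY_PAGE_VIEWS)
print(f"about {average_rate:.0f} tracking requests per second on average")
```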

Any suggestions to improve this environment?

We think Piwik is a great app for us, but it is a bit slow compared with Google Analytics.
We have been developing our SaaS framework, called Opennemas, for the last 5 years, and we use Piwik every day.

We always ask ourselves the same question: why is Google Analytics so fast compared with our Piwik? I think our hardware performs well, but what more do we need to get results as quickly as Analytics? Maybe big data software in Piwik?

Thanks in advance people!