Packets lost during tracking (High Traffic)


I am using Piwik on “high”-traffic websites, around 350k visits per day.

I’ve run into an issue I can’t solve. Sometimes I get the error below, which makes me lose some data. It happened a lot when my infrastructure was “poor”, but it still happens now that my infrastructure should be more than enough. I hope I can get some help here, and maybe help solve others’ problems later.

PHP Warning: mysqli_connect(): (HY000/2013): Lost connection to MySQL server at 'reading initial communication packet', system error: 0 in [...]/Mysql.php on line 75

I did a lot of research. I tried upgrading the RAM of my database, setting PHP memory_limit to -1, and changing the wait_timeout and max_allowed_packet values. I also upgraded my infrastructure:

I was using Google Cloud with:

  • 1x Load Balancer
  • 2x Google Compute Engine n1-standard-1 to handle tracking calls
  • 1x Google Compute Engine n1-standard-1 to do archiving & use dashboard
  • 1x Google Cloud SQL 1G RAM for the piwik database hosting

I had a lot of data loss (about 29k requests in 1 day), so I upgraded to:

  • 1x Load Balancer
  • 5x Google Compute Engine n1-standard-2 to handle tracking calls
  • 1x Google Compute Engine n1-standard-4 to do archiving & use dashboard (I have issues with archiving on websites with 220K+ visits/day, but that’s a separate, less important problem for the moment)
  • 1x Google Cloud SQL 4G RAM for the piwik database hosting

I still get some packet loss now and then. It is a lot less than before, but I’m not sure I fixed the problem; maybe traffic has just been less intense these last few days. Still, this infrastructure should be “overkill”, so either the upgrade should have solved the issue completely, or capacity was not the main problem.
Since I continue to see errors in my logs, about 10 per day, I have to find the real cause before downsizing my infrastructure to what it actually needs.
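
To put numbers on this, here is a minimal Python sketch of how the errors could be counted per hour from the PHP error log. It assumes PHP's default error_log line format (a bracketed timestamp at the start of each line), which may not match every setup; the function name count_errors_per_hour and the regular expression are hypothetical.

```python
import re
from collections import Counter
from datetime import datetime

# Assumption: lines look like PHP's default error_log output, e.g.
# [12-Mar-2018 14:05:33 UTC] PHP Warning:  mysqli_connect(): (HY000/2013): Lost connection ...
# Adjust TIMESTAMP_RE if your web server writes a different prefix.
TIMESTAMP_RE = re.compile(r"^\[(\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2})[^\]]*\]")

# Text taken from the warning quoted above.
ERROR_MARKER = "Lost connection to MySQL server at 'reading initial communication packet'"


def count_errors_per_hour(lines):
    """Return a Counter mapping 'YYYY-MM-DD HH:00' to the number of matching errors."""
    counts = Counter()
    for line in lines:
        if ERROR_MARKER not in line:
            continue  # some other warning or notice
        match = TIMESTAMP_RE.match(line)
        if match is None:
            continue  # timestamp not in the assumed format
        stamp = datetime.strptime(match.group(1), "%d-%b-%Y %H:%M:%S")
        counts[stamp.strftime("%Y-%m-%d %H:00")] += 1
    return counts
```

It could be fed an open log file, for example `count_errors_per_hour(open(path))`, where path points to wherever the PHP error log lives on each tracking server.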

If anybody has any clue about how to solve this problem, I would appreciate it a lot.


Hmmm, that seems strange. Unfortunately I don’t have a full solution, but I agree that the decrease in errors could be due to a drop in traffic.

Have you been able to draw any firm numerical correlation between the packet loss and heavy traffic?
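
If it helps, here is a rough Python sketch of one way to check that: compute the Pearson correlation between tracking requests per hour and connection errors per hour. How you collect the two series is up to you (web server access logs and the PHP error log would be one option); the function name pearson is just illustrative, and nothing here is specific to Piwik.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of the same length, with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance numerator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)
```

You would call it as `pearson(requests_per_hour, errors_per_hour)` with one value per hour in each list, aligned on the same hours. A value close to 1 would suggest the errors follow the load; a value near 0 would point towards something other than raw traffic, such as the network path to the database or connection limits. With only about 10 errors a day the result will be noisy, so it is worth using several days of data.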