Random segment data not on reports


#1

About a week ago everything seemed fine. We are evaluating Piwik to see if it will work for our needs and are tracking 4M+ hits a day across 5 sites in total.

One site has 34 segments to break up the data for different sections of the site (www.blah.com/one/, www.blah.com/two/, www.blah.com/three/, etc.), and all of a sudden random parts of random segments in the reports just don't work. Some segments have no data for random days, and when I pull up the segment statistics Piwik tries to run a live query (which takes around an hour just to query one day).

Does anyone have any idea why it may do this? I get no errors at all when importing or archiving data every day. I tried reimporting a month of data and that month still has the same issue: random chunks of data just aren't there. For example, it may have the segment data for November 15th for www.blah.com/one/ but not www.blah.com/two/, yet it will have data for November 16th for www.blah.com/two/ and not www.blah.com/one/.
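To pin down exactly which segment/day combinations are missing, a small helper along these lines might be useful. This is only a rough sketch: `missing_archives`, the `present` set, and the year 2013 are my own assumptions for illustration, and it says nothing about how Piwik stores archives. You would have to fill `present` yourself from whatever source you trust (for example, your own queries against the database).

```python
from datetime import date, timedelta


def missing_archives(present, segments, start, end):
    """Return (segment, day) pairs in [start, end] that have no archived report.

    `present` is a set of (segment, day) pairs known to have data.
    How that set is collected is left to the reader (hypothetical input).
    """
    days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    return [
        (segment, day)
        for segment in segments
        for day in days
        if (segment, day) not in present
    ]


if __name__ == "__main__":
    # Mirrors the example above; the year is an assumption.
    present = {
        ("www.blah.com/one/", date(2013, 11, 15)),
        ("www.blah.com/two/", date(2013, 11, 16)),
    }
    segments = ["www.blah.com/one/", "www.blah.com/two/"]
    for segment, day in missing_archives(
        present, segments, date(2013, 11, 15), date(2013, 11, 16)
    ):
        print(f"missing: {segment} on {day.isoformat()}")
```

Running something like this after each nightly archive would at least show whether the gaps follow a pattern (same days, same segments) or really are random.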

I am at my wits' end trying to figure out what is wrong. Is it a MySQL limitation? Is it PHP? I am getting no error codes at all from anything. I just look at my system and see MySQL using up all kinds of resources when it looks for the missing segmentation data, and if I run a show processlist; query in the DB, it shows it churning away hard doing prepares and other queries.

I do daily reprocessing of my reports.


(Peterbo) #2

Hi,

4 million actions per day is quite a lot. It seems the archiving process doesn’t have enough server resources on the days where no data was archived. Please have a look at your error logs and change your settings accordingly.

The same applies to the segmentation data. Please make sure you enable preprocessing for your segments of interest. Then your segments don’t have to be calculated on the fly (which takes very long at that scale). See the option "segmented reports are pre-processed (faster, requires archive.php cron)" when setting up your segment in the segment editor.

At that scale, you should run in an optimized environment - please contact our professional services department at services@piwik.org if you need assistance.


#3

Hi Peter,

I am preprocessing reports for all data, even the segment data. My issue is that random data just disappears after I preprocess. This server is dedicated to Piwik: the MySQL database has 10GB of RAM and Piwik/PHP has 8GB.

I can’t find any pattern at all in which data isn’t getting reports generated. As I said, it is random segments on random days. I create reports for everything daily at about 2:30 AM and it is all finished by 5 or 6 AM. I have tuned the crap out of the DB and PHP, but I don’t think it is a “tuning” issue because there are no other problems with slowness or anything.


(Matthieu Aubry) #4

Data that disappears: can you set a higher timeout in Settings > General Settings?

Are you using 2.0 beta? We fixed so many bugs around this, so please upgrade; it may really work better there! :)


#5

Matt, I am testing this for use in a production environment. A beta version simply won’t do for this purpose; it must be a stable, tested build. My timeout is fine and the cron archiving completes within 3 or 4 hours without any errors. It just seems Piwik doesn’t like tracking so many segments, and as time goes on it gets worse and worse: more days from more segments just not getting reports created. I can manually query data in the database when I log in, but API calls are triggering rearchiving even though I have browser-triggered archiving disabled:

browser_archiving_disabled_enforce = 1
enable_browser_archiving_triggering = 0

It seems both of these are being ignored. This wouldn’t be nearly so bad if it tried to pull up the data and just told me it didn’t have any, but instead it bogs down the server, costing precious cycles and taking 2 or so hours to pull up a segment that has about 100 hits.
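To rule out the settings simply not being read, one quick sanity check is to parse the config file and print the two values as an INI parser sees them. The sketch below is only that, a sketch: `read_archiving_flags` is a made-up helper, and it assumes the two settings sit under a `[General]` section of an INI-style config file. It does not tell you what Piwik itself does with the values.

```python
import configparser

FLAGS = (
    "browser_archiving_disabled_enforce",
    "enable_browser_archiving_triggering",
)


def read_archiving_flags(ini_text):
    """Return the two archiving flags from INI text, or None where absent.

    Assumes the flags live in a [General] section (an assumption here).
    strict=False tolerates repeated keys such as array-style entries.
    """
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section("General"):
        return {name: None for name in FLAGS}
    general = parser["General"]
    return {name: general.get(name) for name in FLAGS}


if __name__ == "__main__":
    # Inline sample; in practice you would read your own config file's text.
    sample = (
        "[General]\n"
        "browser_archiving_disabled_enforce = 1\n"
        "enable_browser_archiving_triggering = 0\n"
    )
    for name, value in read_archiving_flags(sample).items():
        print(f"{name} = {value}")
```

If either value comes back as `None`, the setting is probably in the wrong section or misspelled, which would explain it appearing to be ignored.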


(Matthieu Aubry) #6

1.12 is a stable build, but in fact it is much less stable than 2.0-beta. We have worked on it for the last 6 months, trust me. It’s amazing and fixes so many bugs. Please try the beta; otherwise we can’t comment on bugs (but we are still happy to talk).