has anyone had experience with Piwik on MySQL 5.5.x?
I’m actually using MariaDB 5.5.x but it should be similar.
I don’t see any worse behaviour in the tracking process but my archiving time more than doubled from 250 minutes to 650 minutes. Looking at my munin output I see that the number of temporary tables (which is regularly very high during archiving) has gone down as if it can’t handle as many temporary tables per second as before.
I don’t think that “more memory” is the solution. When the problem started I changed the innodb_buffer_pool from 2GB to 9GB (my tables are in innodb for backup and scalability reasons) and it didn’t give any help.
Temporary tables created is just the mysql variable Created_tmp_tables over time (5 minute interval to be exact).
For now I can just assume that the best performing mysql for piwik in my case was mysql-5.1 but I already upgraded completely :-(.
As a followup.
I see the following query (with different dates and idsites) a lot of time in the processlist when archiving runs.
It takes 0.13 - 0.15 seconds most of the time as you can see below.
But if I remove the DESC on the sorting its running at 0.00 seconds also seen below (repeatable).
As that query returns only 1 item anyway - why is a sorting needed at all? Could it be removed for higher archiving performance?
SELECT idarchive, value, name, date1 as startDate FROM piwik_archive_numeric_2012_04 WHERE idsite = '4982' AND date1 = '2012-04-03' AND date2 = '2012-04-03' AND period = '1' AND ( (name = 'done' AND value = 1) OR (name = 'done' AND value = 3) OR name = 'nb_visits') AND ts_archived >= '2012-04-03 21:59:59' ORDER BY ts_archived DESC
-> ;
+-----------+-------+------+------------+
| idarchive | value | name | startDate |
+-----------+-------+------+------------+
| 106890 | 1 | done | 2012-04-03 |
+-----------+-------+------+------------+
1 row in set (0.13 sec)
SELECT idarchive, value, name, date1 as startDate FROM piwik_archive_numeric_2012_04 WHERE idsite = '4982' AND date1 = '2012-04-03' AND date2 = '2012-04-03' AND period = '1' AND ( (name = 'done' AND value = 1) OR (name = 'done' AND value = 3) OR name = 'nb_visits') AND ts_archived >= '2012-04-03 21:59:59' ORDER BY ts_archived;
+-----------+-------+------+------------+
| idarchive | value | name | startDate |
+-----------+-------+------+------------+
| 106890 | 1 | done | 2012-04-03 |
+-----------+-------+------+------------+
1 row in set (0.00 sec)
Edit: Maybe I can answer the question already from looking at the code: as the function is quite generic and used for lots of different periods and it uses the order to get the most recent entry - no, it can’t just be removed, bummer :(.
How many entries should I expect?
The numbers are similar to the other tables seen in the db - check the attached screenshot from phpMyAdmin.
Btw. for testing I changed the code in isArchived to the following - assuming that it would return only one row for daily archiving for one day:
$sqlQuery = " SELECT idarchive, value, name, date1 as startDate
FROM ".$this->tableArchiveNumeric->getTableName()."
WHERE idsite = ?
AND date1 = ?
AND date2 = ?
AND period = ?
AND ( (name = '".$done."' AND value = ".Piwik_ArchiveProcessing::DONE_OK.")
OR (name = '".$done."' AND value = ".Piwik_ArchiveProcessing::DONE_OK_TEMPORARY.")
$sqlSegmentsFindArchiveAllPlugins
OR name = 'nb_visits')
$timeStampWhere";
// updated TS77
// we are guessing here with date1 = date2 and period being 1 (for day) that it returns one row only anyway
if($bindSQL[1] != $bindSQL[2] || $bindSQL[3] != 1) {
$sqlQuery.=" ORDER BY ts_archived DESC";
}
That brought the archiving time for my 7k sites down to 250 minutes again.
Thanks for the suggestion.
I’m gonna try it with the archiving run next night.
From the manual test it also returns in 0.00 seconds but explain shows a filesort:
+------+-------------+-------------------------------+-------+-------------------------------------------------+---------------------------+---------+------+------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------------------------------+-------+-------------------------------------------------+---------------------------+---------+------+------+----------------------------------------------------+
| 1 | SIMPLE | piwik_archive_numeric_2012_04 | range | index_idsite_dates_period,index_period_archived | index_idsite_dates_period | 24 | NULL | 1 | Using index condition; Using where; Using filesort |
+------+-------------+-------------------------------+-------+-------------------------------------------------+---------------------------+---------+------+------+----------------------------------------------------+
That could give problems with a larger resultset. Probably as its not in the same index - but for innodb that index is a secondary index so it always have to do two index lookups (also for the original problem).