How to improve DB performance due to slow queries via core:archive cronjob?


#1

I’m the DBA for a piwik installation. My ultimate question is: what can I do to improve performance and reduce CPU utilization? Every 2 hours, we run

/usr/bin/php /var/www/html/piwik/console core:archive --concurrent-requests-per-website=8  --url=http://localhost/pwk 

Corresponding to this cronjob, we get numerous slow queries, ranging from 2 to 11 seconds, average 5 seconds, on a very fast 8-CPU host with plenty of memory. Here is a sample query:

 SELECT count(distinct log_inner.idvisitor) AS `1`,  count(distinct log_inner.user_id) AS `39` FROM  (  SELECT log_visit.idvisitor,  log_visit.user_id FROM piwik_log_visit AS log_visit LEFT JOIN piwik_log_link_visit_action AS log_link_visit_action ON log_link_visit_action.idvisit = log_visit.idvisit WHERE ( log_visit.visit_last_action_time >= '2015-10-31 23:00:00' AND log_visit.visit_last_action_time <= '2015-11-30 22:59:59' AND log_visit.idsite IN ('6') ) AND ( (log_visit.visitor_returning = '1' OR log_visit.visitor_returning = '2') AND ( log_link_visit_action.idaction_url IN (SELECT idaction FROM piwik_log_action WHERE ( name LIKE CONCAT('%', 'OURSITE/DEPARTMENT', '%')  AND type = 1 )) ) ) GROUP BY log_visit.idvisit ORDER BY NULL ) AS log_inner;

When I run mysql’s explain extended, I get the following:

| id | select_type | table                 | partitions | type   | possible_keys                                                                     | key                          | key_len | ref                                      | rows    | filtered | Extra                                               |
+----+-------------+-----------------------+------------+--------+-----------------------------------------------------------------------------------+------------------------------+---------+------------------------------------------+---------+----------+-----------------------------------------------------+
|  1 | PRIMARY     | <derived2>            | NULL       | ALL    | NULL                                                                              | NULL                         | NULL    | NULL                                     |    6812 |   100.00 | NULL                                                |
|  2 | DERIVED     | log_visit             | NULL       | ref    | PRIMARY,index_idsite_config_datetime,index_idsite_datetime,index_idsite_idvisitor | index_idsite_config_datetime | 4       | const                                    | 1063958 |     2.11 | Using index condition; Using where; Using temporary |
|  2 | DERIVED     | log_link_visit_action | NULL       | ref    | index_idvisit                                                                     | index_idvisit                | 4       | piwik.log_visit.idvisit                  |       5 |   100.00 | Using where                                         |
|  2 | DERIVED     | piwik_log_action      | NULL       | eq_ref | PRIMARY,index_type_hash                                                           | PRIMARY                      | 4       | piwik.log_link_visit_action.idaction_url |       1 |     5.55 | Using where                                         |
+----+-------------+-----------------------+------------+--------+-----------------------------------------------------------------------------------+------------------------------+---------+------------------------------------------+---------+----------+-----------------------------------------------------+

piwik_log_link_visit_action has 11429730 rows. piwik_log_action has 164607 rows. piwik_log_visit has 2159232 rows. So, is there something I can do here at the database level?