VisitsSummary is very slow. How to speed up?

I need to do this API query for ~10k different segments (they are all pageUrl==http://example.org/some/url ):

  module: "API",
  method: "VisitsSummary.getUniqueVisitors",
  token_auth: "*******************",
  period: "month",
  date: "2013-04-01,today",
  idSite: 1,
  filter_limit: -1,
  segment: "pageUrl=="+URL,
  format: "json"

But this takes a VERY long time to run. I’ve run this with console climulti:request to avoid timeouts. Once it completes, it’s quite fast to run the same query. Looking at the debug logs, I see that Piwik archives the query for each day, week and month in the period.

DEBUG VisitsSummary[2016-08-31 20:26:49] week archive, idSite = 1 (definitive archive), segment pageUrl==http://example.org/some/url', report = 'VisitsSummary', UTC datetime [2016-08-01 03:00:00 -> 2016-08-08 02:59:59]

I have two questions related to this:

  1. Do this queries stay archived indefinetely? Or is there a limit to the amount of different segments that get archived? I need to do this queries continuously, and I need them to run fast.

  2. Is there something I can do to speed up this queries? Even if this queries do stay archived forever, it will take several weeks to run them for all the URLs that I care about.

Regarding this last question, I activated MySQL’s slow query log. There are many of this queries:

# Time: 160831 17:34:51
# User@Host: piwik[piwik] @ localhost [127.0.0.1]
# Thread_id: 2505  Schema: piwik  QC_hit: No
# Query_time: 2.238859  Lock_time: 0.000110  Rows_sent: 1  Rows_examined: 9734449
SET timestamp=1472675691;
SELECT /* Core */
				count(distinct log_inner.idvisitor) AS `1`, 
			count(*) AS `2`, 
			sum(log_inner.visit_total_actions) AS `3`, 
			max(log_inner.visit_total_actions) AS `4`, 
			sum(log_inner.visit_total_time) AS `5`, 
			sum(case log_inner.visit_total_actions when 1 then 1 when 0 then 1 else 0 end) AS `6`, 
			sum(case log_inner.visit_goal_converted when 1 then 1 else 0 end) AS `7`, 
			count(distinct log_inner.user_id) AS `39`
			FROM
				
        (
            
			SELECT
				log_visit.idvisitor, 
log_visit.visit_total_actions, 
log_visit.visit_total_time, 
log_visit.visit_goal_converted, 
log_visit.user_id
			FROM
				piwik_log_visit AS log_visit
				LEFT JOIN piwik_log_link_visit_action AS log_link_visit_action ON log_link_visit_action.idvisit = log_visit.idvisit
			WHERE
				( log_visit.visit_last_action_time >= '2013-10-10 03:00:00'
				AND log_visit.visit_last_action_time <= '2013-10-11 02:59:59'
				AND log_visit.idsite IN ('1') )
                AND
                ( log_link_visit_action.idaction_url = '1029' )
			GROUP BY
				log_visit.idvisit
			ORDER BY
				NULL
        ) AS log_inner;

Is there any index I can add to the database to speed things up?

OK, so… adding and index to idaction_url column in log_link_visit_action table helped a lot. :slight_smile:

1 Like

Indeed,

CREATE INDEX `idaction_url` ON matomo_log_link_visit_action (idaction_url);

worked for us too.

We also added :

CREATE INDEX `index_idsite_idvisitor_user_id` ON matomo_log_visit (idsite, idvisitor, user_id, visit_last_action_time DESC)