I need to do this API query for ~10k different segments (they are all pageUrl==http://example.org/some/url
):
module: "API",
method: "VisitsSummary.getUniqueVisitors",
token_auth: "*******************",
period: "month",
date: "2013-04-01,today",
idSite: 1,
filter_limit: -1,
segment: "pageUrl=="+URL,
format: "json"
But this takes a VERY long time to run. I’ve run this with console climulti:request
to avoid timeouts. Once it completes, it’s quite fast to run the same query. Looking at the debug logs, I see that Piwik archives the query for each day, week and month in the period.
DEBUG VisitsSummary[2016-08-31 20:26:49] week archive, idSite = 1 (definitive archive), segment pageUrl==http://example.org/some/url', report = 'VisitsSummary', UTC datetime [2016-08-01 03:00:00 -> 2016-08-08 02:59:59]
I have two questions related to this:
-
Do this queries stay archived indefinetely? Or is there a limit to the amount of different segments that get archived? I need to do this queries continuously, and I need them to run fast.
-
Is there something I can do to speed up this queries? Even if this queries do stay archived forever, it will take several weeks to run them for all the URLs that I care about.
Regarding this last question, I activated MySQL’s slow query log. There are many of this queries:
# Time: 160831 17:34:51
# User@Host: piwik[piwik] @ localhost [127.0.0.1]
# Thread_id: 2505 Schema: piwik QC_hit: No
# Query_time: 2.238859 Lock_time: 0.000110 Rows_sent: 1 Rows_examined: 9734449
SET timestamp=1472675691;
SELECT /* Core */
count(distinct log_inner.idvisitor) AS `1`,
count(*) AS `2`,
sum(log_inner.visit_total_actions) AS `3`,
max(log_inner.visit_total_actions) AS `4`,
sum(log_inner.visit_total_time) AS `5`,
sum(case log_inner.visit_total_actions when 1 then 1 when 0 then 1 else 0 end) AS `6`,
sum(case log_inner.visit_goal_converted when 1 then 1 else 0 end) AS `7`,
count(distinct log_inner.user_id) AS `39`
FROM
(
SELECT
log_visit.idvisitor,
log_visit.visit_total_actions,
log_visit.visit_total_time,
log_visit.visit_goal_converted,
log_visit.user_id
FROM
piwik_log_visit AS log_visit
LEFT JOIN piwik_log_link_visit_action AS log_link_visit_action ON log_link_visit_action.idvisit = log_visit.idvisit
WHERE
( log_visit.visit_last_action_time >= '2013-10-10 03:00:00'
AND log_visit.visit_last_action_time <= '2013-10-11 02:59:59'
AND log_visit.idsite IN ('1') )
AND
( log_link_visit_action.idaction_url = '1029' )
GROUP BY
log_visit.idvisit
ORDER BY
NULL
) AS log_inner;
Is there any index I can add to the database to speed things up?