Past Goal/Funnel report generation ends up in infinite loop

Hello,

We are facing a lot of issues with new Funnel creation triggering past report generation, as mentioned in 2 previous topics:


In top of those 2 issues, we are facing one more:
A new Funnel was created on March 16th at 4:30pm, and past reports generation were triggered during the next cron job.
But some of past report generation for the new Funnel were still running after more than 23h:


You can see that the cron job started at 5pm on March 16th, and was still running at 4pm on March 17th, when we killed it manually.
(previous spikes concern the regular cron job, taking less than 20 minutes)

Here is the logs of the Funnel report that took more that 10 seconds:


Also, keep in ming that the most costly queries are not logged, because they never finished.

We monitored on MaridDB instance to capture the faulty queries, and we could observe that there seem to be an infinite loop:

SELECT  /*+ MAX_EXECUTION_TIME(7200000) */ 
        log_action1.name AS `label`, 
count(distinct log_link_visit_action.idvisit) AS `2`, 
count(distinct log_link_visit_action.idvisitor) AS `1`, 
count(*) AS `3`, 
AVG(log_link_visit_action.product_price) AS `avg_price_viewed`
FROM
        matomo_logtmpsegmentf4938b7b1ce8d76fac6ed880e4f827af AS logtmpsegmentf4938b7b1ce8d76fac6ed880e4f827af INNER JOIN matomo_log_link_visit_action AS log_link_visit_action ON log_link_visit_action.idvisit = logtmpsegmentf4938b7b1ce8d76fac6ed880e4f827af.idvisit LEFT JOIN matomo_log_action AS log_action1 ON log_action1.idaction = log_link_visit_action.idaction_product_sku
WHERE
        log_link_visit_action.idaction_product_sku is not null
GROUP BY
        log_action1.name
		
SELECT  /*+ MAX_EXECUTION_TIME(7200000) */ 
        log_action1.name AS `label`, 
count(distinct log_link_visit_action.idvisit) AS `2`, 
count(distinct log_link_visit_action.idvisitor) AS `1`, 
count(*) AS `3`, 
AVG(log_link_visit_action.product_price) AS `avg_price_viewed`
FROM
        matomo_logtmpsegmenta04727f7f3b247334d7731c5b6a8cc96 AS logtmpsegmenta04727f7f3b247334d7731c5b6a8cc96 INNER JOIN matomo_log_link_visit_action AS log_link_visit_action ON log_link_visit_action.idvisit = logtmpsegmenta04727f7f3b247334d7731c5b6a8cc96.idvisit LEFT JOIN matomo_log_action AS log_action1 ON log_action1.idaction = log_link_visit_action.idaction_product_cat4
WHERE
        log_link_visit_action.idaction_product_cat4 is not null
GROUP BY
        log_action1.name
		
SELECT  /*+ MAX_EXECUTION_TIME(7200000) */ 
        log_action1.name AS `label`, 
count(distinct log_link_visit_action.idvisit) AS `2`, 
count(distinct log_link_visit_action.idvisitor) AS `1`, 
count(*) AS `3`, 
AVG(log_link_visit_action.product_price) AS `avg_price_viewed`
FROM
        matomo_logtmpsegmentd038465828697fee7478bc6aeac6c716 AS logtmpsegmentd038465828697fee7478bc6aeac6c716 INNER JOIN matomo_log_link_visit_action AS log_link_visit_action ON log_link_visit_action.idvisit = logtmpsegmentd038465828697fee7478bc6aeac6c716.idvisit LEFT JOIN matomo_log_action AS log_action1 ON log_action1.idaction = log_link_visit_action.idaction_product_sku
WHERE
        log_link_visit_action.idaction_product_sku is not null
GROUP BY
        log_action1.name 

You can notice that we check a second time log_link_visit_action.idaction_product_sku is not null on a different temporary table. This keep going indefinitely simply changing the temporary table name, and switching WHERE clause between

log_link_visit_action.idaction_product_cat is not null
log_link_visit_action.idaction_product_cat2 is not null
log_link_visit_action.idaction_product_cat3 is not null
log_link_visit_action.idaction_product_cat4 is not null
log_link_visit_action.idaction_product_cat5 is not null
log_link_visit_action.idaction_product_name is not null
log_link_visit_action.idaction_product_sku is not null
...

Could you please assess what could be the root cause of this issue?

FYI, we are using Matomo 4.2.1 and Funnel 4.0.5

Thanks and Regards

Matomo 4 by default will process the last 6 months for data for newly created reports. You can set this to zero in your config: https://github.com/matomo-org/matomo/blob/4.x-dev/config/global.ini.php#L770

Hi, thanks for the fast answer.
I tried to use rearchive_reports_in_past_last_n_months with:
rearchive_reports_in_past_last_n_months = last0

But them the cron job fails with
Failed to create invalidations for report re-archiving (idSites = ["1"], pluginName = Funnels, report = 30, startDate = ): Error: Call to a member function isEarlier() on null in /var/www/html/core/Archive/ArchiveInvalidator.php:494

I’ve been reading the code the bit,

$lastNMonthsToInvalidate will be equal to 0, and empty(0) is true, meaning getEarliestDateToRearchive() will return null instead of the current month, which then lead to the exception.

Also, could be good to update the documentation, to mention the past report generation :slight_smile:

Concerning the infinite loop, I’m still a bit concern, because during the first cron job on the first day of a month, some Funnel reports are generated. Could you please double check what could be the root cause?

Thanks and Regards

The note says Set to 0 to disable the feature have your tried setting it 0 rather than last0?

Both fail:
rearchive_reports_in_past_last_n_months = last0
and
rearchive_reports_in_past_last_n_months = 0

I’m currently using rearchive_reports_in_past_last_n_months = last1, which triggers past reports generation since February 1st as expected, but we are still facing the infinite loop.

Moreover, I realized that updating an existing Funnel will invalidate all existing reports. Thus, I definitely need to use last1 or maybe even last2 if I don’t want to impact the experience of my users.

Concerning the infinite loop, we have been deploying performance insight on AWS RDS to try to monitor the queries (most of them are the log_link_visit_action queries I posted in the first message of this topic). There is actually no query that takes more than 1 second, they are all very fast. But the process is still on-going after more than 25 hours… It seems to me that Matomo keeps doing the same queries forever.

FYI, for my Funnel, each day report since February 1st took only a couple of minutes to generate successfully, and each week took around 10 minutes. But it seems that the month of February is taking ages, and is still not done (25 hours and counting…).

Could we please keep this current topic to discuss the infinite loop, and discuss anything related to rearchive_reports_in_past_last_n_months in New Funnel triggers past report generation

The report for the month of February finally finished after 27 hours, thus it is not an infinite loop.
There is still a problem to fix on Matomo, as the order of magnitude between the time to generate a weekly report, and the time to generate a monthly report is huge!
I opened an issue on github:

The discussion seems to continue in