Matomo_log_action - reprocess contents from logs

Hello Matomo team,

I have seen the report invalidation and reprocessing capability for the “Numeric” and “Blob” types of archive data.

However, I couldn’t find a way to purge or reprocess the data in the matomo_log_action table.

Our situation: we added “Exclude Query Parameters” late in the cycle, and we want to correct the existing data in the matomo_log_action and matomo_log_link_visit_action tables according to the defined exclusion list.

Is this achievable?

Many thanks.

Regds,
Sivakumar

Hello Matomo team, isn’t this a reasonable ask? Or is there something wrong with the expectation? Please help clarify.

Regds,
Sivakumar

xkcd#979

I had the same need (to exclude a non-standard session param) and saw that following the docs (invalidation, re-archiving) didn’t produce the expected result. I then looked at the source[1] and found that the current architecture doesn’t allow it.

Once a request has been ingested, tracked and persisted in the log tables, it’s too late to exclude query params: the relation to idactions is already built.
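To make concrete what the stored URLs would have looked like had the exclusion been active at tracking time, here is a minimal Python sketch. It is not Matomo's PHP code: the function name strip_query_params and the exact-name matching are assumptions for illustration only.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def strip_query_params(url, excluded):
    """Return url without the query parameters named in `excluded`.

    Hypothetical helper: parameter names are compared exactly, which may
    differ from how Matomo itself matches its exclusion list.
    """
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in excluded
    ]
    # Rebuild the URL with only the parameters that survived the filter.
    return urlunsplit(parts._replace(query=urlencode(kept)))


cleaned = strip_query_params("https://example.org/page?sid=abc&x=1", {"sid"})
```

Each distinct URL string gets its own log_action row, so every session id value tracked before the exclusion was configured left behind its own row.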

I’m noting this down so others can gauge how to set the SQL relations manually. You’re looking at pointing the following log_link_visit_action columns at new log_action rows:

  • idaction_url
  • idaction_name
  • idaction_url_ref
  • idaction_name_ref

… and then cleaning up the orphaned entries in log_action.

As a Matomo newbie I asked myself how the logs are persisted, and looked at it this way:
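The relinking and orphan cleanup described above can be tried out safely on a toy copy first. The sketch below uses an in-memory SQLite database with a heavily simplified schema (no table prefix, only the columns named in this post); the helper names relink_actions and delete_orphans and the sample rows are made up for illustration. A real Matomo database runs on MySQL/MariaDB, has more columns, and other tables may reference log_action as well, so treat this as a way to reason about the steps, not as a migration script.

```python
import sqlite3

# The four log_link_visit_action columns that point at log_action.idaction.
ACTION_COLUMNS = ("idaction_url", "idaction_name",
                  "idaction_url_ref", "idaction_name_ref")


def relink_actions(conn, mapping):
    """Point every reference to an old idaction at its replacement.

    `mapping` is {old_idaction: new_idaction}, built by hand here.
    """
    for column in ACTION_COLUMNS:
        for old_id, new_id in mapping.items():
            conn.execute(
                f"UPDATE log_link_visit_action SET {column} = ? WHERE {column} = ?",
                (new_id, old_id),
            )


def delete_orphans(conn):
    """Delete log_action rows no longer referenced by any of the four columns."""
    referenced = " UNION ".join(
        f"SELECT {c} FROM log_link_visit_action WHERE {c} IS NOT NULL"
        for c in ACTION_COLUMNS
    )
    cur = conn.execute(
        f"DELETE FROM log_action WHERE idaction NOT IN ({referenced})"
    )
    return cur.rowcount


# Toy data: idactions 2 and 3 are the same page with a session param attached.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE log_action (idaction INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE log_link_visit_action (
    idlink_va INTEGER PRIMARY KEY,
    idaction_url INTEGER, idaction_name INTEGER,
    idaction_url_ref INTEGER, idaction_name_ref INTEGER);
INSERT INTO log_action VALUES
    (1, 'example.org/page'),
    (2, 'example.org/page?sid=abc'),
    (3, 'example.org/page?sid=def'),
    (4, 'Page title');
INSERT INTO log_link_visit_action VALUES
    (1, 1, 4, NULL, NULL),
    (2, 2, 4, 1, 4),
    (3, 3, 4, 2, 4);
""")

relink_actions(conn, {2: 1, 3: 1})
removed = delete_orphans(conn)
conn.commit()
```

On a real installation you would want a backup and a transaction around both steps, and the affected reports would presumably still need the usual invalidation and re-archiving afterwards.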

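-- Table names are shown without a prefix; add yours (e.g. matomo_) if one is configured.
SELECT lv.idvisitor,
       lv.visit_last_action_time,
       lva.server_time,
       lva.idaction_url_ref,
       lva.idaction_name_ref,
       lva.idaction_name,
       lva.idaction_url,
       la1.name AS url,
       la2.name AS referrer
FROM log_link_visit_action lva
INNER JOIN log_action la1 ON la1.idaction = lva.idaction_url
-- LEFT JOIN keeps actions that have no referrer action
LEFT JOIN log_action la2 ON la2.idaction = lva.idaction_url_ref
INNER JOIN log_visit lv ON lv.idvisit = lva.idvisit
-- a column alias such as "url" cannot be referenced in WHERE, so filter on la1.name
WHERE la1.name LIKE '%your_exclude_param%'
LIMIT 10;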

For me it’s doable, as only 5-6 idactions matter overall and there is only 1 session param to exclude.

[1]: following excludeQueryParametersFromUrl() down into:

  • core/Tracker/PageUrl.php
  • core/Tracker/Action.php
  • core/Tracker/RequestProcessor.php
  • plugins/Events/Actions/ActionEvent.php