I have seen that reports can be invalidated and reprocessed for the “Numeric” and “Blob” types of archive data.
However, I couldn’t find a way to purge or reprocess the data in the matomo_log_action table.
We added “Exclude Query Parameters” late in the cycle, and we now want to correct the existing data in the matomo_log_action and matomo_log_link_visit_action tables according to the defined exclusion list.
I had the same need (to exclude a non-standard session param) and saw that following the docs (invalidation, re-archiving) didn’t produce the expected result. I then looked at the source[1] and found that the current architecture doesn’t allow it.
It’s too late to exclude query params by the time the request is ingested/tracked/persisted in the log entities. The relation to idactions is already built.
I’m noting this down so others can gauge how to fix the relations manually in SQL. You’re looking at re-pointing rows in the log_link_visit_action table to different log_action entries for these columns:
idaction_url
idaction_name
idaction_url_ref
idaction_name_ref
… and then clean up the orphaned entries in the latter table (log_action).
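To make the re-pointing and cleanup concrete, here is a minimal sketch in Python against a toy in-memory SQLite schema, not a real Matomo database. Several things in it are assumptions: the session parameter name `sid` is hypothetical, the page-URL action type is assumed to be 1, and the toy tables only carry the columns mentioned above. A real log_action table has further columns (for example a hash of the name) that an insert would have to fill, and other tables may also reference log_action, so treat this as an illustration of the idea and try it on a backup first.

```python
import sqlite3
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

TYPE_PAGE_URL = 1  # assumption: value of Matomo's page-URL action type
REF_COLUMNS = ("idaction_url", "idaction_name",
               "idaction_url_ref", "idaction_name_ref")


def strip_params(url, excluded):
    """Return url without the query parameters named in excluded."""
    parts = urlsplit(url)
    kept = [(k, v)
            for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in excluded]
    # Note: urlencode may normalise the escaping of the remaining parameters.
    return urlunsplit(parts._replace(query=urlencode(kept)))


def remap_actions(conn, excluded):
    """Re-point visit actions at cleaned URLs; return how many actions were merged."""
    cur = conn.cursor()
    rows = cur.execute(
        "SELECT idaction, name FROM log_action WHERE type = ?",
        (TYPE_PAGE_URL,),
    ).fetchall()
    merged = 0
    for old_id, name in rows:
        clean = strip_params(name, excluded)
        if clean == name:
            continue  # nothing to exclude in this URL
        # Reuse the action for the cleaned URL if it exists, else create it.
        hit = cur.execute(
            "SELECT idaction FROM log_action WHERE name = ? AND type = ?",
            (clean, TYPE_PAGE_URL),
        ).fetchone()
        if hit:
            new_id = hit[0]
        else:
            cur.execute("INSERT INTO log_action (name, type) VALUES (?, ?)",
                        (clean, TYPE_PAGE_URL))
            new_id = cur.lastrowid
        # Re-point every reference column from the old action to the new one.
        for col in REF_COLUMNS:
            cur.execute(
                f"UPDATE log_link_visit_action SET {col} = ? WHERE {col} = ?",
                (new_id, old_id),
            )
        # The old action is now orphaned in this toy schema, so remove it.
        cur.execute("DELETE FROM log_action WHERE idaction = ?", (old_id,))
        merged += 1
    conn.commit()
    return merged


def make_demo_db():
    """Build a tiny stand-in for the two tables (toy schema, not Matomo's)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE log_action (
            idaction INTEGER PRIMARY KEY, name TEXT, type INTEGER);
        CREATE TABLE log_link_visit_action (
            idlink_va INTEGER PRIMARY KEY, idaction_url INTEGER,
            idaction_name INTEGER, idaction_url_ref INTEGER,
            idaction_name_ref INTEGER);
        INSERT INTO log_action VALUES
            (1, 'example.org/a', 1), (2, 'example.org/a?sid=abc', 1),
            (3, 'example.org/b?sid=abc&x=1', 1), (4, 'Page A', 4);
        INSERT INTO log_link_visit_action VALUES
            (1, 2, 4, NULL, NULL), (2, 3, 4, 2, 4);
    """)
    return conn
```

Calling `remap_actions(make_demo_db(), {"sid"})` merges the two `sid` URLs into their cleaned counterparts. With only a handful of affected idactions, the same steps can just as well be run by hand as individual UPDATE and DELETE statements.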
As a Matomo newbie, I asked myself how the logs are persisted, and looked at them this way:
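-- Add your table prefix (e.g. matomo_) if your installation uses one.
-- The INNER JOIN on idaction_url_ref skips actions without a referring URL.
SELECT lv.idvisitor,
       lv.visit_last_action_time,
       lva.server_time,
       lva.idaction_url_ref,
       lva.idaction_name_ref,
       lva.idaction_name,
       lva.idaction_url,
       la1.name AS url,
       la2.name AS referrer
FROM log_link_visit_action lva
INNER JOIN log_action la1 ON la1.idaction = lva.idaction_url
INNER JOIN log_action la2 ON la2.idaction = lva.idaction_url_ref
INNER JOIN log_visit lv ON lv.idvisit = lva.idvisit
-- Filter on the real column: a SELECT alias such as url cannot be used in WHERE.
WHERE la1.name LIKE '%your_exclude_param%'
LIMIT 10;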
In my case it’s doable, as only 5-6 idactions matter overall and there is only one session param to exclude.
[1]: following excludeQueryParametersFromUrl() down into: