I am trying to analyse how filters are used in an online search. We use Matomo as the logfile analysis tool, so the approach is to analyse the URLs together with their query parameters.
Here’s a typical URL that is formed for a very simple search with just one filter:
DOMAIN/de/suchergebnisse?q=&tags=inspireidentifiziert
The regular expression I use to capture this filter:
.*/suchergebnisse\?.*(tags=[A-Za-z0-9äÄöÖüÜß%.,\-\_\+]+).*
Matomo does a good job here and the report reads:
tags=inspireidentifiziert
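This result can be reproduced outside Matomo with a minimal Python sketch. It assumes that Python's re module handles this pattern the same way as the regex engine Matomo uses, which may not hold in every detail. DOMAIN is the placeholder from the URL above, not a real host.

```python
import re

# URL and pattern copied from the post; DOMAIN is a placeholder.
url = "DOMAIN/de/suchergebnisse?q=&tags=inspireidentifiziert"
pattern = r".*/suchergebnisse\?.*(tags=[A-Za-z0-9äÄöÖüÜß%.,\-\_\+]+).*"

# re.match anchors at the start of the string; the leading .* absorbs the prefix.
match = re.match(pattern, url)
captured = match.group(1) if match else None
print(captured)
```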
However, it gets more complicated when more than one filter of the same class is used. The URL that is formed then reads:
DOMAIN/de/suchergebnisse?tags%5B0%5D=besch%C3%A4ftigte&tags%5B1%5D=entgel-te&tags%5B2%5D=umsatz&tags%5B3%5D=betriebe&tags%5B4%5D=geleistete+arbeitsstunden
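To make the encoded URL easier to read, here is a small Python sketch that decodes the query string with the standard library. It is only an illustration of what the URL carries (five indexed tags parameters, with %5B and %5D standing for the square brackets), not a description of how Matomo processes it.

```python
from urllib.parse import urlsplit, parse_qsl

# URL copied from the post; DOMAIN is a placeholder.
url = (
    "DOMAIN/de/suchergebnisse?tags%5B0%5D=besch%C3%A4ftigte"
    "&tags%5B1%5D=entgel-te&tags%5B2%5D=umsatz"
    "&tags%5B3%5D=betriebe&tags%5B4%5D=geleistete+arbeitsstunden"
)

# parse_qsl percent-decodes names and values and turns '+' into a space.
query = urlsplit(url).query
tags = [(name, value) for name, value in parse_qsl(query) if name.startswith("tags")]

for name, value in tags:
    print(f"{name}={value}")
```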
To capture these tags I use multiple expressions because only one capture group per expression is possible:
.*/suchergebnisse\?.*(tags%5B0%5D=[A-Za-z0-9äÄöÖüÜß%.,\-\_\+]+).*
.*/suchergebnisse\?.*(tags%5B1%5D=[A-Za-z0-9äÄöÖüÜß%.,\-\_\+]+).*
.*/suchergebnisse\?.*(tags%5B2%5D=[A-Za-z0-9äÄöÖüÜß%.,\-\_\+]+).*
.*/suchergebnisse\?.*(tags%5B3%5D=[A-Za-z0-9äÄöÖüÜß%.,\-\_\+]+).*
.*/suchergebnisse\?.*(tags%5B4%5D=[A-Za-z0-9äÄöÖüÜß%.,\-\_\+]+).*
In this instance, Matomo only uses the first one and disregards the rest. It only captures:
tags[0]=beschäftigte
The other four tags are disregarded and are therefore missing from the report.
I have tested the expressions on regex101 (https://regex101.com/) and each capture works there. So I can only surmise that something in Matomo stops the analysis when the expressions are too similar.
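The same check can be repeated outside regex101 with a short Python sketch that runs all five expressions against the multi-filter URL. It assumes Python's re module is close enough to the regex flavour used for the test; the variable names are only illustrative, and the captures are percent-decoded at the end purely for readability.

```python
import re
from urllib.parse import unquote_plus

# URL copied from the post; DOMAIN is a placeholder.
url = (
    "DOMAIN/de/suchergebnisse?tags%5B0%5D=besch%C3%A4ftigte"
    "&tags%5B1%5D=entgel-te&tags%5B2%5D=umsatz"
    "&tags%5B3%5D=betriebe&tags%5B4%5D=geleistete+arbeitsstunden"
)

# Character class for the tag value, identical in all five expressions.
VALUE_CLASS = r"[A-Za-z0-9äÄöÖüÜß%.,\-\_\+]+"

captures = []
for index in range(5):
    # Build the expression for tags%5B<index>%5D, one capture group each.
    pattern = (
        r".*/suchergebnisse\?.*(tags%5B" + str(index) + r"%5D=" + VALUE_CLASS + r").*"
    )
    match = re.match(pattern, url)
    # Decode the captured text so it reads like the Matomo report line.
    captures.append(unquote_plus(match.group(1)) if match else None)

for capture in captures:
    print(capture)
```

In this sketch every expression yields its own capture, which is consistent with the regex101 result and suggests the expressions themselves are not the problem.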
Does anyone here have more information?