Reporting API: Get visits for url parts

Hi there,

I am trying to get visits/views of URLs by providing only a part of the URL. Unfortunately, I was not able to find a solution by searching the forum and Google.

I know I can filter by a specific URL with Actions.getPageUrl.
But I need to query by parts of a URL. Like this:

URLs I need to match:
https://example.com/company/google/whatever
https://example.com/blog/google/whatever

So I want to find all visits which have “/google” in the URL.

How can I do that with the API?

Does anyone have an idea? Is there no “filter by URL parts”?

Hi @pointi

You can create and use some segments (segment definitions under the All Visits button):

Be careful: as your requirement is stated, you’ll see all visits that went through a page whose URL contains /google. So if a user browses one page with /google and another one without, you’ll see both pages in the result…
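
The same kind of condition can also be passed directly to the Reporting API through its segment parameter. Here is a minimal sketch that only builds the request URL. The host https://your-matomo, the site ID, the token, and the helper name build_segment_query are placeholders. It assumes the pageUrl segment with the =@ ("contains") operator and the VisitsSummary.get method. Whether a segment that was never saved is processed on demand depends on the archiving configuration of the instance.

```python
from urllib.parse import quote, urlencode

def build_segment_query(base_url, id_site, url_part, token_auth):
    """Build a Reporting API URL for visits whose page URL contains url_part."""
    # "pageUrl=@value" reads "page URL contains value"; the value is URL-encoded here,
    # and urlencode() below encodes the whole segment once more for the query string.
    segment = "pageUrl=@" + quote(url_part, safe="")
    params = {
        "module": "API",
        "method": "VisitsSummary.get",
        "idSite": id_site,
        "period": "month",
        "date": "today",
        "format": "JSON",
        "segment": segment,
        "token_auth": token_auth,  # placeholder, use your own token
    }
    return f"{base_url}/index.php?{urlencode(params)}"

# Hypothetical values for illustration only
report_url = build_segment_query("https://your-matomo", 1, "/google", "YOUR_TOKEN")
print(report_url)
```

Changing the url_part argument (for example to "/facebook") yields a different request without touching the segment editor.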

See more about segments:

hey, thanks for your answer/input!

I have read about segments. But that would mean I need to create a segment for every customer we have, and there are thousands of them. I don’t think it is very practical to create so many different segments :-/

Hi @pointi ,
you can also use PHP or whatever language you are using for filtering the URLs.
For example,
1: Get all PageURLs.
2: Pass these URLs to a function.
3: The function checks whether each URL matches the pattern.
4: Aggregate the results.
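
The four steps above can be sketched roughly like this in Python. The rows are made-up sample data shaped like a flattened page URL report (a "url" and an "nb_visits" field per row), and the function name sum_visits_matching is just for illustration. Fetching the rows from the API is left out.

```python
import re

def sum_visits_matching(rows, pattern):
    """Sum nb_visits over all rows whose URL matches the regular expression."""
    regex = re.compile(pattern)
    matched = [row for row in rows if regex.search(row["url"])]
    return sum(row["nb_visits"] for row in matched)

# Hypothetical rows (step 1 would fetch these from the page URL report)
rows = [
    {"url": "https://example.com/company/google/whatever", "nb_visits": 12},
    {"url": "https://example.com/blog/google/whatever", "nb_visits": 5},
    {"url": "https://example.com/blog/facebook/whatever", "nb_visits": 7},
]

# "/google" followed by a slash or the end of the URL
total = sum_visits_matching(rows, r"/google(/|$)")
print(total)  # 17
```

Keep in mind that adding up per-page numbers counts a visit once for every matching page it viewed, so the total can be higher than the number of distinct visits.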

What do you say about this?

No need for a segment per user, but just a segment as I defined, based on the URL. I don’t understand why you would create a segment per user…?

I thought a segment like yours sticks to /google?
So if I need to find visits for /facebook in my data I need another segment for /facebook. Isn’t that the case?

Yes, you’ll need another segment for Facebook…
As your keyword seems to be located anywhere in the URL, I don’t see another way to build your report.
If it were located at a fixed position in the URL (e.g. if you are looking at the sub-folder-2 in https://example.com/{sub-folder-1}/{sub-folder-2}/w/h/a/t/e/v/e/r), another solution could be possible…

Actually, it would be on the same level.
To be clearer, the structure is as follows:

example.com/folder1/{company}
example.com/folder1/{company}/subfolder1
example.com/folder1/{company}/subfolder2

etc.

I want all visits where {company} is in the URL, so it would always be the second folder in that case.

@heurteph-ei
Can you tell me how?

Hi @pointi
I suggest you use action dimensions… (then if a user visits https://example.com/google/subfolder and then https://example.com/facebook/anothersub, both companies will be recorded):

Then you have to collect this dimension. It can be done automatically (see the documentation above) or manually. How do you track? (HTTP API, plain JavaScript with _paq, MTM, another way…?)

I used the dimension creation via configuration, but there is still one issue left to resolve.
I can match:
example.com/folder1/{company}/
example.com/folder1/{company}/subfolder1
example.com/folder1/{company}/subfolder2

but not:
example.com/folder1/{company}

I tried to update my regex to also match URLs without a trailing slash/subfolder, like this:
/folder1/([a-z0-9-]*)($|/(.*))

but I can only use one group in the expression (“You need to group exactly one part of the regular expression inside round brackets”)

I am using the JavaScript tracker, but due to my setup I cannot easily pass the custom dimensions via code.

EDIT:
It also does not work with a non-capturing group like this:
/folder1/([a-z0-9-]*)(?:$|/.*)

What if you use this?
/[^/]+/([^/]+)
(or even example[.]com/[^/]+/([^/]+))

I tried it at https://regex101.com/ but it does not match my URLs:
example.com/folder1/{company}
example.com/folder1/{company}/
example.com/folder1/{company}/subfolder1
example.com/folder1/{company}/subfolder2

They should all return {company}

EDIT:
Okay, I got the regex working with this:
/folder1/([^/]+)[^/]?

Is there any way to test whether Matomo extracts it correctly, or do I need to look at the data?

Very strange, it works in my tests (with the slash characters escaped).

I’m not sure this one will work as intended, as the last part (outside the capture group) matches zero or one character that is not a slash…
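
To compare the expressions from this thread side by side, here is a small sketch using Python's re module. It is only an approximation: Matomo does not run Python, so treat the output as a sanity check of the patterns themselves, not as proof of what Matomo extracts. The company name "google" and the helper extract are stand-ins for illustration.

```python
import re

# The four URL shapes from this thread, with "google" standing in for {company}
urls = [
    "example.com/folder1/google",
    "example.com/folder1/google/",
    "example.com/folder1/google/subfolder1",
    "example.com/folder1/google/subfolder2",
]

patterns = {
    "with trailing [^/]?": r"/folder1/([^/]+)[^/]?",
    "without it": r"/folder1/([^/]+)",
    "generic second folder": r"/[^/]+/([^/]+)",
}

def extract(pattern, url):
    """Return the single capture group, or None if the pattern does not match."""
    match = re.search(pattern, url)
    return match.group(1) if match else None

results = {name: [extract(p, u) for u in urls] for name, p in patterns.items()}
for name, values in results.items():
    print(name, values)
```

In this sketch the trailing [^/]? makes no difference, because the greedy capture group has already consumed every non-slash character. One thing to watch with the generic pattern: if the string being matched includes the scheme (https://example.com/folder1/google), Python's search finds its first match starting inside the double slash and captures folder1 instead. Whether that matters depends on which string Matomo applies the expression to, which this sketch does not cover.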

For testing, send some tracking requests to your Matomo:
https://your-matomo/matomo.php?action_name=TEST&idsite=1&rec=1&r=545786&url=https%3A%2F%2Fexample.com%2Fgoogle%2Fsubfolder&_id=8d696f1190b79fdd&_idn=0&send_image=1&pv_id=otBvOR

Adapt the idsite, change r (random) and pv_id (page view ID) on each request, and don’t forget to update url for your tests. Then check the result in Matomo’s real-time view (hover over the page view to see the action dimension).
After your tests, you can search for your test visitor (it should have visitor ID 8d696f1190b79fdd, or else search for the TEST page name) and delete it in Administration > Privacy > GDPR Tools.
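
If you want to generate several such test requests, something like the following sketch could build them. It reuses the parameters from the URL above; the host https://your-matomo, the helper name build_test_request, and the way r and pv_id are randomised are assumptions for illustration. It only builds the URL and leaves sending it (browser, curl, etc.) to you.

```python
import random
import string
from urllib.parse import urlencode

def build_test_request(matomo_base, id_site, page_url, visitor_id="8d696f1190b79fdd"):
    """Build a matomo.php tracking URL for one test page view."""
    alphabet = string.ascii_letters + string.digits
    params = {
        "action_name": "TEST",
        "idsite": id_site,
        "rec": 1,
        "r": random.randint(100000, 999999),  # random value, new on each request
        "url": page_url,                      # the page URL the dimension is extracted from
        "_id": visitor_id,                    # fixed visitor ID so the test visit is easy to find
        "_idn": 0,
        "send_image": 1,
        "pv_id": "".join(random.choices(alphabet, k=6)),  # new page view ID on each request
    }
    return f"{matomo_base}/matomo.php?{urlencode(params)}"

# Hypothetical host and site ID
test_url = build_test_request("https://your-matomo", 1, "https://example.com/google/subfolder")
print(test_url)
```

Calling it once per URL shape (with and without trailing slash, with a subfolder) gives one request per case to check in the real-time view.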

Strange, you are right. I tested it again and it works now. Sorry!

Thanks for putting so much effort into it.
My regex also seems to fetch the data the way I wanted, but yours is cleaner and safer.
