Calculating page views and goal conversion from raw data

What formula can I use to calculate “page views” and “goal conversions” from the raw data?

I pulled raw data with Live.getLastVisitsDetails and tried using the following formulas, but they do not match aggregate values pulled from the API:

Page views:
Attempted calculation: count(distinct idpageview)
Aggregate data pulled from API: Action.get.nb_pageviews

Goal conversions:
Attempted calculation: sum(goalConversions)
Aggregate data pulled from API: Goals.get.nb_conversions

Relatedly, does Matomo have a metric glossary that includes technical calculations? Thanks!

Hi @inmemoryofer

Page views

You can also count where type="action", but your formula should work either. What is the observed difference (between your calculation and Matomo one)?

Goal conversions

With your method, you’ll get the total number of conversions. If you want to get the number of converted visits: count where goalConversions > 0

More info

You can also get data from the database:
https://developer.matomo.org/guides/database-schema

Hi @heurteph-ei, thanks so much for your response! I’m looking at a single day of data and the observed differences for pages views and goal conversions are as such:

Page Views
Calculated value: 98625
Matomo value: 97370

Goal conversions
Calculated value: 2056
Matomo value: 2134

So the numbers are close. And I feel fairly confident that I have the right data because my calculations for unique visitors (count distinct visitorId) and visits (count distinct idVisit) was spot on.

For page views, I tried filtering page views for type=“action”, which brings down my count a bit (I think because it excludes the “search” action"), but still doesn’t match the Matomo value.

For goal conversions, I can correctly calculate “visits converted to a goal” by taking the sum of visitConverted, but for some reason the count of actual goal conversions is still off.

Do you know of any other nuances for these two metrics that I could be missing? Thanks again for your help!

Hi @inmemoryofer

For Page views, maybe you have sometimes events happening on paveview ID where the pageview does not exist or belongs to the previous visit? Try to check if a visit start by an event instead of a pageview. Or if an event is not attached to a page (you can see it in the events log because the event is at the same level than a page view instead of being sub level…)

For goal conversion, for the day where you extracted data, what is the number of converted visits?

Hoping it is more appropriate to hop in to a topic. I am chasing the exact same problem as @inmemoryofer and finding similar issues. We are moving from using the raw data from Live.getLastVisitsDetails to a database pull and are being extra careful so noticed that page views didn’t match.

We’re presently on Matomo 5.1.1.

For site id 7 on our analytics, for september 22nd, we saw:
53,029 Page Views (Matomo is set in Pacific Time)
40,938 Page Views Using Live.getLastVisitsDetails for Yesterday (originally pulled on Sept 23) and summing the type=“Action” inside the action details. Quite a bit off?

We’re trying to switch to DB though and there’s definite differences in the data in actions vs getLastVisitsDetails so without accounting for UTC vs PDT:
45,688 page views
and with:
45,474 page views

Basic Query:

SELECT count(log_action.type) as event_type FROM log_visit 
LEFT JOIN log_link_visit_action as llva ON log_visit.idvisit = llva.idvisit 
LEFT JOIN log_action ON log_action.idaction = llva.idaction_url 
where log_visit.idsite=7 and log_visit.visit_first_action_time > "2024-09-20 00:00:00" and log_visit.visit_first_action_time < "2024-09-24 00:00:00" 
and llva.server_time > "2024-09-22 00:00:00" and llva.server_time <= "2024-09-23 00:00:00" and log_action.type=1

Timezone adjusted:

SELECT count(log_action.type) as event_type FROM log_visit 
LEFT JOIN log_link_visit_action as llva ON log_visit.idvisit = llva.idvisit 
LEFT JOIN log_action ON log_action.idaction = llva.idaction_url 
where log_visit.idsite=7 and log_visit.visit_first_action_time > TIMESTAMP("2024-09-19 17:00:00") and log_visit.visit_first_action_time <  TIMESTAMP("2024-09-23 17:00:00") 
and llva.server_time > TIMESTAMP("2024-09-21 17:00:00") and llva.server_time <= TIMESTAMP("2024-09-22 17:00:00 UTC") and log_action.type=1;

I’m not sure where to find the aggregated page views, but mostly I am trying to understand what events we expect to sum up as I’m trying to pull at the log_link_visit_action to send some bulk data for visualization.

Is it expected that the number of page_url actions should count up to page views? Or is there something else we expect to add up there?

Hi @philip.bloom
As far as I remember, not sure you need to calculate from log_visit, as log_action contains site ID and tracking date (and you’re right: be careful on the timezone), and Matomo creates a new visit at midnight. This will simplify the query.
Then, to get page views, you have to exclude site searches (as @inmemoryofer did), and probably some other actions. Try to check the type of action linked to the event (in the log_action table) to know if you need to exclude some others.
See also:
https://developer.matomo.org/guides/database-schema

Thank you for the reply. I am getting too few page views though from the action table versus the visual report, not too many, so excluding would only make it more challenging.

Switching though to something like:
SELECT count(log_action.type), log_action.type as event_type FROM log_link_visit_action as llva
LEFT JOIN log_action ON log_action.idaction = llva.idaction_url
where llva.idsite=7
and llva.server_time > TIMESTAMP(“2024-09-21 17:00:00”) and llva.server_time <= TIMESTAMP(“2024-09-22 17:00:00 UTC”) group by event_type;

Gets me:
(10)Event: 174060
(1)Page URL: 45474
(2) Search: 5

I’m not seeing anything that gets me close to 53,029. Is it expected to? My reading of the linked doc suggest that Page Url == Page View, but I may be misunderstanding the action type enum?

Apologies. I was doing timezone conversion wrong. Fixing for that, the numbers line up within -1 - +1 of the hit numbers which is a comfortable level of error.

So Page Url action types does equal page count accurately, when the reports are adjusted for timezones.

1 Like