Google Import Export - Fails to get me data for few imported posts

Hi all,
I'm having issues with the GoogleAnalyticsImporter plugin: https://plugins.matomo.org/GoogleAnalyticsImporter

I have a site registered in Matomo into which I imported data using GoogleAnalyticsImporter.

The issues are as follows:

  1. We display Total Views for a given post by calling the Actions.getPageUrl Matomo API.

When I request a single date, e.g. 2021-03-29, I get 3 nb_hits, but if I request the date range 2019-11-12 to 2021-03-30 I get zero. The API should have returned at least 3, since that date is well within the range.

  2. Why would a URL be imported (I can see it listed in Behavior -> Pages) while its data is not imported properly?

  3. A few post URLs are not being imported by the Google Importer, so their Total Views = 0. What could cause such URLs to be skipped?

  4. How can I query the database (and which tables) to find missing imports?

Hi @Gsanil, sorry you’re experiencing this problem, can you see data in week and month periods?

Do you have any INI settings that force disable browser archiving for range periods?

Why would a URL be imported (I can see it listed in Behavior -> Pages) while its data is not imported properly?

Can you post an example of the URL that is not being imported properly? How many different URLs total are there in your site?

I checked the config.ini.php file and it contains only the DB settings and the enabled plugin list. There is nothing related to browser archiving or caching.

We have an on-premise setup and for the below URL I see data for a s

I get data for this range as mentioned below: 16th Dec 2019 to 18th March 2021

https://analytics.sophiamedia.com/?module=API&method=Actions.getPageUrl&pageUrl=https://healthimpactnews.com/2019/study-water-fluoridation-linked-to-lower-iq-in-children/&idSite=4&period=range&date=2019-12-16,2021-03-18&format=JSON&token_auth=sometoken

However, I do not get a response for a date range from 16th Dec 2019 to any date beyond 18th March 2021:

https://analytics.sophiamedia.com/?module=API&method=Actions.getPageUrl&pageUrl=https://healthimpactnews.com/2019/study-water-fluoridation-linked-to-lower-iq-in-children/&idSite=4&period=range&date=2019-12-16,2021-03-28&format=JSON&token_auth=sometoken

It should have returned at least the previous response. This does not make sense to me. I hope you can help me figure out the issue.
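For reference, the two requests above differ only in the end date of the range. Below is a minimal Python sketch that builds both URLs so they can be compared or fetched side by side. The helper name `build_page_url_request` is hypothetical, and `sometoken` is the placeholder from the post, not a real token.

```python
from urllib.parse import urlencode

BASE_URL = "https://analytics.sophiamedia.com/"
PAGE_URL = (
    "https://healthimpactnews.com/2019/"
    "study-water-fluoridation-linked-to-lower-iq-in-children/"
)


def build_page_url_request(start, end, id_site=4, token="sometoken"):
    """Build an Actions.getPageUrl request for a custom range (hypothetical helper)."""
    params = {
        "module": "API",
        "method": "Actions.getPageUrl",
        "pageUrl": PAGE_URL,
        "idSite": id_site,
        "period": "range",
        "date": f"{start},{end}",  # range periods take "start,end"
        "format": "JSON",
        "token_auth": token,
    }
    # urlencode percent-encodes the page URL and the comma in the date range
    return BASE_URL + "?" + urlencode(params)


working = build_page_url_request("2019-12-16", "2021-03-18")
failing = build_page_url_request("2019-12-16", "2021-03-28")
print(working)
print(failing)
```

Everything except the `date` parameter is identical between the two, which points at the range itself rather than the page URL or the site.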

I also noticed that a few URLs were not imported at all. The import was successful, but a few URLs from the import date range were skipped by the Google Importer plugin, and I'm not sure why. This was never reported. Is there a way for the plugin to log failed, missed, or skipped URLs?

@diosmosis Any updates on this one, please?

Hi @gsanil, if you’re seeing data for some ranges but not all, then it’s possible the other ranges are not being archived. Can you run the following SQL and post the result:

SELECT idarchive FROM archive_numeric_2019_12 where name = 'done' and idsite = 4 and date1 = '2019-12-16' and date2 = '2021-03-28' and period = 5;

?

Then using that idarchive, run:

SELECT * FROM archive_numeric_2019_12 where name = 'nb_visits' and idarchive = ?

?
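The queries above target archive_numeric_2019_12 because the range starts in December 2019. A minimal Python sketch of how the table name and both queries could be generated for other ranges follows. The helper names are hypothetical, and the empty table prefix is an assumption that matches the queries in this thread (many installs use a prefix such as `matomo_`).

```python
from datetime import date


def archive_table(start, prefix=""):
    """Name of the monthly numeric archive table for a period starting on `start`."""
    return f"{prefix}archive_numeric_{start.year:04d}_{start.month:02d}"


def done_flag_query(id_site, start, end, period_id=5, prefix=""):
    """First query: look up the idarchive for a given site, date span and period."""
    return (
        f"SELECT idarchive FROM {archive_table(start, prefix)} "
        f"WHERE name = 'done' AND idsite = {int(id_site)} "
        f"AND date1 = '{start.isoformat()}' AND date2 = '{end.isoformat()}' "
        f"AND period = {int(period_id)};"
    )


def visits_query(id_archive, start, prefix=""):
    """Second query: fetch nb_visits for the idarchive found by the first query."""
    return (
        f"SELECT * FROM {archive_table(start, prefix)} "
        f"WHERE name = 'nb_visits' AND idarchive = {int(id_archive)};"
    )


print(done_flag_query(4, date(2019, 12, 16), date(2021, 3, 28)))
```

The second query only makes sense once the first one returns a row. An empty result from the first query already indicates that no archive exists for that range.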

I also noticed that a few URLs were not imported at all. The import was successful, but a few URLs from the import date range were skipped by the Google Importer plugin, and I'm not sure why. This was never reported. Is there a way for the plugin to log failed, missed, or skipped URLs?

The plugin will log to /path/to/matomo/tmp/logs/gaimportlog.{idSite}.log files. To see if any URL entries weren’t recognizable you’ll have to enable verbose logging when starting the import (there is a checkbox in the UI). There will be a lot of information to sort through. It would probably be useful to start a new import for a single day where a URL is missing and look through the output.

@diosmosis Thanks for your response. Unfortunately, the first query you posted returns an empty set, so I could not run the second query. Also, can you please share what you were trying to figure out with these queries? That way I'll know for future reference as well. Thanks!

Matomo stores report data in the archive tables. The status of each “archive” is in a row in the archive_numeric table with a name LIKE 'done%'. The first query was looking for the archive for the range that was displaying no data. If you couldn't find an idarchive, it means processing of the report data was never initiated (as opposed to finding one that has no report data, or has old report data that was never updated). If you have browser archiving for ranges enabled, you should be able to visit a range and have Matomo process the reports if they are not already processed, then display them.

I would try the following:

  • check that the archiving_range_force_on_browser_request INI config is set to 1
  • see if this happens for other sites or just the GA import site
  • see if this problem is limited to some ranges or affects other periods such as weeks, months, years. (numeric values for these types of periods are 2 for weeks, 3 for months, 4 for years. the 5 in the original query is for range periods)
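The numeric period values mentioned in the last bullet can be kept in a small lookup when adapting the earlier SQL to other period types. This is only a sketch: the helper name `period_filter` is hypothetical, and the value 1 for day periods is not stated in this thread but is how Matomo numbers them.

```python
# Numeric period IDs as stored in the `period` column of the archive tables.
# week/month/year/range are quoted from the post above; day = 1 is added here.
PERIOD_IDS = {"day": 1, "week": 2, "month": 3, "year": 4, "range": 5}


def period_filter(period_name):
    """Return the SQL condition to use in place of `period = 5` (hypothetical helper)."""
    return f"period = {PERIOD_IDS[period_name]}"


for name in ("week", "month", "year", "range"):
    print(name, "->", period_filter(name))
```

For week, month and year periods, date1 and date2 in the query would also need to be the first and last day of that period.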

@diosmosis Sorry for the delay. When I imported the date range 1st March 2019 to 31st March 2019, these were the logs:

INFO [2021-05-06 09:38:20] 1489367 START
INFO [2021-05-06 09:38:20] 1489367 Starting Matomo reports archiving…
INFO [2021-05-06 09:38:20] 1489367 Start processing archives for site 4.
INFO [2021-05-06 09:38:20] 1489367 Will invalidate archived reports for today in site ID = 4's timezone (2021-05-06 00:00:00).
INFO [2021-05-06 09:38:20] 1489367 Will invalidate archived reports for yesterday in site ID = 4's timezone (2021-05-05 00:00:00).
INFO [2021-05-06 09:38:24] 1489367 Archived website id 4, period = week, date = 2019-08-05, segment = '', 70837 visits found. Time elapsed: 3.237s
INFO [2021-05-06 09:38:24] 1489367 Archived website id 4, period = week, date = 2019-07-29, segment = '', 69282 visits found. Time elapsed: 3.237s
INFO [2021-05-06 09:38:29] 1489367 Archived website id 4, period = month, date = 2019-08-01, segment = '', 320038 visits found. Time elapsed: 5.044s
INFO [2021-05-06 09:38:39] 1489367 Archived website id 4, period = year, date = 2019-01-01, segment = '', 6203285 visits found. Time elapsed: 9.816s
INFO [2021-05-06 09:38:39] 1489367 Finished archiving for site 4, 4 API requests, Time elapsed: 18.784s [1 / 1 done]
INFO [2021-05-06 09:38:39] 1489367 Done archiving!
INFO [2021-05-06 09:38:39] 1489367 ---------------------------
INFO [2021-05-06 09:38:39] 1489367 SUMMARY
INFO [2021-05-06 09:38:39] 1489367 Processed 4 archives.
INFO [2021-05-06 09:38:39] 1489367 Total API requests: 4
INFO [2021-05-06 09:38:39] 1489367 done: 4 req, 18819 ms, no error
INFO [2021-05-06 09:38:39] 1489367 Time elapsed: 18.820s
INFO [2021-05-06 09:38:39] 1489367 ---------------------------
INFO [2021-05-06 09:38:39] 1489367 SCHEDULED TASKS
INFO [2021-05-06 09:38:39] 1489367 Scheduled tasks are disabled with --disable-scheduled-tasks
Done in Time elapsed: 163.897s. [234 API requests made to GA]
Error: error or warning logs detected, exit 1

The logs are not informative. I do not understand what the error is. Where can I see the actual error? What should I infer from these logs?

Hi @gsanil, those are the archiving logs, not the importer logs. The importer imports individual days and logs to a gaimportlog.$idSite.log file, and then runs archiving to aggregate that data into week, month and year reports. This looks like the very end of the import run, which means there are earlier log lines that can contain the WARNING or ERROR entry that the last line detected.
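One way to find those earlier entries is to filter the importer log for lines whose level is WARNING or ERROR. The Python sketch below assumes the importer log uses the same "LEVEL [timestamp] pid message" layout as the excerpt in this thread, which is not confirmed here. The function names are hypothetical.

```python
import re
from pathlib import Path

# Assumption: each log line starts with its level, as in the excerpt posted above.
LEVEL_RE = re.compile(r"^(WARNING|ERROR)\b")


def find_problem_lines(lines):
    """Return the lines whose level is WARNING or ERROR, without trailing newlines."""
    return [line.rstrip("\n") for line in lines if LEVEL_RE.match(line)]


def scan_import_log(path):
    """Scan a log file such as tmp/logs/gaimportlog.4.log for problem lines."""
    with Path(path).open(encoding="utf-8", errors="replace") as fh:
        return find_problem_lines(fh)


# The tail of the excerpt from this thread: it contains no WARNING/ERROR line
# itself, only the final summary that says one was detected earlier in the run.
excerpt = [
    "INFO [2021-05-06 09:38:39] 1489367 Done archiving!",
    "INFO [2021-05-06 09:38:39] 1489367 done: 4 req, 18819 ms, no error",
    "Done in Time elapsed: 163.897s. [234 API requests made to GA]",
    "Error: error or warning logs detected, exit 1",
]
print(find_problem_lines(excerpt))
```

Running `scan_import_log` against the full log for the site, rather than only its tail, should surface the entry that caused the non-zero exit.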

This warning or error is unlikely to be the cause of the problem described in your previous message.

Did you try my suggestions?

@diosmosis OK !

I tried your suggestions. I see this issue for all the sites imported into Matomo via the GA Importer. I do not see Total Views for older posts (2011 to 2019).

The other options like -

  • archiving_range_force_on_browser_request INI config is set to 1
  • see if this happens for other sites or just the GA import site - YES
  • see if this problem is limited to some ranges or affects other periods such as weeks, months, years. (numeric values for these types of periods are 2 for weeks, 3 for months, 4 for years. the 5 in the original query is for range periods) - YES It affects all date ranges. Not specific to a specific Post or Date Range

see if this happens for other sites or just the GA import site - YES

Does this mean it happens for sites that aren’t imported from GA?

YES It affects all date ranges. Not specific to a specific Post or Date Range

Does this mean it happens for week, month and year periods or just custom ranges? Week, month and year periods have period=week, period=month, period=year set in the request URL, as opposed to period=range.

I upgraded to 4.3.0 just today and nothing much has changed. Do I need to run any additional commands to check whether the posts are still showing zero Total Views?

To answer your questions:

All of my sites display data imported from GA, so I can't say whether it happens for non-imported sites. In short, it happens for all sites imported via the GAImporter plugin.

This happens for day, month, week, year, range i.e. ALL 5 periods

I upgraded to 4.3.0 just today and nothing much has changed. Do I need to run any additional commands to check whether the posts are still showing zero Total Views?

Can you try invalidating week periods for the data that is missing, then running core:archive for the site you invalidated from?
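A rough Python sketch of the two console invocations this suggestion refers to follows. The command names and flags (`core:invalidate-report-data` with `--sites`, `--periods`, `--dates`, and `core:archive` with `--force-idsites`) are recalled from the Matomo 4 console and should be verified with `./console help <command>`. The date range is only an example taken from the March 2019 import mentioned earlier, and the helper names are hypothetical.

```python
import shlex


def invalidate_cmd(id_site, date_range, period="week", console="./console"):
    """Build the invalidation command for one site (flags are assumptions to verify)."""
    return [
        console,
        "core:invalidate-report-data",
        f"--sites={id_site}",
        f"--periods={period}",
        f"--dates={date_range}",  # "start,end" is treated as a range
    ]


def archive_cmd(id_site, console="./console"):
    """Build the re-archiving command restricted to the invalidated site."""
    return [console, "core:archive", f"--force-idsites={id_site}"]


# Print the commands instead of running them, so they can be reviewed first.
for cmd in (invalidate_cmd(4, "2019-03-01,2019-03-31"), archive_cmd(4)):
    print(shlex.join(cmd))
```

Both commands are meant to be run from the Matomo installation directory as the user that normally runs the archiving cron job.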

This happens for day, month, week, year, range i.e. ALL 5 periods

Can you provide screenshots showing this happening for day periods?

The above screenshot is for a specific URL from the set of URLs that show zero views.

The above post was published on 4th March 2019 and GA has the page views. However, Matomo does not.

I will get back to you on this shortly.

Also, the following is my configuration. Is this fine?

Are you using a segment in this screenshot? I was looking for a screenshot without a segment. Do you see data for day periods in general, not just day periods for a specific URL?

@diosmosis

For YEAR

For Day