The server has now finished the reprocessing. To check this I look at the matomo_archive_invalidations table; the same information is also visible in the system check:
Total Invalidation Count: 114680
I was very surprised that this number went from nearly zero to around 140,000. This time I didn't invalidate the historical reports as before, but I created some new segments and also changed some existing custom reports (and unlocked the reports to do so).
Is this really so much work for Matomo and the server, just from creating about 5 new segments and 5 new custom reports? Our server needs 3 to 5 days to handle this.
Is this normal, or did I make a mistake somewhere? Or am I misunderstanding something in this process?
[Debug]
; if set to 1, the archiving process will always be triggered, even if the archive has already been computed
; this is useful when making changes to the archiving code so we can force the archiving process
always_archive_data_period = 0;
always_archive_data_day = 0;
; Force archiving Custom date range (without re-archiving sub-periods used to process this date range)
always_archive_data_range = 0;
/.../
[General]
; When archiving segments for the first time, this determines the oldest date that will be archived.
; This option can be used to avoid archiving (for instance) the lastN years for every new segment.
; Valid option values include: "beginning_of_time" (start date of archiving will not be changed)
; "segment_last_edit_time" (start date of archiving will be the earliest last edit date found,
; if none is found, the created date is used)
; "segment_creation_time" (start date of archiving will be the creation date of the segment)
; editLastN where N is an integer (eg "editLast10" to archive for 10 days before the segment last edit date)
; lastN where N is an integer (eg "last10" to archive for 10 days before the segment creation date)
process_new_segments_from = "beginning_of_time"
The debug options in the configuration all have the value 0.
The value for “process_new_segments_from” is “beginning_of_time”.
Currently the system is at its limit again. I created 6 new custom reports and a new segment, and this led to more than 200k “invalidation counts”, which the system is now handling very slowly…
I hope you can help me improve this!
I don’t know. Where can I see this? For the site where we need the custom reports/segments, we have been tracking since the beginning of 2022. But the Matomo instance is older. For another site we have been tracking data for more years, I think since 2019.
15 months of tracking means 2 years + 15 months + 63 weeks + (365 + 74) days, so 519 periods… I think the answer is not there (I would need a factor of about 400 to reach 200k)…
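To make the period count above reproducible, here is a minimal sketch that counts the day, week, month and year periods overlapping a date range. The end date is a hypothetical "today" chosen to match 365 + 74 days; it does not come from the thread. Weeks are assumed to run Monday to Sunday. The result lands very close to the hand count of 519, since partial weeks at either end can shift it by one.

```python
from datetime import date, timedelta


def count_periods(start: date, end: date) -> dict:
    """Count the day, week, month and year periods that overlap [start, end]."""
    days = (end - start).days + 1
    # Assumption: weeks run Monday to Sunday; a partial week counts as one period.
    first_monday = start - timedelta(days=start.weekday())
    last_monday = end - timedelta(days=end.weekday())
    weeks = (last_monday - first_monday).days // 7 + 1
    months = (end.year - start.year) * 12 + (end.month - start.month) + 1
    years = end.year - start.year + 1
    return {"day": days, "week": weeks, "month": months, "year": years}


# Hypothetical range: tracking since 2022-01-01, checked on an assumed date.
periods = count_periods(date(2022, 1, 1), date(2023, 3, 15))
total = sum(periods.values())
print(periods, total)
```

Range periods are left out here, because they depend on which custom date ranges users actually request.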
On my side, when I add a new segment, my Matomo doesn’t behave the way yours does.
This is a good time to put in a plug for the following config settings, which keep both archiving and invalidating within limits and help avoid surprises.
(If you have a common.config.ini.php, these settings may be applied there.)
[General]
; Requests with a &segment= parameter will not trigger archiving.
; Ensures that no unexpected data processing triggers from UI or API.
browser_archiving_disabled_enforce = 1
; All new Segments created in the future will be set to:
; “Pre-processed (faster, requires cron core:archive command)”
enable_create_realtime_segments = 0
; By default we process a new segment’s reports from the
; beginning of time (“beginning_of_time”).
; When you have a lot of historical data, we recommend to
; process new segment’s reports from the segment’s creation time.
process_new_segments_from = "segment_creation_time"
; When processing the number of unique visitors across large datasets
; some performance issues may be experienced. In this case we would
; recommend to disable the Unique visitors metrics processing.
enable_processing_unique_visitors_day = 0
enable_processing_unique_visitors_week = 0
enable_processing_unique_visitors_month = 0
enable_processing_unique_visitors_year = 0
enable_processing_unique_visitors_range = 0
; Settings below ensure high performance archiving
; for Roll-ups and other sites
time_before_today_archive_considered_outdated = 10800
time_before_week_archive_considered_outdated = 43200
time_before_month_archive_considered_outdated = 43200
time_before_year_archive_considered_outdated = 64800
time_before_range_archive_considered_outdated = 43200
In addition to this:
rearchive_reports_in_past_last_n_months = 18
The only thing I want to mention is that I want the segment data from the beginning of 2022 onwards, not from “segment_creation_time”. That’s why we set this to “beginning_of_time”.
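The config comments quoted earlier in the thread describe a "lastN" value that archives N days before the segment creation date. If that description holds, a middle ground between “beginning_of_time” and “segment_creation_time” could be to pick N so that archiving starts at the beginning of 2022. The sketch below only does the date arithmetic; the helper name and the creation date are hypothetical, and whether the boundary day is included is not something the thread confirms.

```python
from datetime import date


def last_n_setting(segment_created: date, wanted_start: date) -> str:
    """Build a process_new_segments_from line that reaches back to wanted_start.

    Assumption: "lastN" means N days before the segment creation date,
    as described in the config comment quoted in this thread.
    """
    if wanted_start > segment_created:
        raise ValueError("wanted_start must not be after the segment creation date")
    n = (segment_created - wanted_start).days
    return f'process_new_segments_from = "last{n}"'


# Hypothetical creation date; the wanted start is the beginning of 2022.
line = last_n_setting(date(2023, 3, 15), date(2022, 1, 1))
print(line)
```

Since the setting is global and N is relative to each segment's creation date, the value would need updating over time, or rounding up generously, to keep covering the start of 2022.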
Do you have any other ideas for speeding up the processes that are currently running? There is still a lot left to handle:
Hi @iparker
The number of invalidations dropped a little (25%).
I think there is some multiplying factor somewhere that we missed.
Are you the only user of Matomo? Are you sure there are only 65 segments in total? (If other users defined some, the real number can be higher!)
But I’m still unsure whether this will happen again when I create a new segment or custom report, or change an existing one.
I also suspect a multiplying factor. Yes, there are just 65 segments (or now: 67). The system summary says:
System Summary
11 users
67 segments
38 goals
35 custom reports
0 tracking failures
6 websites
55 activated plugins
7 containers (in tag manager)
Matomo version: 4.13.3
MySQL version: 8.0.18
PHP version: 7.3.13
Are these segments “shared” across websites? That could be the explanation: 67 segments × 6 websites = 402, and 402 × 519 periods ≈ 208K invalidations to process…
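The estimate above is plain multiplication, sketched here so the numbers can be swapped for another instance. It assumes every segment is archived for every website and every period, which is the hypothesis being floated in this thread rather than confirmed Matomo behaviour. The function name is made up for illustration.

```python
def estimate_invalidations(segments: int, sites: int, periods: int) -> int:
    """Upper-bound estimate: one invalidation per segment, site and period."""
    return segments * sites * periods


# Figures taken from the system summary and the period count in this thread.
estimate = estimate_invalidations(segments=67, sites=6, periods=519)
print(estimate)

# Factor needed on top of the period count alone to reach roughly 200k.
factor_needed = 200_000 / 519
print(round(factor_needed))
```

If the segments are in fact limited to single websites, the `sites` factor drops to 1 for those segments and the estimate shrinks accordingly.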
Also, I see some plugins. Do some of them “create” things to be invalidated?
Hi @MisterGenest, do you know whether custom reports “generate” data to invalidate (in case of data invalidation)? This could answer the question from @iparker…
Yes, if the invalidation happens, all the reports are processed the next time the archiver runs. However, it is possible to limit archiving by plugin, for example:
I’m currently having exactly the same problem again. On top of that, we’re now testing the funnel plugin, which obviously also requires data to be recalculated when a funnel changes.
The situation is the same as last time: there are over 200k invalidations to be recalculated/created. The process has been dragging on for several days now, and I have had to restart the MySQL service twice because of the “too many connections” problem.
Yes, I know that we have a lot of data (segments, reports, etc.), but I’m still surprised that the recalculation takes so long.
Total number of invalidations: 28115
Invalidation count in process: 134
Planned number of invalidations: 27981
Earliest invalidation ts_started: 2023-08-17 06:17:49
Last invalidation ts_started: 2023-08-18 06:18:28
Earliest invalidation ts_invalidated: 2023-08-13 15:06:23
Last invalidation ts_invalidated: 2023-08-18 06:05:41
Number of segment invalidations: 27780
Number of plugin invalidations: 27761
List of plugins to be invalidated: Funnel, CustomReports
It would be great to get help improving this. This behavior makes it very difficult for us to use Matomo effectively.