The server has now finished the reprocessing. To check this I look at the matomo_archive_invalidations table; the same information is also visible in the system check:
Total Invalidation Count: 114680
I was very surprised that this number went from nearly zero to around 140,000. This time I did not invalidate the historical reports as before, but I created some new segments and also changed some existing custom reports (and unlocked the reports for this).
Is this really so much work for Matomo/the server, just from creating about 5 new segments and 5 new custom reports? Our server needs 3 to 5 days to handle this.
Is this normal, or is there a problem or mistake I made? Do I misunderstand anything in this process?
[Debug]
; if set to 1, the archiving process will always be triggered, even if the archive has already been computed
; this is useful when making changes to the archiving code so we can force the archiving process
always_archive_data_period = 0;
always_archive_data_day = 0;
; Force archiving Custom date range (without re-archiving sub-periods used to process this date range)
always_archive_data_range = 0;
/.../
[General]
; When archiving segments for the first time, this determines the oldest date that will be archived.
; This option can be used to avoid archiving (for instance) the lastN years for every new segment.
; Valid option values include: "beginning_of_time" (start date of archiving will not be changed)
; "segment_last_edit_time" (start date of archiving will be the earliest last edit date found,
; if none is found, the created date is used)
; "segment_creation_time" (start date of archiving will be the creation date of the segment)
; editLastN where N is an integer (eg "editLast10" to archive for 10 days before the segment last edit date)
; lastN where N is an integer (eg "last10" to archive for 10 days before the segment creation date)
process_new_segments_from = "beginning_of_time"
The debug options in the configuration all have the value 0.
The value for "process_new_segments_from" is "beginning_of_time".
Currently the system is at its limit again. I created 6 new custom reports and a new segment, and this led to more than 200k "invalidation counts", which the system is now handling very slowly…
I hope you can tell me how to improve this!
I don't know. Where can I see this? For the site where we need the custom reports/segments, we have been tracking since the beginning of 2022, but the Matomo instance itself is older. For another site we have been tracking for more years, I think since 2019.
15 months of tracking… means 2 years + 15 months + 63 weeks + 365 + 74 days, then 519 periods… I think the answer is not there (I would need a 400 factor to reach 200k)…
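To make that back-of-the-envelope estimate explicit, here is a rough sketch in Python. The period counts are just the approximate figures from the estimate above and the 200k is the number reported in this thread; nothing here is read from Matomo itself.

```python
# Rough period count for ~15 months of tracking (approximate figures from the estimate above).
periods = {
    "years": 2,
    "months": 15,
    "weeks": 63,
    "days": 365 + 74,
}
total_periods = sum(periods.values())

# Approximate invalidation count reported in this thread.
observed_invalidations = 200_000

# How many archives per period would be needed to explain the observed count.
factor = observed_invalidations / total_periods

print(f"periods: {total_periods}")
print(f"required factor: {round(factor)}")
```

So one segment over all periods only explains a few hundred entries; the rest would have to come from some multiplier of roughly that size (sites, segments, reports/plugins), which I cannot confirm from here.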
On my side, when I add a new segment, my Matomo doesn't behave the same way as yours.
It’s a good time to put in a plug for the config – to keep both archiving and invalidating within some limits, to avoid surprises:
(If you have a common.config.ini.php then these settings may be applied there):
[General]
; Requests with a &segment= parameter will not trigger archiving.
; Ensures that no unexpected data processing triggers from UI or API.
browser_archiving_disabled_enforce = 1
; All new Segments created in the future will be set to:
; “Pre-processed (faster, requires cron core:archive command)”
enable_create_realtime_segments = 0
; By default we process a new segment’s reports from the
; beginning of time (“beginning_of_time”).
; When you have a lot of historical data, we recommend to
; process new segment’s reports from the segment’s creation time.
process_new_segments_from = "segment_creation_time"
; When processing the number of unique visitors across large datasets
; some performance issues may be experienced. In this case we would
; recommend to disable the Unique visitors metrics processing.
enable_processing_unique_visitors_day = 0
enable_processing_unique_visitors_week = 0
enable_processing_unique_visitors_month = 0
enable_processing_unique_visitors_year = 0
enable_processing_unique_visitors_range = 0
; Settings below ensure high performance archiving
; for Roll-ups and other sites
time_before_today_archive_considered_outdated = 10800
time_before_week_archive_considered_outdated = 43200
time_before_month_archive_considered_outdated = 43200
time_before_year_archive_considered_outdated = 64800
time_before_range_archive_considered_outdated = 43200
In addition to this, we have set:
rearchive_reports_in_past_last_n_months = 18
The only thing I want to mention is that I want the segment data since the beginning of 2022 and not since "segment_creation_time". That's why we set this to "beginning_of_time".
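One possible middle ground, going only by the "lastN" description in the config comments quoted earlier in this thread, would be to pick N so that it reaches back to the start of 2022 instead of to the beginning of time. A minimal sketch for working out N in Python; the segment creation date below is a hypothetical example, and since N is counted back from each segment's creation date, the value would need some margin for segments created later.

```python
from datetime import date


def days_before_creation(wanted_start: date, segment_created: date) -> int:
    """Number of days between the wanted start date and the segment creation date."""
    return (segment_created - wanted_start).days


wanted_start = date(2022, 1, 1)     # start of tracking mentioned in this thread
segment_created = date(2023, 4, 1)  # hypothetical creation date, for illustration only

n = days_before_creation(wanted_start, segment_created)

# Config line in the "lastN" form described in the quoted config comments.
config_line = f'process_new_segments_from = "last{n}"'
print(config_line)
```

Whether this actually reduces the number of invalidations on this instance is an assumption that would have to be tested.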
Do you have any other ideas how to speed up the currently running processes? There are still a lot of processes to handle: