I’m using Matomo (3.14.0) with Form Analytics (3.1.28). Both “delete old raw data” and “delete old aggregated report data” policies are in place (with core:archive cron job). Still, in practice, I see that Form Analytics purges records from the log_form db table only, leaving log_form_field and log_form_page tables filled with all the historical data from the beginning of time.
I can confirm that old records are correctly purged from log_form but not from log_form_page/log_form_field tables, matching them by idvisit/idlogform. How can I clean that old data and prevent its accumulation in the future?
Good afternoon.
We faced the same problem. The lack of the ability to automatically clear such data in the database greatly upsets us and forces us to perform additional manual actions to monitor free space and directly clean the data. For what reason is this data not cleared?
Thank you for the report! we have investigated and can confirm this is an issue. We’ve created a bug report in https://github.com/matomo-org/matomo/issues/16529 and you can follow progress here. It’s not currently a priority to fix this, but we’ll get to it next year.
Form analytics will not work on native apps. For a list of plugins that are not compatible with native apps please check the FAQ here: https://matomo.org/faq/plugins/faq_32898/
Albeit there is no fix available as of now, I managed to write these two stored procedures to cleanup log_form_field / log_form_page tables. The deletion is performed gracefully by chunks not to disrupt a running Matomo website. I’ve used it to wipe 100M accumulated “dead” records successfully on a live production instance. One can also set up a scheduler event to trigger the cleanup daily. Here it is:
DROP PROCEDURE IF EXISTS matomo_cleanup_lff;
DELIMITER $$
CREATE PROCEDURE matomo_cleanup_lff()
BEGIN
REPEAT
DO SLEEP(1);
DELETE lff1 FROM matomo_log_form_field AS lff1
JOIN
(SELECT
lff.idlogform, lff.field_name, lff.idformview
FROM
matomo_log_form_field lff
LEFT JOIN matomo_log_form lf ON lff.idlogform = lf.idlogform
WHERE
lf.idlogform IS NULL
LIMIT 20000) AS lff2 ON lff1.idlogform = lff2.idlogform
AND lff1.field_name = lff2.field_name
AND lff1.idformview = lff2.idformview;
UNTIL ROW_COUNT() = 0 END REPEAT;
END$$
DELIMITER ;
DROP PROCEDURE IF EXISTS matomo_cleanup_lfp;
DELIMITER $$
CREATE PROCEDURE matomo_cleanup_lfp()
BEGIN
REPEAT
DO SLEEP(1);
DELETE lfp1 FROM matomo_log_form_page AS lfp1
JOIN
(SELECT
lfp.idlogformpage
FROM
matomo_log_form_page lfp
LEFT JOIN matomo_log_form lf ON lfp.idlogform = lf.idlogform
WHERE
lf.idlogform IS NULL
LIMIT 20000) AS lfp2 ON lfp1.idlogformpage = lfp2.idlogformpage;
UNTIL ROW_COUNT() = 0 END REPEAT;
END$$
DELIMITER ;
DROP EVENT IF EXISTS matomo_form_cleanup;
DELIMITER $$
CREATE EVENT matomo_form_cleanup
ON SCHEDULE EVERY 1 DAY STARTS CURDATE() + INTERVAL 2 HOUR
DO BEGIN
CALL matomo_cleanup_lff();
CALL matomo_cleanup_lfp();
END$$
DELIMITER ;